This invention relates to a video effects system, and more particularly to a system that is able to intersect a 3-D image with a 2-D image.
Digital video effects (DVE) devices, such as the ADO manufactured by Ampex Corporation of Palo Alto, California and the Kaleidoscope manufactured by The Grass Valley Group, Inc. of Grass Valley, California, have the ability to transform a television image. A basic technique for doing this, as disclosed in U.S. Pat. No. 4,875,097 issued Oct. 17, 1989 to Richard A. Jackson entitled "Perspective Processing of a Video Signal", is to take each two-dimensional image and assign to each pixel three coordinates (X,Y,Z). The X and Y coordinates refer to the pixel's position on a flat surface, and the Z coordinate refers to the distance from a plane that contains the viewer's eye. The Z coordinate is initially set to some constant since the viewer is looking straight at a flat image.
The DVE device then does a three-dimensional transformation on the coordinates of each pixel, such as disclosed in U.S. Pat. No. 4,797,836 issued Jan. 10, 1989 to Francis A. Witek and David E. Lake, Jr. entitled "Image Orientation and Animation Using Quaternions". Such a transformation is the mathematical equivalent of taking the flat two-dimensional image, rotating it about one or more of the three coordinate axes, and then moving it in space. This transformation process takes the original X, Y and Z coordinates and maps them into a new set of coordinates X', Y' and Z'. The transformed image is then projected into the X-Y plane. Current DVE devices implement transformation algorithms using hardware and are able to work in real time, and the particular transform that is effected depends on an operator control, typically a joystick. Therefore, as the joystick is moved, the transform changes and the image can be rotated and translated.
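By way of illustration only, the rotate, move and project sequence described above may be sketched in software as follows. The function names, the choice of rotation about the vertical axis, the value of VIEW_DISTANCE and the use of a perspective divide for the projection are assumptions made for this sketch; they are not taken from the referenced patents, and a DVE device performs the equivalent operations in hardware.

```python
import math

# Assumed constant initial Z: distance of the flat image from the plane
# that contains the viewer's eye (hypothetical value).
VIEW_DISTANCE = 1000.0


def transform_pixel(x, y, angle, offset=(0.0, 0.0, 0.0), d=VIEW_DISTANCE):
    """Map a pixel (X, Y) of the flat image to (X', Y', Z').

    The image is rotated about its own vertical axis by `angle` radians
    and then moved by `offset`. Z' is measured from the eye plane.
    """
    c, s = math.cos(angle), math.sin(angle)
    # Rotation of the flat image (local depth 0) about the vertical axis.
    xr = c * x
    yr = y
    zr = -s * x
    # Movement in space; the initial constant depth d is added here.
    return xr + offset[0], yr + offset[1], zr + d + offset[2]


def project(xt, yt, zt, d=VIEW_DISTANCE):
    """Project a transformed point into the X-Y plane (assumed perspective divide)."""
    scale = d / zt
    return xt * scale, yt * scale
```

With no rotation and no offset, a pixel projects back to its original position; moving the image to twice the initial depth halves its projected coordinates.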
FIG. 1 illustrates in simplified block form a digital video effects system by which two-dimensional images can be transformed and combined. The FIG. 1 device comprises sources 2A, 2B of first and second analog video signals, VideoA and VideoB respectively. Each video signal represents a two-dimensional image that is characterized by a distribution of color information over a two-dimensional display plane. The video signals are accompanied by respective key signals KeyA and KeyB, which are derived from key sources 8A, 8B and represent opacity of the respective images as a function of location over the display plane. The video signals and the accompanying key signals are applied to converters 12A, 12B, which convert the input video signals and key signals into digital form. Each converter 12 may convert its input video signal into a digital signal that is compatible with a standard digital television format, such as the CCIR.601 format.
A digital video signal in the CCIR.601 standard is composed of four-word packets. Each packet contains two luminance slots, Y, and two chrominance slots, C1, C2, multiplexed in the sequence C1 Y C2 Y. The words are up to ten bits each and the data rate is 27 million words per second. Thus, the luminance component of the video signal is digitized at 13.5 MHz and each chrominance component is digitized at half that rate. A signal in accordance with the CCIR.601 standard is informally referred to as a 4:2:2 signal.
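The packet structure and the rate arithmetic described above may be illustrated by the following minimal sketch. The function name and the list-based representation of samples are assumptions made for illustration; the sketch is not an implementation of the CCIR.601 interface.

```python
LUMA_RATE_HZ = 13_500_000            # luminance sampling rate
CHROMA_RATE_HZ = LUMA_RATE_HZ // 2   # each chrominance component: 6.75 MHz
WORD_RATE = LUMA_RATE_HZ + 2 * CHROMA_RATE_HZ  # 27 million words per second


def multiplex_422(luma, c1, c2):
    """Interleave samples into four-word packets in the order C1 Y C2 Y.

    `luma` must hold twice as many samples as `c1` and `c2`, because each
    chrominance component is sampled at half the luminance rate.
    """
    if len(c1) != len(c2) or len(luma) != 2 * len(c1):
        raise ValueError("expected two luminance samples per chrominance pair")
    words = []
    for i in range(len(c1)):
        words.extend([c1[i], luma[2 * i], c2[i], luma[2 * i + 1]])
    return words
```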
Each converter 12 has four output channels, namely a luminance channel, two chroma channels and a key channel. The luminance channel carries luminance information in ten bit words at 13.5 MHz, the first chroma channel carries information pertaining to one of the chroma components in ten bit words at 6.75 MHz, the second chroma channel carries information pertaining to the other chroma component in ten bit words at 6.75 MHz, and the key channel carries key information in ten bit words at 13.5 MHz. The three video channels may, if necessary, be multiplexed to provide a 4:2:2 signal.
The four output channels of converter 12 are applied to a transform section 14 where the input video signal, representing color as a function of position in the display plane, is manipulated in three dimensions to simulate transformation (translation and/or rotation) of the two-dimensional image in a three-dimensional space and projection of the transformed 2-D image back into the display plane. The transformation may be effected by loading the values of Y, C1 and C2 into a frame buffer at addresses that depend on time relative to the sync signals of the input video signal and reading the values out at different times relative to the sync signals of the output video signal, whereby a quartet of values C1,Y,C2,Y is shifted to a different location in the raster. The key signal is transformed in similar manner. The nature of the transform can be selectively altered in real time by use of a joystick 15.
In the transformation operation, values of depth (Z) relative to the display plane are calculated to twenty bits.
Each transform section has six digital output channels. Three output channels carry a digital video signal VideoA' or VideoB', representing the projection of the transformed two-dimensional image into the display plane, in the same form as the digital input video signal. The fourth channel carries the transformed key signal KeyA' or KeyB' in the same form as the input key signal. The twenty-bit depth words are each divided into two ten-bit components Z1 and Z2, which are carried by the fifth and sixth channels respectively at 6.75 MHz. The key and depth channels may, if necessary, be multiplexed to provide a signal similar to a 4:2:2 video signal.
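The division of a twenty-bit depth word into two ten-bit components may be sketched as follows. The description does not state which component carries the more significant bits; the assignment of the upper ten bits to Z1 is an assumption made for this sketch, as are the function names.

```python
def split_depth(z):
    """Split a twenty-bit depth word into two ten-bit components (Z1, Z2).

    Assumption: Z1 carries the upper ten bits and Z2 the lower ten bits.
    """
    if not 0 <= z < (1 << 20):
        raise ValueError("depth must fit in twenty bits")
    z1 = (z >> 10) & 0x3FF
    z2 = z & 0x3FF
    return z1, z2


def join_depth(z1, z2):
    """Reassemble the twenty-bit depth word from its two components."""
    return (z1 << 10) | z2
```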
The output signals of the two transform sections 14A, 14B are input to a depth combiner 16. Combiner 16 combines the transformed video signals VideoA' and VideoB' on the basis of the respective key and depth signals and generates a combined output video signal VideoC. Combiner 16 also combines the transformed key signals KeyA' and KeyB' using well understood rules and generates a key signal KeyC, and generates a depth signal Z_C whose value is equal to the smaller of Z_A' and Z_B'. Combiner 16 includes a multiplexer that multiplexes the luminance and chroma information of signal VideoC such that the output video signal is in accordance with CCIR.601. Combiner 16 also includes a multiplexer that multiplexes the key and depth signals and provides a signal in a form similar to the CCIR.601 format, each packet containing two ten-bit words of key and one twenty-bit word of depth in two parts of ten bits each.
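A per-pixel depth combine of the kind described above may be sketched as follows. The description does not specify the combining rules; the sketch assumes key values normalized to the range 0 to 1, treats the image with the smaller depth as the front image, and uses a conventional front-over-back key mix that yields key-weighted output video. These choices, and the function name, are assumptions for illustration only.

```python
def combine_pixel(video_a, key_a, z_a, video_b, key_b, z_b):
    """Combine one pixel of two transformed images by key and depth.

    Video values are single components (for example luminance); keys are
    opacities in [0, 1]; a smaller depth value is nearer the viewer.
    Returns (video_c, key_c, z_c).
    """
    if z_a <= z_b:
        (vf, kf), (vb, kb) = (video_a, key_a), (video_b, key_b)
    else:
        (vf, kf), (vb, kb) = (video_b, key_b), (video_a, key_a)
    # The back image shows through only where the front image is not opaque.
    back_weight = (1.0 - kf) * kb
    video_c = kf * vf + back_weight * vb
    key_c = kf + back_weight
    # The combined depth is the smaller of the two input depths.
    z_c = min(z_a, z_b)
    return video_c, key_c, z_c
```

Where two transformed images intersect, the depth comparison changes sign from pixel to pixel, so the front image switches along the line of intersection.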
By selection of the transformations that are performed on the two video images, each transformed image may have pixels that map to the same X' and Y' coordinates, so that the two images intersect.
FIG. 2 illustrates in simplified form the operations performed by a graphics generator, such as the Graphics Factory manufactured by Dubner Computer Systems, Inc. of Paramus, New Jersey. The graphics generator first generates three-dimensional image data, for example by specifying a set of points in space that are to be connected. The points are connected by lines, and polygons are thereby defined. The polygons are broken down into smaller and smaller polygons, until a set of polygons is obtained such that each polygon defines a surface patch that is planar within a predetermined tolerance. In this fashion, a data base is created containing locations in a three-dimensional space (X,Y,Z) of the vertices of polygonal surface patches.
It is then necessary for the operator to define the direction from which the image is to be viewed. If the image is, for example, an ellipsoid, the viewing direction might be specified as along one of the axes of symmetry of the ellipsoid. This viewing direction is fixed relative to the image. A three-dimensional transformation is then carried out on the image data so as to displace and orient the image so that the viewing direction lies on the Z axis. This is accomplished by use of a software implementation of transformation algorithms that are similar to those used by a DVE device for rotating and moving an object in three-dimensional space. A lighting model is applied to the transformed image so as to generate, for each surface patch, a perceived color that takes account of light source, viewing direction, surface color and other factors. The image data is projected into the X-Y plane by loading the color values, which may be defined by one luminance value and two chroma values, into a frame buffer using only X and Y values to address the frame buffer. By repeatedly reading the contents of the frame buffer, a digital video signal is generated representing the image when viewed along the selected viewing direction. Also, by creating successive frames in which the image is at different locations and/or orientations relative to the Z axis, movement of the image can be simulated. However, current graphics generators are not able to operate in real time.
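The projection step described above, in which color values are loaded into a frame buffer using only X and Y values as the address, may be sketched as follows. The sample format, the function name and the rule that a later write overwrites an earlier one are assumptions made for this sketch; the ordering of samples and any hidden-surface processing are left to the caller.

```python
def render_to_frame_buffer(samples, width, height, background=(0, 0, 0)):
    """Load color values into a frame buffer addressed by X and Y only.

    `samples` is an iterable of (x, y, z, color) tuples, where `color`
    is a (luminance, chroma1, chroma2) triple. The Z value is not stored,
    so the resulting frame carries no depth information.
    """
    frame = [[background] * width for _ in range(height)]
    for x, y, _z, color in samples:
        col, row = int(round(x)), int(round(y))
        if 0 <= col < width and 0 <= row < height:
            frame[row][col] = color
    return frame
```

Because only X and Y address the buffer, the depth of each surface patch is discarded at this stage and cannot be recovered from the video signal read out of the buffer.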
The Graphics Factory has two digital output channels. One channel carries the digital video signal in CCIR.601 form. The other channel carries a key signal in ten bit words at 13.5 MHz.
It will be appreciated that the foregoing description is very much simplified, but since digital video effects systems and graphics generators are known in the art, additional description of their operation is believed to be unnecessary.