The concept of inserting a person or subject into a background scene to form a composite image has been practiced in the motion picture industry as far back as the 1930's. Over the years improvements have been made in image compositing using photo-chemical film mattes but this process is severely restricted in scope, subject matter, backing color purity, lighting tolerance and subject colors. It is a complex and difficult process.
The chroma-key method of compositing video images for television was developed by Kennedy and Gaskins of RCA and was published in the December 1959 issue of the Journal of the Society of Motion Picture and Television Engineers, pages 804 to 812. This simple method switched between a foreground subject placed before a blue backing, and a background scene based on the presence or absence of the hue of the blue backing. Today's chroma-key systems (even with a soft edge) still switch between the foreground and background scenes.
Chroma-key is reasonably satisfactory for television newscasts when the subject is opaque and has well-defined edges. However, it is less satisfactory for television and is totally unacceptable for motion pictures when the subject includes loose hair, fog, smoke, motion blur, glassware, window reflections, out-of-focus objects, and other semi-transparent subjects. In these subjects, both the foreground and background elements occupy the same area and therefore there is no appropriate place to switch.
The proper reproduction of these semi-transparent areas requires an "AND" concept as opposed to the "OR" concept of a switch. The simulation of reality demands that everything seen by the camera, with the exception of the blue backing itself, must be reproduced in the composite, at full level, without attenuation, and at the full resolution of the camera.
A compositing method that does all of the above is based on the "AND" concept. It does not switch between the two scenes but, rather, adds the background scene to the foreground scene as a linear function of the luminance and visibility of the colored backing. The colored backing is removed by subtraction.
This compositing method was introduced to the motion picture and video industry with Vlahos U.S. Pat. Nos. 3,595,987 and 4,007,487. Subsequent improvements were added as described in Vlahos U.S. Pat. Nos. 4,100,569, 4,344,085, 4,409,611, 4,589,013, 4,625,231 and 5,032,901.
The techniques covered by the above patents form the basis for several models of compositing equipment manufactured by the Ultimatte Corporation. The current models are the Ultimatte-300, Ultimatte-45 and Ultimatte System-6.
Most recently these compositing techniques have been incorporated into a computer workstation in the form of a compositing program.
The Ultimatte compositing techniques, whether in the form of hardware or software, have achieved a level of compositing perfection such that the composite image goes undetected as a composite even under careful scrutiny. While the Ultimatte compositing techniques solve the problem of obtaining flawless, undetectable composite images, their application has been limited because of their extreme complexity. Unfortunately, few operators have acquired sufficient knowledge of color science, of video and of the Ultimatte compositing techniques to achieve the image quality of which the method and equipment are capable.
The complexity of the Ultimatte compositing techniques is due to the nature of color reproduction. The color spectrum is divided into three wavelength regions, designated Blue, Green and Red. All three colors are required to reproduce a full range of subject colors. One of these three colors, such as blue, must also be used for the backing, but the backing must not be reproduced. This apparent contradiction requires a complex manipulation of the blue, green and red (B, G, R) color components, and that manipulation constitutes the Ultimatte compositing techniques.
Although the Ultimatte compositing techniques are fully described in the referenced patents, a brief review will provide a basis for the invention to be described herein.
The Ultimatte process begins with the generation of a control signal $E_c$ which is used to control a number of functions. Control signal $E_c$ is proportional to the luminance and visibility of the backing. Luminance is reduced by a subject's shadow, and visibility is reduced by semi-transparent subjects and becomes zero when the subject is opaque. $E_c$, therefore, is linearly proportional to the transparency of the subject through a range of 1.0 to 0.0. $E_c$ controls the level of the background scene video as a linear function of the luminance and visibility of the colored backing. The simplified $E_c$ equation is

$$E_c = \left[(B - K_1) - K_2 \max(G, R)\right] \qquad \text{(Eq. 1)}$$

where "max" designates "the larger of".
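As an illustration of Eq. 1, the following is a minimal per-pixel sketch, not the patented implementation. It assumes B, G, R values normalized to the range 0.0 to 1.0, and it floors the result at zero so that opaque subjects give exactly no background; that floor, the function name `control_signal` and the sample pixel values are assumptions made for this example.

```python
def control_signal(b, g, r, k1=0.0, k2=1.0):
    """Simplified Eq. 1: E_c = (B - K1) - K2 * max(G, R).

    The floor at zero is an assumption of this sketch; it keeps E_c at 0
    for opaque (non-backing) subjects instead of going negative.
    """
    return max(0.0, (b - k1) - k2 * max(g, r))


# Hypothetical pixels, ordered (B, G, R) and normalized to 0..1.
backing = (0.8, 0.2, 0.2)          # clear blue backing
opaque_gray = (0.5, 0.5, 0.5)      # opaque neutral subject
half_mix = (0.65, 0.35, 0.35)      # 50% gray subject + 50% backing

for name, pixel in [("backing", backing),
                    ("opaque gray", opaque_gray),
                    ("half transparent", half_mix)]:
    print(name, control_signal(*pixel))
```

With these sample values the half-transparent pixel yields about half the control signal of the clear backing, which is the linear relationship to transparency described above.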
The next step is to substitute green for blue in those areas where blue is greater than green. In such areas, the blue video channel will be carrying green video. Color substitution is achieved by causing the green video to act as a dynamic clamp so that it cannot be exceeded by blue video.
The blue backing (usually a blue paint) will have low-level green and red components. The blue clamp, keeping blue from exceeding green, causes the blue backing to become a dark gray backing, which is a necessary first step in achieving a black backing.
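The blue clamp described above can be sketched per pixel as follows. This is an illustrative sketch only; the function name `blue_clamp` and the backing values (80 B, 20 G, 20 R) are assumptions chosen to show a backing with low-level green and red components.

```python
def blue_clamp(b, g, r):
    """Green acts as a dynamic clamp: blue is never allowed to exceed green."""
    return min(b, g), g, r


# Hypothetical blue backing with low-level green and red, ordered (B, G, R).
print(blue_clamp(80, 20, 20))   # (20, 20, 20): the backing becomes dark gray

# A neutral subject is untouched because blue does not exceed green.
print(blue_clamp(50, 50, 50))   # (50, 50, 50)
```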
The color substitution of green for blue also causes color distortion. Blue eyes and green foliage become cyan colored; pink and magenta become red; and the edges of blond hair become white.
The next step is to reduce the gray backing to a black backing. If the gray backing is not reduced to black, this residual gray becomes a gray veil superimposed over the background scene.
The gray backing is reduced to black by subtracting a portion of $E_c$, equaling the residual B, G, R video in the backing area. The four controls for this purpose are Master Veil, Blue Veil, Green Veil and Red Veil.
The above procedure reduces the blue backing to a black backing without in any way affecting the foreground subject, even when that subject is a wisp of hair, a puff of steam or a reflection from a window. Since the backing is reproduced as black there is no need to switch it off. The background scene, under control of $E_c$, is then simply added to the foreground video to obtain the composite image. However, it is still necessary to deal with the various color distortions introduced by the blue clamp and by the blue backing. The blue clamp, while introducing certain color distortions, removes others.
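The sequence described so far (control signal, blue clamp, veil subtraction, then addition of the background) can be sketched for a single pixel as below. This is a simplified sketch under stated assumptions, not the Ultimatte implementation: values are normalized to 0..1, the control signal is normalized by its value at a reference clear-backing pixel so that it spans 1.0 to 0.0, and the per-channel veil amounts are taken directly from the clamped backing pixel rather than set by the Master, Blue, Green and Red Veil controls. The name `composite_pixel` and all sample values are hypothetical.

```python
def composite_pixel(fg, bg, backing, k1=0.0, k2=1.0):
    """Additive composite of one (B, G, R) pixel; no switching is involved."""

    def ec(pixel):
        b, g, r = pixel
        return max(0.0, (b - k1) - k2 * max(g, r))   # simplified Eq. 1

    def clamp(pixel):
        b, g, r = pixel
        return (min(b, g), g, r)                      # blue held to green

    ec_ref = ec(backing)
    if ec_ref <= 0.0:
        raise ValueError("reference backing pixel produces no control signal")

    e = ec(fg) / ec_ref          # assumed normalization: 1.0 on clear backing
    residual = clamp(backing)    # gray level left in the backing after the clamp
    fg_clamped = clamp(fg)

    # Subtract the veil (a portion of E_c per channel), then add background * E_c.
    return tuple(max(0.0, f - e * v) + e * s
                 for f, v, s in zip(fg_clamped, residual, bg))


backing = (0.8, 0.2, 0.2)
background = (0.3, 0.5, 0.7)

print(composite_pixel(backing, background, backing))             # the background alone
print(composite_pixel((0.5, 0.5, 0.5), background, backing))     # opaque subject unchanged
print(composite_pixel((0.65, 0.35, 0.35), background, backing))  # 50/50 blend of subject and background
```

In the third case the foreground pixel is a half-transparent gray subject over the backing, and the result is an even mix of the gray subject and the background scene, with no trace of the backing.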
A person wearing a white shirt standing in close proximity to a large expanse of blue backing receives secondary blue illumination from the backing which gives his white shirt a pronounced blue tint. Flesh tones take on a magenta tint. The field of the camera lens filled with blue light will cast a blue veil over the foreground subjects due to multiple internal reflections within the lens.
After the video is subjected to the blue clamp, all evidence that a blue backing was ever present is eliminated. White stays white, flesh colors stay flesh colored and any veiling effect of lens flare is eliminated.
The colors distorted by the blue clamp include blue eyes, pink, magenta, blond hair and green foliage. These are important colors and must be corrected. Such correction is accomplished by a series of color control gates. The color correction of these subjects is based on their unique spectral reflection. Blue eyes will have a spectral composition, for example, of 80 B, 70 G and 60 R. Note the linear progression of 80, 70, 60. Note that blue exceeds green by the amount that green exceeds red. This linear progression is characteristic of blue eyes and most pastel blue colors.
If the green-red difference is added to green, $B \le G + (G - R)^+$, then the clamp will read B = 70 + (70 - 60) = 80. The plus symbol $(\,)^+$ holds negative values to zero. With the clamp raised to 80 in this example, blue eyes will reproduce as blue, not cyan. The blue cast on a white shirt, however, is rejected because there is no green-red difference in white and the white shirt, therefore, returns to white.
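The blue-eyes arithmetic above can be checked with a short sketch. The blue-eye values come from the text; the blue-tinted white shirt values (90 B, 80 G, 80 R) and the function names are assumptions made for illustration.

```python
def pos(x):
    """The ( )+ operator: negative values are held to zero."""
    return max(0, x)


def clamp_with_green_red_gate(b, g, r):
    """Blue limited to G + (G - R)+ instead of to G alone."""
    limit = g + pos(g - r)
    return min(b, limit), g, r


# Blue eyes from the text: the limit rises to 70 + (70 - 60) = 80.
print(clamp_with_green_red_gate(80, 70, 60))   # (80, 70, 60): blue is preserved

# Hypothetical white shirt with a blue cast: no green-red difference,
# so blue is still pulled down to green and the shirt returns to white.
print(clamp_with_green_red_gate(90, 80, 80))   # (80, 80, 80)
```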
In the case of pink, its initial B, G, R values may have been 50 B, 40 G, 80 R, but the blue clamp holds blue to the level of green to become 40 B, 40 G, 80 R. The red-green difference is 40. If the blue clamp is increased by about 1/4 of the red-green difference, $B \le G + \tfrac{1}{4}(R - G)^+$, it allows blue to rise to 50, which is the blue component needed to reproduce pink. The same logic applies to magenta, except that 1/2 the red-green difference is added to green.
In the case of green foliage, a typical green will contain spectral components of 20 B, 80 G, 20 R. Blue light from the backing, passing through green translucent leaves, produces spectral components of 80 B, 80 G, 20 R, which is cyan, not green.
By limiting blue to the lower of green or red, the spectral components of foliage are returned to 20 B, 80 G, 20 R. The blue clamp with the color gates becomes:

$$B \le K\,G + K_1 (G - R)^+ + K_2 (R - G)^+ + K_3 \min(G, R) \qquad \text{(Eq. 2)}$$

where
$K_1$ = gate 1
$K_2$ = gate 2
$K_3$ = gate 3

"min" means "the lower of", $(\,)^+$ means positive values only, and the following procedures apply:

1. $K$ is nominally 1.0 for a white balance on white subjects.
2. $K_1$ (gate 1) is normally 1.0 for most scenes.
3. $K_2$ (gate 2) is normally zero which prevents flesh colors from taking on a magenta tint from blue spill light. If it is essential to reproduce pink or magenta, then $K_2$ must be opened part way, about 1/4 for pink and about 1/2 for magenta. Opening gate 2 also permits blue flare to give flesh tones a magenta cast. It is appropriate to open gate 2 just sufficient to produce a subjectively satisfying pink or magenta and thereby minimizing the degree to which flesh tones take on a magenta tint.
4. $K_3$ (gate 3) permits blue to be limited to the lower of green or red. Since green foliage will have a high blue content due to the blue back-light, it turns cyan. Gate 3, holding blue to red, restores foliage to its natural green color. However, the use of gate 3 requires that $K$ and $K_1$ first be reduced to zero since either will defeat gate 3.
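Eq. 2 and the gate settings listed above can be exercised with the pink and foliage values from the text. The following is a hedged sketch rather than the equipment's actual processing; the function names `blue_limit` and `gated_blue_clamp` are hypothetical, and the default gate values follow the nominal settings given in items 1 to 3.

```python
def pos(x):
    """The ( )+ operator: positive values only."""
    return max(0, x)


def blue_limit(g, r, k=1.0, k1=1.0, k2=0.0, k3=0.0):
    """Right-hand side of Eq. 2: K*G + K1*(G-R)+ + K2*(R-G)+ + K3*min(G, R)."""
    return k * g + k1 * pos(g - r) + k2 * pos(r - g) + k3 * min(g, r)


def gated_blue_clamp(b, g, r, **gates):
    """Hold blue to the Eq. 2 limit; green and red pass through unchanged."""
    return min(b, blue_limit(g, r, **gates)), g, r


# Pink (50 B, 40 G, 80 R): with gate 2 closed, blue is held to green.
print(gated_blue_clamp(50, 40, 80))             # blue drops to 40
# Opening gate 2 to about 1/4 lets blue rise back to 50.
print(gated_blue_clamp(50, 40, 80, k2=0.25))    # blue stays at 50

# Back-lit foliage (80 B, 80 G, 20 R): nominal gates leave it cyan.
print(gated_blue_clamp(80, 80, 20))             # blue stays at 80
# Gate 3 alone (K and K1 reduced to zero) holds blue to the lower of G or R.
print(gated_blue_clamp(80, 80, 20, k=0.0, k1=0.0, k3=1.0))   # blue drops to 20
```

The foliage case shows why item 4 requires $K$ and $K_1$ to be zero: with either term active, the limit stays at or above green and gate 3 has no effect.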
Other problems in compositing also arise and need correction. A composite, if it is not to reveal that it is a composite, must have similar elements in the two scenes match each other. For example, black areas in the foreground scene should match the color of black areas in the BG scene; white areas should match and flesh tones should match.
On the Ultimatte System-6, and its software equivalent, the foreground video and background video each have B, G, R offset controls, B, G, R gain controls and B, G, R gamma controls for a total of 18 controls. The offset, gain and gamma controls are interactive, and the adjustment of one may require an adjustment of the other two which, in turn, may require readjustment of the first control. Generally, an operator may go through several iterations before he is satisfied, or before he tires of trying.
A given element in two scenes may require balancing in hue-only, or in both luminance and hue. Black glossy objects reflect the blue backing and result in the background scene showing through the black object even though it is fully opaque. It is made non-transparent by means of a "Black Gloss Control".
Since $E_c$ is set to zero for gray scale subjects, light blue subjects, such as blue eyes, become transparent and must be corrected by increasing matte density. Black edging on warm colors (e.g., flesh tones) is corrected by a Matte Density Balance Control. Altogether the current Ultimatte System-6 has 98 controls consisting of 25 switches and 73 increment/decrement adjustment controls. When properly adjusted, the composite scene cannot be detected as a composite even by most video professionals.
The problem is that there are very few operators who have acquired sufficient knowledge of color science, of video, and of the Ultimatte compositing techniques, to be able to make the adjustments necessary to achieve flawless composites.
In an attempt to reduce the problem, later Ultimatte devices (U.S. Pat. No. 4,625,231, etc.) incorporate automatic circuits for setting background and foreground levels, for setting bias and for setting initial matte density. These adjustments are all subject independent. In addition, a "Reset" button returns all 98 adjustments to a default setting that will produce a good composite when the foreground and background cameras were properly balanced for white, black, and gamma and where lighting was ideal and there were no problems with the subject, wardrobe, floor glare, etc.
Such ideal conditions rarely exist. The default settings therefore provide a good starting point, but there is no way to fully automate those adjustments that are subject dependent. The automatic adjustments shorten the adjustment process but they do not substitute for a knowledgeable and skilled operator. Considering the sheer number of adjustments, and the fact that some are interactive, even a skilled operator can take up to one hour to make the necessary adjustments on a difficult scene. Operators with less expertise may never achieve a good quality composite image.
The high resolution required by the motion picture industry often calls for non-real-time compositing. A frame of film may be scanned at resolutions up to 4000 lines in a time ranging from one second to many seconds. This video signal is then input to a computer and associated monitor for various modifications, including color correction, and is subsequently printed back to film on a second scanner. Film compositing on a workstation, such as those made by Silicon Graphics, is made possible by expressing the equations utilized by the Ultimatte techniques in the computer language used by the workstation. Compositing is therefore one additional function of the workstation.
Substitution of the computer in place of the Ultimatte hardware increases compositing difficulties since each adjustment is delayed by the computer from about 1 second to as much as 1 minute before the operator can see the result of his adjustment. The amount of the delay depends upon the power of the computer and the pixel density (resolution) of the image.
The lack of real-time feedback makes the Ultimatte techniques almost unusable in such systems.
The Ultimatte compositing techniques utilize a series of equations using the independent variables B, G, R. These variables are scaled by constants $K$, $K_1$, $K_2$, $K_3$, $K_4$, $(1 - K)$, etc. Scaling is achieved by an operator turning a potentiometer in analog equipment, a shaft encoder in digital equipment, or by a cursor or keyboard on a digital computer. Adjustments are intuitive and are judged to be proper as one observes the image on a video display. This is the current state of the art in video compositing.