It is generally known that compositing computer graphics and film imagery requires that the computer graphics elements be matched in terms of color and texture to the underlying background elements. Typically these background elements have been captured on film. It is also known that intercutting footage from different sources, such as two kinds of film or film and digital video, requires that the different elements match to create a seamless look for the entire image sequence. As more footage combines mixed sources and computer graphics, there is a clear need for a robust grain simulation model. Moreover, there is a need for a fully automatic method of estimating grain parameters in image sequences to reduce the need for user interaction, which can be quite time-consuming.
Grain in digitized film images is well characterized by zero-mean Gaussian distributed noise for the three color channels. Zero-mean Gaussian models are fully characterized by their second-order statistics (the auto-covariance function). Modelling the interactions between color channels requires analysis of the spectral correlations. These spectral and spatial correlations can then be used to shape a white noise signal to create synthetic grain. In this way, synthesis of a simulated grain pattern is essentially a texture synthesis problem. But correct shaping of the white noise is required to generate a visually indistinguishable “match” for a given film grain.
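As a minimal sketch of the shaping idea described above, the following Python/NumPy example filters white Gaussian noise with a spatial kernel and then mixes the channels with a Cholesky factor of a 3x3 channel covariance matrix. The function name `synthesize_grain`, the Gaussian kernel, and the covariance values are illustrative assumptions, not measurements of any film stock. Because the same kernel is applied to every channel in this sketch, the channel mixing leaves the spatial correlation unchanged; that simplification would not hold if each channel required its own kernel.

```python
import numpy as np

def synthesize_grain(shape, kernel, channel_cov, seed=0):
    """Shape white Gaussian noise into correlated 3-channel noise (illustrative only).

    shape       -- (height, width) of the output field
    kernel      -- 2-D spatial filter that sets the spatial correlation (assumed)
    channel_cov -- 3x3 positive-definite covariance between color channels (assumed)
    """
    rng = np.random.default_rng(seed)
    h, w = shape
    white = rng.standard_normal((3, h, w))          # zero-mean, unit-variance white noise

    # Scale the kernel to unit energy so that filtering preserves the variance.
    k = kernel / np.sqrt(np.sum(kernel ** 2))

    # Circular convolution via the FFT; the kernel is zero-padded to the image size.
    K = np.fft.rfft2(k, s=(h, w))
    shaped = np.fft.irfft2(np.fft.rfft2(white) * K, s=(h, w))

    # Mix channels so that their covariance equals channel_cov (L @ L.T == channel_cov).
    L = np.linalg.cholesky(channel_cov)
    return np.tensordot(L, shaped, axes=(1, 0))

# Hypothetical inputs: a 5x5 Gaussian kernel and a made-up channel covariance.
ax = np.arange(-2, 3)
kernel = np.exp(-(ax[:, None] ** 2 + ax[None, :] ** 2) / 2.0)
cov = np.array([[1.0, 0.6, 0.4],
                [0.6, 1.0, 0.6],
                [0.4, 0.6, 1.0]])
grain = synthesize_grain((256, 256), kernel, cov)
```

The result `grain` has shape `(3, 256, 256)`; its sample channel covariance should be close to `cov`, and neighboring pixels within a channel are positively correlated because of the kernel.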
Texture analysis/synthesis methods are well known in the literature. Bergen and Heeger (“Pyramid-Based Texture Analysis/Synthesis,” SIGGRAPH, 1995, pp. 229-238) describe a method for analyzing a texture pattern and then replicating it. This method suffers from improper color correlation techniques that generate significant bias in the different color channels. The problem is that once the white noise fields have been convolved, each color channel has an autocorrelation function (ACF) that is non-zero at non-zero lags. Adding these channels together using a color transformation generates the correct spectral correlations but biases the spatial correlations. Biased spatial correlations result in synthesized textures that look different from the desired texture.
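The bias can be illustrated numerically with a deliberately simplified one-dimensional sketch. This is not the pyramid method of the cited paper; the moving-average filters, the mixing weights `a` and `b`, and the helper names are all assumptions chosen for illustration. Two independent noise signals are given different spatial correlations and are then combined linearly, as a color transformation would do. The combined signal acquires a weighted average of the two autocorrelations rather than keeping the one intended for its channel.

```python
import numpy as np

def lag1_corr(x):
    """Sample autocorrelation at lag 1, normalized by the lag-0 value."""
    x = x - x.mean()
    return float(np.dot(x[:-1], x[1:]) / np.dot(x, x))

def shaped_noise(n, taps, rng):
    """White Gaussian noise filtered by a unit-energy moving average of `taps` samples.

    For this filter the theoretical lag-1 autocorrelation is (taps - 1) / taps.
    """
    k = np.ones(taps) / np.sqrt(taps)
    return np.convolve(rng.standard_normal(n + taps - 1), k, mode="valid")

rng = np.random.default_rng(0)
n = 200_000
x1 = shaped_noise(n, 2, rng)   # intended spatial correlation: 1/2 at lag 1
x2 = shaped_noise(n, 8, rng)   # a second channel with correlation 7/8 at lag 1

# Hypothetical mixing weights (a**2 + b**2 == 1 keeps unit variance).
a, b = 0.6, 0.8
y = a * x1 + b * x2            # y is now correlated with x2, as desired spectrally

# Because x1 and x2 are independent, the ACF of y is a**2 * ACF(x1) + b**2 * ACF(x2).
predicted = a ** 2 * (1 / 2) + b ** 2 * (7 / 8)
measured_x1 = lag1_corr(x1)
measured_y = lag1_corr(y)
```

Here `measured_y` tracks `predicted` rather than the 1/2 intended for that channel, which is the kind of spatial bias described above.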
Chellappa and Kashyap considered autoregressive models for texture synthesis (“Texture synthesis using 2-d noncausal autoregressive models,” IEEE Trans. on Acoustics, Speech, and Signal Processing, vol. 33, pp. 194-203, February 1985) but the optimization methods required in that work are inefficient (i.e., long processing times for both analysis and synthesis).
The newer area of non-parametric synthesis requires templates of grain to perform the synthesis and is also known to be more computationally intensive than other methods. For example, Efros and Leung (“Texture synthesis by non-parametric sampling,” in ICCV99, pp. 1033-1038, 1999) considered how to sample pattern templates to generate new versions of the pattern. This approach is time-consuming and cannot produce a wide enough variety of patterns when considering signal-dependent noise.
With regard to noise estimation, most methods use single frames. Consider, for example, Olsen (“Estimation of Noise in Images: An Evaluation,” GMIP(55), No. 4, Jul. 1993, pp. 319-323) and Immerkaer (“Fast Noise Variance-Estimation,” Computer Vision Image Understanding(64), No. 2, September 1996, pp. 300-302). These two methods estimate noise in generally flat regions of a single image. Motion-compensated noise reduction methods are also known in the literature (see, for example, Dubois, E., Sabri, S., “Noise Reduction in Image Sequences Using Motion-Compensated Temporal Filtering,” IEEE Trans. Communications (32), 1984, pp. 826-831). Although these methods operate on image sequences, they do not iteratively estimate both the motion and the noise parameters.
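For orientation, a single-frame estimator of this kind can be sketched as follows. The sketch follows the fast variance estimator as it is commonly stated for Immerkaer's method: a 3x3 Laplacian-difference mask suppresses smooth image content, and the mean absolute response is scaled to a Gaussian standard deviation. The mask, the normalization, and the function name `estimate_noise_sigma` are not given in the text above and should be read as assumptions.

```python
import numpy as np

def estimate_noise_sigma(img):
    """Estimate the noise standard deviation of a single grayscale frame.

    Applies the mask [[1, -2, 1], [-2, 4, -2], [1, -2, 1]] to interior pixels.
    The mask coefficients have a sum of squares of 36, so for white Gaussian
    noise the response has standard deviation 6 * sigma, and
    E|response| = 6 * sigma * sqrt(2 / pi).
    """
    img = np.asarray(img, dtype=float)
    resp = (
        img[:-2, :-2] - 2 * img[:-2, 1:-1] + img[:-2, 2:]
        - 2 * img[1:-1, :-2] + 4 * img[1:-1, 1:-1] - 2 * img[1:-1, 2:]
        + img[2:, :-2] - 2 * img[2:, 1:-1] + img[2:, 2:]
    )
    return float(np.sqrt(np.pi / 2.0) * np.mean(np.abs(resp)) / 6.0)

# Synthetic check: a smooth horizontal ramp plus Gaussian noise of known sigma.
rng = np.random.default_rng(0)
true_sigma = 5.0
ramp = np.tile(np.linspace(50.0, 200.0, 256), (256, 1))
noisy = ramp + rng.normal(0.0, true_sigma, size=ramp.shape)
sigma_hat = estimate_noise_sigma(noisy)
```

The mask cancels constant and linearly varying content, so the ramp contributes essentially nothing and `sigma_hat` should land near `true_sigma`. On real footage, edges and texture inflate the estimate, which is why such methods are restricted to generally flat regions.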
There are also relevant patents in the prior art. Published U.S. patent application Ser. No. 2002/0034337 (entitled “System for Manipulating Noise in Digital images”) discloses a method for shaping white noise to create synthetic noise with specific characteristics. The most obvious problem with this approach is the assumption that the color channels can simply be decorrelated and the noise synthesized independently. The subband decomposition procedure disclosed therein follows closely the treatment by Bergen and Heeger mentioned above. Indeed, it suffers from exactly the same improper color correlation techniques, thereby generating significant bias in different color channels.
In commonly-assigned U.S. Pat. No. 5,641,596 (which issued Jun. 24, 1997 and is entitled “Adjusting Film grain properties in digital images”), D. Cok and R. Gray disclose a method for synthesizing film grain using an autoregressive model that yields very good results. However, to achieve a better match to older or unknown film stocks, a larger spatial support is necessary. The model in that work had a limited spatial support, and as scanning resolutions increase, the spatial support needs to grow accordingly.
The prior art described above is deficient in its ability to efficiently generate a visually indistinguishable “match” for a given film grain. What is needed is a method of synthesizing noise that visually approximates a given film grain and is also numerically close to the statistics of a given random field. In particular, it would be desirable to perform grain matching when given the statistics for two different images. Finally, it would be desirable to have a robust method for automatically determining those statistics for image sequences.