1. Field of the Invention
The present invention relates to digital imaging. More specifically, the present invention relates to synthesis of image textures.
2. Description of the Related Art
Texture is a fundamental phenomenon in natural scenes and exists in many areas of imaging technology. An image of grass, or a piece of tree bark, or even an array of letters can be considered as a texture. While textures often exist naturally in imaging technology, there are a host of applications where synthetically produced textures are useful. Examples include computer image generation and animation, special effects, image repair, image scaling, image compression, and a variety of other graphical and imaging applications.
Texture modeling and synthesis are complex problems. In fact, there is no single model that is capable of precisely describing all textures. Texture is a subtle concept with several defining characteristics, the three principal ones being randomness, periodicity, and directionality. Since texture is an important cue in human visual perception, texture processing has become an actively studied area in image processing, computer graphics, and computer vision.
A first feature, periodicity, is important for many textures. Textures with periodicity consist of many small elementary units, also referred to by those skilled in the art as “texels”. At some reasonable scale, most texels in a texture appear to be very similar to one another, having similar shape, size, and orientation. However, upon closer scrutiny, texels are not typically identical to one another, and therefore, most textures are not simply repetitions of texels. Periodicity of texture seems to suggest that similar textures should have the same texel placement rules in the reproduction of texture patterns using texels.
A second feature of texture is randomness. With respect to the generation of a texture, the differences between individual texels and their placement within the texture constitute a stochastic process. That is, the differences may be in size, orientation, or a combination of all parameters. Therefore, in terms of periodicity and randomness, a texture can be considered as a sampling of a stochastic process with the same period.
A third feature, directionality, is not present in all textures. However, when directionality is present, then there exist one or more directions with respect to which texels are aligned. As a practical matter, when the number of salient texture directions becomes larger than just a few, a texture can be treated as an inhomogeneous, non-directional texture for the purpose of texture synthesis.
Generally, the goal of texture synthesis is to generate a new synthetic texture according to a reference texture so that the new synthetic texture is both similar in appearance to the reference texture in terms of human perception and sufficiently different from the reference texture so as not to appear to be a mere copy thereof. Thus, it is preferable that the reference texture contain a relatively large number of texel elements so that all the necessary features of the reference texture can be preserved. Based on the foregoing, it is understood that there exist two basic principles in texture synthesis: first, a need to retain perceived textural characteristics from the reference texture within the synthetic texture; and second, a need for a sufficient perceived visual difference between the synthetic texture and the reference texture.
It is apparent that the synthesis of textures is a challenging problem in imaging science. Given the aforementioned stochastic nature of texture, a texture can be considered as a sample of a probability function. Thus, texture analysis/synthesis can be considered as a procedure that estimates and resamples this probability function. In the 1960s, Julesz (see B. Julesz, “Visual pattern discrimination”, IRE Trans. of Information Theory IT-8, System, Man, and Cybernetics, pp. 84–92, 1962) proposed a general texture model which states that kth-order statistical information can describe texture perception effectively. In fact, methods have been developed based on the second-order statistical information of textures, including correlation and SGLDM (see R. M. Haralick, K. Shanmugan, and I. Dinstein, “Textures features for image classification”, IEEE Trans. System, Man, and Cybernetics, vol. 8, pp. 610–621, November 1973). Because there exists strong correlation between neighboring pixels of a texture, such statistical models as the Gaussian Markov random field (GMRF) (see G. R. Cross and A. K. Jain, “Markov random field texture models”, IEEE Trans. On Pattern Analysis and Machine Intell., Vol. 5, No. 1, pp. 25–39, January 1983) and the Gibbs distribution (see H. Derin and H. Elliott, “Modeling and segmentation of noisy and textured images using Gibbs random fields”, IEEE Trans. Pattern Analysis and Machine Intell., Vol. 9, No. 1, pp. 39–55, January 1987) have also been adopted to characterize textures. More recently, multi-resolution time-frequency analysis tools such as the Gabor transform (see M. Bastiaans, “Gabor's expansion of a signal into Gaussian elementary signal”, IEEE, vol. 68, pp. 538–539, 1980), the wavelet transform (see S. G. Mallat, “A theory for multi-resolution signal decomposition: the wavelet representation”, IEEE Trans. Pattern Anal. and Machine Intell., vol. 11, pp. 674–693, July 1989, and I. Daubechies, “Orthonormal bases of compactly supported wavelets”, Communications on Pure and Applied Mathematics, vol. 41, pp. 909–996, November 1988), the Wigner distribution, and the Laplacian pyramid transform have been developed, since neurophysiology research suggests that the human visual system decomposes retinal images into different frequency bands (see D. J. Heeger, J. R. Bergen, “Pyramid-based texture analysis/synthesis”, ACM Proceedings of SIGGRAPH, pp. 229–238, August 1995).
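By way of illustration only, the second-order statistics referred to above can be understood as the joint frequency with which two gray levels occur at a fixed spatial offset. The following is a minimal sketch of such a co-occurrence estimate, offered under stated assumptions rather than as the procedure of any cited reference: the function name `cooccurrence`, the list-of-rows image layout, and the offset parameters `dx` and `dy` are hypothetical.

```python
from collections import Counter


def cooccurrence(image, dx=1, dy=0):
    """Estimate a second-order statistic of a gray-level image.

    `image` is a list of equal-length rows of gray levels (an assumed layout).
    The result maps each pair (a, b) to the relative frequency with which
    level a is found at (x, y) while level b is found at (x + dx, y + dy).
    """
    height, width = len(image), len(image[0])
    counts = Counter()
    for y in range(height):
        for x in range(width):
            nx, ny = x + dx, y + dy
            # Only count pairs whose second pixel lies inside the image.
            if 0 <= nx < width and 0 <= ny < height:
                counts[(image[y][x], image[ny][nx])] += 1
    total = sum(counts.values())
    return {pair: n / total for pair, n in counts.items()}
```

Unlike a plain histogram, this estimate depends on where gray levels sit relative to one another, which is why such statistics can capture some of the structure of a texture.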
Through the combination of statistical modeling and multi-resolution decomposition, texture analysis/synthesis models with multi-resolution statistical structures have been contemplated by Heeger. A histogram is an example of first-order statistical information, which can be applied in performing statistical matching in a multi-resolution fashion. However, because histograms do not account for positional or structural information of the texture, they only work well for highly random textures. They are even more limited when used for color textures. This is because a pixel in a color texture is determined by three components, so three individual red-green-blue (“RGB”) textures must be synthesized. Because position, or spatial, information is lost, the final combined color textures are typically poor, even if the three individual synthetic RGB images are satisfactory in and of themselves.
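The limitation of first-order matching noted above can be illustrated with a simplified sketch. It assumes a single band, flat lists of equal length, and rank-based matching; the name `match_histogram` is hypothetical, and the sketch is not presented as Heeger's implementation.

```python
def match_histogram(source, reference):
    """Give `source` the exact histogram of `reference`.

    Both arguments are flat lists of pixel values with the same length
    (an assumption of this sketch). The rank order of `source` is kept,
    so the spatial arrangement of the result comes entirely from `source`.
    """
    if len(source) != len(reference):
        raise ValueError("source and reference must have the same length")
    # Indices of source pixels, from darkest to brightest.
    order = sorted(range(len(source)), key=lambda i: source[i])
    ref_sorted = sorted(reference)
    out = [None] * len(source)
    for rank, index in enumerate(order):
        # The pixel of a given rank receives the reference value of that rank.
        out[index] = ref_sorted[rank]
    return out
```

For example, matching the source `[0.9, 0.1, 0.5, 0.3]` to the reference `[10, 20, 20, 40]` yields `[40, 10, 20, 20]`. The output has the same histogram as the reference, but where each value lands is dictated solely by the source, which is consistent with the observation that positional and structural information is not carried by a histogram.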
Zhu et al. (see S. C. Zhu, Y. N. Wu and D. Mumford, “FRAME: Filters, random fields and maximum entropy towards a unified theory for texture modeling”, International Journal of Computer Vision, vol. 27, No. 2, pp. 107–126, June 1996) introduced a FRAME model for textures. This model consists of two major steps. The first step involves choosing appropriate filters to capture the characteristics of the texture and extracting the histograms of filtered images. This step is very similar to Heeger's method, although Zhu et al. use different filter banks for different textures instead of a fixed steerable wavelet transform, as suggested by Heeger. The second step in the Zhu et al. method is to derive an estimate of the probability function of a reference texture using the maximum entropy principle and then use the Gibbs sampler to synthesize textures by drawing typical samples from the estimated probability function. The essence of the Zhu et al. method is that it is constrained such that a good synthetic texture corresponds to the maximum entropy of a reference probability function. However, although the maximum entropy principle has been widely used in statistics, no evidence has demonstrated that this principle is appropriate for human perception of textures.
De Bonet (see J. S. De Bonet, “Multi-resolution sampling procedure for analysis and synthesis of texture image”, ACM Proceedings of SIGGRAPH, pp. 361–368, August 1997) proposed a statistical multi-resolution technique. In his method, a fundamental hypothesis is that images perceived as textures contain regions that differ by less than a certain discrimination threshold, and that random displacement of these regions therefore does not change the perceived characteristics of the texture. Based on this hypothesis, a set of sampling constraints is imposed in sampling the coefficients of all bands in a multi-resolution technique. While De Bonet's hypothesis has utility, it is generally not adequate to achieve a good synthetic texture because random placement within homogeneous regions changes only certain micro-structures of texels and leaves the macro-structure of texels intact. De Bonet's algorithm usually produces a synthetic texture that appears as a copy of the reference texture at a low randomness threshold, and a scrambled or fuzzy version of a tessellation of the reference texture image at a high randomness threshold.
Portilla and Simoncelli (see E. P. Simoncelli, J. Portilla, “Texture characterization via joint statistics of wavelet coefficient magnitudes”, IEEE Inter. Conf. On Image Processing, vol. 1, pp. 62–66, October 1998) proposed a scheme for texture representation and synthesis using correlation of complex wavelet coefficient magnitudes. This scheme falls in the same category as those of Heeger and De Bonet. Unlike previous techniques, a performance improvement resulted from the inclusion of a cross-correlation between the coefficient magnitudes, which was shown to improve texture synthesis in a number of aspects including directionality and periodicity of textures. While this method can capture both stochastic and repeated textures well, it fails when generating highly structural textures.
A non-parametric method based on the Markov random field (“MRF”) model was developed by Efros (see A. Efros and T. Leung, “Texture synthesis by non-parametric sampling”, IEEE Int. Conf. on Computer Vision, vol. 2, pp. 1033–1038, September 1999). The advancement taught by the Efros method is that it employs a deterministic search in the reference texture to estimate the conditional probability distribution function through histograms, and then synthesizes each pixel by sampling the histograms. This method can preserve the local structures to a significant degree. However, a review of the images produced by this approach makes apparent that the algorithm actually copies the pixels of the reference texture image into the synthetic image using probability sampling. Therefore, if it is desired to synthesize a new texture of the same size as the reference image, the Efros method may actually produce, more or less, a copy of the reference texture. Furthermore, the Efros algorithm is very processor-intensive and time-consuming because it synthesizes a texture, pixel by pixel, through an exhaustive search.
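The per-pixel step described above can be sketched as follows. This is a simplified illustration under stated assumptions and is not the published Efros implementation: the name `sample_pixel`, the square window, the tolerance `eps`, and the unweighted squared-difference distance are all hypothetical choices, and the Gaussian weighting of the neighborhood used in the cited work is omitted.

```python
import random


def sample_pixel(reference, window, known, eps=0.1, rng=random):
    """Choose a value for the centre of `window` by searching `reference`.

    `reference` is a list of rows of gray levels. `window` is a k x k list
    (k odd) holding the neighbourhood of the pixel being synthesized, and
    `known` is a k x k list of booleans marking which window entries have
    already been synthesized. Entries where `known` is False are ignored.
    """
    k = len(window)
    half = k // 2
    height, width = len(reference), len(reference[0])
    scored = []
    # Exhaustive search over every reference position with a full window.
    for cy in range(half, height - half):
        for cx in range(half, width - half):
            dist = 0.0
            for j in range(k):
                for i in range(k):
                    if known[j][i]:
                        diff = reference[cy - half + j][cx - half + i] - window[j][i]
                        dist += diff * diff
            scored.append((dist, reference[cy][cx]))
    best = min(d for d, _ in scored)
    # Centre values of all near-best matches form the empirical histogram.
    candidates = [value for d, value in scored if d <= best * (1 + eps)]
    return rng.choice(candidates)
```

Two properties discussed above are visible in the sketch: every returned value is copied from the reference texture, and each synthesized pixel requires a scan of the entire reference image.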
In a work related to Efros, Wei (see L. Y. Wei and M. Levoy, “Fast texture synthesis using tree-structured vector quantization”, ACM Proceedings of SIGGRAPH, 2000) also developed a texture synthesis method based on MRF. The difference between Wei's method and Efros's method is that Wei employs deterministic search matching to directly synthesize each pixel instead of estimating a histogram, as in Efros's algorithm. Thus, Wei's algorithm should achieve faster synthesis. Moreover, Wei proposed to use tree-structured vector quantization to further speed up the synthesis. However, a notable shortcoming of such MRF-based methods, including both Wei's and Efros's methods, is that they typically do not reproduce clear structures in the synthetic image.
Xu et al. (see Y. Q. Xu, B. N. Guo and H. Shum, “Chaos mosaic: fast texture synthesis”, Microsoft Research Report, April 2000) proposed a method of texture synthesis that employs a random block sampling approach. Xu teaches the use of a cat map (see V. I. Arnold and A. Avez, Ergodic problems of Classical Mechanics, Benjamin, 1968) iteration as a chaos transformation to produce a texture. However, the iterative block moving in Xu's method breaks the local features. Also, since Xu et al. do not employ a multi-resolution statistical process, the corresponding benefits are not realized. It is also apparent that the cat map approach fails particularly with respect to directional textures.
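For reference, one common discrete form of the cat map moves the element at (x, y) of an n x n grid to ((2x + y) mod n, (x + y) mod n). The sketch below applies that form to individual pixel coordinates purely as an illustration; the function name `cat_map` is hypothetical, and the sketch does not reproduce the block-based procedure of Xu et al.

```python
def cat_map(image):
    """Apply one iteration of a discrete cat map to a square image.

    `image` is an n x n list of rows. The element at (x, y) is moved to
    ((2x + y) mod n, (x + y) mod n). The mapping is a permutation of the
    grid, so no value is lost or duplicated.
    """
    n = len(image)
    if any(len(row) != n for row in image):
        raise ValueError("cat_map expects a square image")
    out = [[None] * n for _ in range(n)]
    for y in range(n):
        for x in range(n):
            nx, ny = (2 * x + y) % n, (x + y) % n
            out[ny][nx] = image[y][x]
    return out
```

Because neighboring positions are sent to distant positions, repeated iteration scatters content that was originally adjacent, which is consistent with the observation above that iterative moving breaks local features.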
Thus there is a need in the art for a method and apparatus to synthetically produce textures from reference textures that meet the subtle requirements of similarity and distinctiveness, while also preserving directionality, repetitiveness, and randomness perceived in the reference texture.