In the past decade, computer graphics has seen numerous advances in image processing. Researchers and developers have described how to sample real-world images and use the samples to synthesize novel views of the real world in a virtual environment, rather than recreating the physical world from scratch.
In turn, this has generated interest in texture synthesis methods. In computer graphics, a “texture” is a digital representation of the markings on the surface of an object. A texture also captures other qualities, such as color and brightness, and can encode transparent and reflective qualities. After a texture has been defined, it can be “wrapped” around a 3D object. This is called texture mapping. Well-defined textures are very important for rendering realistic images. However, textures generally require a lot of storage and take time to acquire. Therefore, synthetic texture generation is an important field.
Texture synthesis should be able to take a small sample of texture, ideally as small as possible, and generate an unlimited amount of image data. The synthesized texture may not be exactly like the original, but most viewers should perceive it as the same texture.
Furthermore, the method should be able to map the synthesized texture to any arbitrary model or object.
While the problem of texture analysis and synthesis from real images has had a long history in the fields of computer vision and statistics, it was not until recently that the quality of results reached a level acceptable for use in computer graphics, see David J. Heeger and James R. Bergen, “Pyramid-based texture analysis/synthesis,” SIGGRAPH '95, pages 229-238, 1995. They described a texture model in terms of histograms of filter responses at multiple scales and orientations. It turned out that matching these histograms iteratively at different spatial scales was enough to produce impressive synthetic results for stochastic textures. However, their method did not capture important relationships across scales and orientations because the histograms measure marginal, not joint, statistics. Thus, their method failed for highly structured textures.
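The histogram-matching idea can be illustrated with a minimal sketch. It is not the method of Heeger and Bergen: it matches a single list of values by rank and omits the filter pyramid and the iteration over scales. The function name `match_histogram` is hypothetical, and plain Python lists are assumed to keep the sketch free of dependencies.

```python
def match_histogram(values, reference):
    """Remap `values` so their distribution follows `reference`.

    Hypothetical helper: the smallest entry of `values` receives the
    smallest entry of `reference`, and so on by rank. Both arguments
    are assumed to be non-empty lists of numbers.
    """
    ref = sorted(reference)
    n, m = len(values), len(ref)
    # Indices of `values`, ordered from smallest to largest value.
    order = sorted(range(n), key=lambda i: values[i])
    out = [0] * n
    for rank, i in enumerate(order):
        # Scale the rank so that lists of different lengths still align.
        out[i] = ref[rank * m // n]
    return out
```

Because only the marginal distribution is imposed, the spatial arrangement of the remapped values is left untouched, which is consistent with the limitation noted above for highly structured textures.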
Several attempts have been made to extend their model to capture a wider range of textures, including J. S. De Bonet, “Multiresolution sampling procedure for analysis and synthesis of texture images,” SIGGRAPH '97, pages 361-368, 1997. De Bonet sampled from a conditional distribution over multiple scales. De Bonet's method was extended by Portilla et al., see Javier Portilla and Eero P. Simoncelli, “A parametric texture model based on joint statistics of complex wavelet coefficients,” International Journal of Computer Vision, 40(1):49-71, December 2000. They matched both first and second order properties of wavelet coefficients. While important from a theoretical point of view, neither method was successful at capturing the local detail of many structured textures.
A different approach was to directly model pixels given their spatial neighborhoods, see Alexei A. Efros and Thomas K. Leung, “Texture synthesis by non-parametric sampling,” International Conference on Computer Vision, pages 1033-1038, September 1999. They described a simple method of “growing” texture one pixel at a time. The conditional distribution of each pixel, given all its neighbors synthesized so far, was estimated by searching the sample image and finding all similar neighborhoods. That method produced very good results for a wide range of textures. However, a full search of the input image was required to synthesize every single pixel, which made the method very slow.
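The per-pixel search described above can be sketched as follows. This is a simplified illustration, not the published implementation: it uses an unweighted sum of squared differences over the already synthesized neighbors, and the tolerance `eps` and the names `best_matches` and `synthesize_pixel` are assumptions made for the example.

```python
import random


def best_matches(sample, window, known, eps=0.1):
    """Return center values of sample neighborhoods resembling `window`.

    sample: 2D list of gray values (the input texture).
    window: k x k list (k odd) around the pixel being synthesized.
    known:  k x k list of booleans, True where `window` is already filled.
    eps:    assumed tolerance; matches within (1 + eps) of the best are kept.
    """
    k = len(window)
    r = k // 2
    h, w = len(sample), len(sample[0])
    if h < k or w < k:
        raise ValueError("sample must be at least as large as the window")
    scored = []
    # Full search: every neighborhood that fits inside the sample is compared.
    for cy in range(r, h - r):
        for cx in range(r, w - r):
            ssd = 0.0
            for dy in range(k):
                for dx in range(k):
                    if known[dy][dx]:
                        d = sample[cy - r + dy][cx - r + dx] - window[dy][dx]
                        ssd += d * d
            scored.append((ssd, sample[cy][cx]))
    best = min(s for s, _ in scored)
    return [v for s, v in scored if s <= best * (1 + eps)]


def synthesize_pixel(sample, window, known, rng=random):
    """Draw one value from the estimated conditional distribution."""
    return rng.choice(best_matches(sample, window, known))
```

The nested loops visit every neighborhood of the sample for each synthesized pixel, which makes the cost noted above visible directly in the code.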
That method was accelerated by about two orders of magnitude by using a multi-scale image pyramid, and clustering pixel neighborhoods, see Li-Yi Wei and Marc Levoy, “Fast texture synthesis using tree-structured vector quantization,” SIGGRAPH 2000, pages 479-488, 2000, based on work described by Kris Popat and Rosalind W. Picard, “Novel cluster-based probability model for texture synthesis, classification, and compression,” Proc. SPIE Visual Comm. and Image Processing, 1993. However, with these optimizations, the best matching neighborhoods were frequently not found. Therefore, many textures, especially those with high frequency structure, such as images of text, were not well synthesized.
Another very simple method took random square blocks from an input texture and placed the blocks randomly onto a synthesized texture, see Xu, B. Guo, and H.-Y. Shum, “Chaos mosaic: Fast and memory efficient texture synthesis,” Technical Report MSR-TR-2000-32, Microsoft Research, April 2000. That method included alpha blending to avoid edge artifacts. While their method failed for highly structured textures, e.g., a checkerboard pattern, due to boundary inconsistencies, it worked no worse than other more complicated methods for most semi-stochastic textures.
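The idea of building a texture from randomly chosen blocks can be sketched as follows. This is a deliberate simplification and not the chaos mosaic procedure itself: the blocks are laid on a regular grid rather than at random positions, and no alpha blending is applied. The function name `random_block_texture` and its parameters are hypothetical.

```python
import random


def random_block_texture(sample, out_size, block, seed=None):
    """Fill an out_size x out_size image with random blocks of `sample`.

    sample: 2D list of pixel values (the input texture).
    block:  side length of the square blocks; must fit inside `sample`.
    """
    rng = random.Random(seed)
    h, w = len(sample), len(sample[0])
    if block < 1 or block > min(h, w):
        raise ValueError("block must fit inside the sample")
    out = [[0] * out_size for _ in range(out_size)]
    for y in range(0, out_size, block):
        for x in range(0, out_size, block):
            # Pick the top-left corner of a random source block.
            sy = rng.randint(0, h - block)
            sx = rng.randint(0, w - block)
            # Copy the block, clipping it at the output border.
            for dy in range(min(block, out_size - y)):
                for dx in range(min(block, out_size - x)):
                    out[y + dy][x + dx] = sample[sy + dy][sx + dx]
    return out
```

Because neighboring blocks are chosen independently, nothing forces their borders to agree, which is the boundary inconsistency mentioned above for patterns such as a checkerboard.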
One curious fact about the one-pixel-at-a-time synthesis method of Efros et al. was that for most complex textures very few pixels actually had a choice of values that could be assigned to them. That is, during the synthesis process most pixels had their values totally determined by what had been synthesized so far. For example, if the pattern was circles on a plane, then soon after the synthesis of a particular circle was started, all the remaining pixels of that circle, plus some surrounding ones, were completely determined. In this extreme case, the circle would be called the texture element, or texel. This same effect persisted to a lesser extent even when the texture was more stochastic, and there were no obvious texels. This meant that a lot of searching work was wasted on pixels whose “fate” had already been determined.
It could be possible that the units of synthesis should be something bigger than a pixel. If these units could somehow be determined, then the process of texture synthesis would be akin to putting together an M. C. Escher jigsaw puzzle of illusory and seamless improbable tessellations. Of course, determining precisely the sizes and shapes of these units for a given texture, and how to put the units together, strikes at the heart of texture analysis, an open problem in computer vision.
Therefore, there is still a need for a simple texture synthesis and transfer method. The method should allow one to synthesize unlimited amounts of new textures from existing textures, and to map the synthesized textures in a consistent manner. It should be possible to synthesize textures in a fast and reliable way, and in a way that lets one control the texture synthesis. For example, it should be possible to cut and paste material properties. It should also be possible to gather data for a particular texture “style” in which something should be rendered, for example, an orange peel, and then to render some other object in that style as shown in FIG. 1.