1. Technical Field
The invention is related to a system for efficiently synthesizing textures from an input sample, and in particular, to a system and method for real-time synthesis of high-quality textures using a patch-based sampling system designed to avoid potential feature mismatches across patch boundaries.
2. Related Art
Texture synthesis has a variety of applications in computer vision, graphics, and image processing. An important motivation for texture synthesis comes from texture mapping. Texture images usually come from scanned photographs, and the available photographs may be too small to cover the entire object surface. In this situation, a simple tiling will introduce unacceptable artifacts in the forms of visible repetition and seams. Texture synthesis solves this problem by generating textures of the desired sizes. Other applications of texture synthesis include various image processing tasks such as occlusion fill-in and image/video compression. Simply stated, the texture synthesis problem may be described as follows: Given an input sample texture, synthesize an output texture that is sufficiently different from the input sample texture, yet appears perceptually to be generated by the same underlying stochastic process.
For example, one conventional scheme uses a texture synthesis approach that, although based on a Markov Random Field (MRF) model of a given input texture, avoids explicit probability function construction and consequent sampling from it. This is accomplished by generating the output image pixel by pixel in scanline order, choosing at each step the pixel from the sample image whose neighborhood is most similar, with respect to a specified measure, to the currently available neighborhood in the texture being synthesized. However, this scheme suffers from several problems, including a relatively slow processing speed for generating textures, and a tendency to blur out finer details and well-defined edges for some textures. Further, this scheme also tends to run into problems in cases where the texture to be synthesized consists of an arrangement of relatively small objects such as leaves, flowers, pebbles, etc.
A related scheme expands on the aforementioned scheme by providing for verbatim copying and use of small pieces, or “patches,” of the input sample, rather than use of individual pixels, for synthesizing the output texture. Patches are sampled from a local probability density function (PDF) using a non-parametric sampling algorithm that works well for a wide variety of textures ranging from regular to stochastic. Visual masking is used to hide the seams between the patches. However, while faster than the aforementioned scheme, this scheme is also too slow to be useful for generation of textures in real-time. Further, this scheme also suffers from a problem whereby noticeable visual artifacts can be created in the output texture. Further, depending upon the input texture, in certain cases this scheme produces output textures that bear little resemblance to the input texture and thus have little or no photo-realism.
Another conventional scheme provides a special purpose texture synthesis algorithm that is well suited for a specific class of naturally occurring textures. This class includes quasi-repeating patterns consisting of small objects of familiar but irregular size, such as flower fields, pebbles, forest undergrowth, bushes and tree branches. However, while this scheme performs fairly well with “natural textures,” it performs poorly for other textures, such as textures that are relatively smooth or have more or less regular structures such as, for example, clouds, or brick walls. In addition, this scheme is also too slow to be useful for real-time synthesis of textures.
Some of the aforementioned problems with synthesis of various texture types have been addressed by another conventional scheme that uses patch-based sampling to generate textures from an input sample. In particular, this scheme works well on stochastic textures such as, for example, a sample texture input comprising a group of small pebbles. However, where the sample texture has a more or less regular structure, such as a brick wall, or a tire tread, this patch pasting algorithm fails to produce good results because of mismatched features across patch boundaries.
Therefore, what is needed is a system and method for reliably synthesizing realistic textures for a given input sample. Such texture synthesis should be capable of generating textures for a variety of input texture types ranging from regular to stochastic. Further, such a system and method should be capable of generating textures quickly enough so as to operate in real-time.
The present invention involves a new system and method that solve the aforementioned problems, as well as other problems that will become apparent from an understanding of the following description, by providing a novel approach for synthesizing textures from an input sample using patch-based sampling. A patch-based sampling system and method according to the present invention operates to synthesize high-quality textures in real-time using a relatively small input texture sample. The patch-based sampling system of the present invention works well for a wide variety of textures ranging from regular to stochastic. Further, potential feature mismatches across patch boundaries are avoided by sampling patches according to a non-parametric estimation of the local conditional Markov Random Field (MRF) density function.
The system and method of the present invention is applicable to both constrained and unconstrained texture synthesis using either regular or stochastic input textures. Examples of constrained texture synthesis include hole filling, and tileable texture synthesis. In addition, in one embodiment, as described herein, the patch-based sampling system and method of the present invention includes an intuitive randomness parameter that allows an end user to interactively control a perceived randomness of the synthesized texture.
Conventional texture synthesis schemes typically fall into one of two categories. First, one class of texture synthesis schemes computes global statistics in feature space and samples images from a texture ensemble directly. A second approach involves estimating a local conditional probability density function (PDF), then synthesizing pixels incrementally to produce an output image. The texture synthesis system and method provided by the present invention follows the second approach. Specifically, the present invention includes estimation of a local conditional PDF rather than computing global statistics in feature space and sampling images from a texture ensemble directly.
In accordance with the present invention, a Markov Random Field (MRF) is used as a texture model, and it is assumed that the underlying stochastic process is both local and stationary. The MRF is preferred because it is known by those skilled in the art to accurately model a wide range of textures. However, other more specialized conventional models, including, for example, reaction-diffusion, frequency domain, and fractals, may also be used in alternate embodiments of a system and method according to the present invention.
Note that for purposes of clarity and ease of explanation, the texture patches described herein are described as square in shape. However, it should be appreciated by those skilled in the art that any shape of texture patch, such as, for example, a rectangle, triangle, circle, oval, or any other geometric shape may be used in accordance with the system and method described herein.
Texture synthesis, according to the present invention, includes the following elements: First, the size of the texture patches that will be used for texture synthesis is determined. This determination is made either manually, or it is made automatically using conventional texture analysis techniques. Typically, as is well known to those skilled in the art, the optimum size of the texture patches is a function of the size of texture elements within the input image. However, it should be noted that, in general, as the size of the texture patch decreases, the apparent randomness of the synthesized texture increases. Similarly, as the size of the texture patch is increased, the apparent randomness of the synthesized texture will decrease.
Next, a starting patch, which is simply a randomly chosen texture patch from the input image sample, is pasted into one corner of an output image that will form the synthesized texture. A set of texture patches is then formed from the input image sample such that a “boundary zone” of predetermined width along the edge of each texture patch in the set matches a corresponding, overlapping, boundary zone of the randomly chosen starting patch within an adjustable “distance.” The “distance” between boundary zones is determined as a function of the similarity between the image elements comprising corresponding overlapping boundary zones. Smaller distances indicate closer matches. Any of a number of conventional distance metrics may be used for measuring similarity between boundary zones.
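By way of illustration only, one possible boundary zone distance may be sketched in Python as follows. The root-mean-square metric, the grayscale list-of-rows pixel format, and the function name boundary_distance are assumptions made for this sketch; any conventional distance metric could be substituted.

```python
import math


def boundary_distance(zone_a, zone_b):
    """Root-mean-square difference between two equally sized boundary zones.

    Each zone is a list of rows of grayscale pixel values. Both the pixel
    format and the RMS metric are assumptions made for illustration.
    """
    same_rows = len(zone_a) == len(zone_b)
    if not same_rows or any(len(ra) != len(rb) for ra, rb in zip(zone_a, zone_b)):
        raise ValueError("boundary zones must have the same shape")

    total = 0.0
    count = 0
    for row_a, row_b in zip(zone_a, zone_b):
        for pixel_a, pixel_b in zip(row_a, row_b):
            total += (pixel_a - pixel_b) ** 2
            count += 1
    # An empty zone is treated as a perfect match.
    return math.sqrt(total / count) if count else 0.0
```

A smaller return value indicates a closer match, consistent with the description above.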
If no image patches having a distance less than a predetermined maximum distance can be found for the set of texture patches, then the set of texture patches will be empty, and a texture patch having the closest boundary zone match is pasted adjacent to the previously pasted patch, with the corresponding boundary zones of each texture patch overlapping. However, if the set of texture patches is not empty, then a texture patch is randomly selected from the set of texture patches. This randomly selected texture patch is then pasted adjacent to the previously pasted texture patch, again, with the corresponding boundary zones of each patch overlapping.
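The selection rule described above may be sketched as follows. The function name choose_patch, the representation of candidates as a list of precomputed distances, and the optional rng argument are hypothetical and are introduced only for this example.

```python
import random


def choose_patch(distances, max_distance, rng=random):
    """Return the index of the texture patch to paste next.

    distances[i] is the boundary zone distance of candidate patch i.
    Patches closer than max_distance form the set; one is drawn at random.
    If the set is empty, the closest candidate is used instead.
    """
    if not distances:
        raise ValueError("at least one candidate patch is required")

    matching = [i for i, d in enumerate(distances) if d < max_distance]
    if matching:
        return rng.choice(matching)
    # Empty set: fall back to the patch with the closest boundary zone match.
    return min(range(len(distances)), key=distances.__getitem__)
```

Under this sketch, a larger max_distance admits more candidates into the set, which tends to make the choice among patches less constrained.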
Once the texture patch has been pasted, the newly pasted patch is then used as the basis for creating a new set of texture patches having matching boundary zones, as described above. Again, if the set is empty, the texture patch having the closest boundary zone match is pasted adjacent to the previously pasted patch, with boundary zones again overlapping, as described above. If the set is not empty, a patch is randomly chosen from the set and pasted adjacent to the previously pasted patch, with boundary zones again overlapping, as described above.
The steps described above are repeated, with the patch pasting proceeding in scan-line type fashion, beginning in one corner of the synthesized image, and proceeding on a row-by-row basis until the synthesized texture has been completely filled with texture patches. Further, with respect to the overlapping boundary zones between patches, conventional blending operations are performed to smooth, feather, or average the observed transition between pasted texture patches. It should be noted that this blending may be performed either after each individual pasting operation, or, in an alternate embodiment, after all pasting operations have been completed.
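As one illustration of the blending step, a linear feather across a single row of an overlapping boundary zone may be sketched as follows. The linear weighting and the name feather_row are assumptions; the description above permits any conventional blending operation.

```python
def feather_row(existing_pixels, new_pixels):
    """Linearly cross-fade one row of an overlapping boundary zone.

    existing_pixels come from the previously pasted patch and new_pixels
    from the newly pasted patch. The weight of the new patch rises from
    near 0 at the start of the overlap to near 1 at its end.
    """
    n = len(existing_pixels)
    if n != len(new_pixels):
        raise ValueError("overlapping rows must have the same length")

    blended = []
    for i, (old, new) in enumerate(zip(existing_pixels, new_pixels)):
        weight_new = (i + 1) / (n + 1)
        blended.append((1 - weight_new) * old + weight_new * new)
    return blended
```

Applying this function to every row of a vertical boundary zone, and analogously to every column of a horizontal one, yields a feathered transition between adjacent patches.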
Additionally, it should be noted that while, in one embodiment, the texture synthesis by patch pasting proceeds in a scan-line type fashion, as described above, other pasting orders are also useful in certain circumstances. For example, with respect to occlusion filling, more realistic results are achieved by spiral pasting of patches within the occlusion, beginning from the outside edge of the occlusion, and then working around the edge and into the center of the occlusion until the occlusion is filled. In another embodiment, patch pasting proceeds in a scan-line type order, but on a column-by-column basis, rather than on a row-by-row basis.
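A spiral pasting order of the kind described above may be sketched as follows for a rectangular grid of patch positions. The clockwise direction, the top-left starting cell, and the name spiral_order are assumptions made for this example; an actual occlusion need not be rectangular.

```python
def spiral_order(rows, cols):
    """Return (row, col) patch positions from the outer edge to the center.

    The grid is walked clockwise, one ring at a time, starting at the
    top-left cell.
    """
    order = []
    top, left, bottom, right = 0, 0, rows - 1, cols - 1
    while top <= bottom and left <= right:
        for c in range(left, right + 1):          # top edge, left to right
            order.append((top, c))
        for r in range(top + 1, bottom + 1):      # right edge, downward
            order.append((r, right))
        if top < bottom:
            for c in range(right - 1, left - 1, -1):  # bottom edge, right to left
                order.append((bottom, c))
        if left < right:
            for r in range(bottom - 1, top, -1):      # left edge, upward
                order.append((r, left))
        top, left, bottom, right = top + 1, left + 1, bottom - 1, right - 1
    return order
```

A column-by-column scan-line order would instead be obtained by iterating over columns in the outer loop and rows in the inner loop.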
Further, it should also be noted that each particular patch might need to match several boundary zones. For example, as a second row of patch pasting is begun, when using a scan-line type patch pasting order, boundary zones of patches in the second row, beginning with the second patch in the second row, will need to match both the boundary zones of the patch pasted immediately prior to the current patch, as well as the corresponding boundary zone of the patch in the prior row. Further, when pasting patches in a spiral order, as with occlusion filling, or where it is desired to synthesize a tileable texture, particular patch boundary zones will need to match anywhere from one to four corresponding boundary zones, assuming a square patch. The number of boundary zones required to be matched is a simple function of how many adjacent patch boundary zones the pasted texture patch must match, and therefore simply depends upon where the patch is being pasted within the synthesized image. Consequently, when generating the set from which patches are randomly selected for each patch pasting operation, the distance to each corresponding boundary zone must be determined to ensure that particular texture patches will match all surrounding texture patches that have already been pasted into the output image forming the synthesized texture.
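For the row-by-row scan-line case, the boundary zones that a patch must match may be sketched as follows. The zone labels and the helper names zones_to_match and matches_all are hypothetical, and the sketch assumes square patches pasted left to right beginning in the top-left corner.

```python
def zones_to_match(row, col):
    """List the boundary zones a patch at grid cell (row, col) must match
    when patches are pasted row by row, left to right."""
    zones = []
    if col > 0:
        zones.append("left")  # patch pasted immediately before, same row
    if row > 0:
        zones.append("top")   # patch directly above, in the prior row
    return zones


def matches_all(zone_distances, max_distance):
    """True when every required boundary zone is within the distance limit."""
    return all(d < max_distance for d in zone_distances)
```

A candidate patch would be admitted to the set only when matches_all holds for the distances computed against each zone returned by zones_to_match.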
As noted above, a set of matching texture patches is generated for each pasting operation. Real-time texture synthesis is, in part, achieved by use of an acceleration system for searching and choosing patches to be pasted from the initial input image. In general, the core computation in the patch-based sampling of the present invention can be formulated as a search for approximate nearest neighbors (ANN) to identify potentially matching texture patches. This search is accelerated to provide real-time texture synthesis by combining an optimized technique for a general ANN search with a novel data structure called the “quad-tree pyramid” for ANN search of images and principal component analysis of the input sample texture.
In general, real-time texture synthesis is achieved by acceleration of an ANN search for texture patches to fill the set from which patches are randomly selected. ANN searches are much faster, but less exact than brute force searching methods. However, conventional ANN searches are still too slow to allow for real-time texture synthesis using an average personal computer or the like. It is possible to accelerate an ANN search to provide for faster selection of matching texture patches. However, it is important to avoid acceleration techniques that will introduce noticeable artifacts in synthesized textures. With this principle in mind, the ANN search used by the present invention is accelerated at three levels that avoid the introduction of noticeable artifacts into the synthesized texture. Each of these three acceleration levels is used either individually, or in combination in alternate embodiments of the present invention.
In particular, a first level of acceleration of the ANN search is achieved using an optimized kd-tree. For patch-based sampling, this optimized kd-tree performs as well as a conventional bd-tree, which itself is optimal for ANN searching, but which introduces more artifacts into the synthesized texture than does the optimized kd-tree. Next, a second level of acceleration is introduced which utilizes the aforementioned quad-tree pyramid (QTP) to accelerate the ANN search by making use of the fact that the data points in the ANN search space are images. Finally, a conventional principal components analysis (PCA) is used to accelerate the search for texture patches within the given input sample texture. As noted above, each of the individual levels of acceleration can be combined to produce a compounded acceleration of the ANN search.
In general, the QTP accelerates the ANN search by providing the capability to perform hierarchical searches of image data to identify texture patches for populating the patch sets. As noted above, individual texture patches are then randomly chosen from these sets for each pasting operation. In general, the QTP provides a multilevel pyramid representation of the input image on which the texture synthesis is to be based. Unlike a conventional Gaussian pyramid, every set of four pixels in each lower level has a corresponding pixel in the next higher level. Note that each successively higher level of the QTP represents a successively lower resolution than each previous level. Successive levels of the QTP are generated by filtering the input image to generate successively lower resolution copies of the input image, with each lower resolution copy of the input image comprising successively lower resolution data points, e.g., image pixels.
The QTP operates to accelerate ANN searches by finding approximate nearest neighbors (ANNs) for a query vector v. First, m initial candidates, i.e., m potential texture patch matches, are identified using the low resolution data points (image pixels) and the query vector v. In general, an m much smaller than n is chosen, with n representing the number of data points. In a working example of the present invention, an m=40 was used. Increasing m tends to increase the total number of nearest neighbors eventually identified, but it also serves to increase the time necessary for conducting searches. From the initial candidates, the k nearest neighbors are identified using the high-resolution query vector v along with the data points.
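The two-stage idea described above, namely selecting m initial candidates at low resolution and then ranking them at high resolution, may be sketched as follows. This sketch scans the low resolution points exhaustively rather than descending a tree, so it illustrates only the candidate filtering and not the QTP itself. The squared Euclidean distance and all function and parameter names are assumptions.

```python
import heapq


def squared_distance(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def two_stage_search(query_hi, query_lo, points_hi, points_lo, m, k):
    """Return indices of k approximate nearest neighbours of the query.

    points_lo[i] is a low resolution version of points_hi[i], and
    query_lo is a low resolution version of query_hi.
    """
    n = len(points_hi)
    # Stage 1: keep the m candidates closest to the query at low resolution.
    candidates = heapq.nsmallest(
        m, range(n), key=lambda i: squared_distance(points_lo[i], query_lo)
    )
    # Stage 2: rank only those candidates using the high resolution data.
    return heapq.nsmallest(
        k, candidates, key=lambda i: squared_distance(points_hi[i], query_hi)
    )
```

Because only m of the n data points are compared at high resolution, a true nearest neighbor can be missed when its low resolution version is not among the m closest, which is why the result is approximate.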
In order to accelerate the search of the m initial candidates, the input sample texture is simply filtered, such as, for example, by downsampling the image, into one or more successively lower resolution copies of the input image. A tree pyramid is then built using the successively lower resolution images. Each tree node in the QTP is a pointer to an image patch and each tree level corresponds to a level in the pyramid, with the root of the tree being the initial input sample texture. When moving from one level of the pyramid to the next lower resolution level, four children (lower resolution images) are computed, with different shifts along the x- and y-axis (all four directions) for each child to ensure that each pixel or patch in a given child corresponds to a patch in the next lower level (higher resolution) child of the input image. As noted above, this shift ensures that, unlike a conventional Gaussian pyramid, each pixel in a given child has four corresponding pixels in the next higher resolution image, e.g., the filtered image on the next lower level of the pyramid.
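The computation of four shifted children from one pyramid level may be sketched as follows. The 2x2 box filter, the one-pixel shifts, the grayscale list-of-rows image format, and the name shifted_children are all assumptions made for illustration; the description above specifies only that each child is a filtered, lower resolution copy computed with a different shift.

```python
def shifted_children(image):
    """Return four half-resolution copies of image, keyed by (dy, dx) shift.

    Each child pixel is the average of a 2x2 block of the parent image.
    The four children differ only in where the first block begins, so
    every 2x2 block position in the parent is covered by some child.
    """
    height, width = len(image), len(image[0])
    children = {}
    for dy in (0, 1):
        for dx in (0, 1):
            child = []
            for y in range(dy, height - 1, 2):
                row = []
                for x in range(dx, width - 1, 2):
                    block_sum = (image[y][x] + image[y][x + 1]
                                 + image[y + 1][x] + image[y + 1][x + 1])
                    row.append(block_sum / 4.0)
                child.append(row)
            children[(dy, dx)] = child
    return children
```

Applying the same function recursively to each child would produce the next, lower resolution level of the tree.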
Initial candidates are selected by first comparing filtered image patches at the highest level of the QTP to a filtered copy of the previously pasted texture patch. Potential matches are then followed down through the tree to identify potential texture patch matches to the previously pasted texture patch. In particular, the k ANN texture patches are identified by following the pyramid down to the root of the QTP and computing the distance of potential texture patch matches to see if they actually match the previously pasted texture patch or patches. Consequently, the total number of potential searches is dramatically reduced with each successively higher level of the QTP. This dramatic reduction in the number of potential searches serves to drastically reduce the time necessary to identify texture patch matches for pasting into the synthesized texture, thereby facilitating real-time texture synthesis.
It should be noted that, in the extreme, as the number of levels of the QTP is increased, the reduced resolution of the higher levels will cause more and more potential texture patch matches to be missed. In a working example of the present invention, it was found that a three level QTP produced very good results while drastically reducing search time for identifying texture patch matches for pasting into the synthesized texture in comparison to other conventional patch-based texture synthesis schemes.