This invention relates to identifying intrinsic pixel colors and pixel opacities in a region of uncertain pixels.
A common task in the manipulation of digital images is the removal of one or more foreground objects from a scene and the composition of these objects with a new background image. This is typically a difficult task for several reasons:
1) blending of an object with the background scene: a pixel at an edge of an object may have contributions from both the foreground and the background, so that its color is a blend of the two regions;
2) object complexity: even for objects with hard edges, the object border often contains detail that requires tedious effort to define manually; and
3) combinations of 1) and 2): an example is hair or fur, where the shapes are complex and regions with thin fibers lead to color blending.
In general, the problem does not have a simple unambiguous solution. The movie industry has handled this by simplifying the scene, by filming objects or people against a simple background (blue screen) having as uniform a color as possible. Techniques have been developed to produce approximate solutions in this situation. Software products that can be used to mask an object require a great deal of manual effort for complex objects such as subjects with hair. Existing products also enable a degree of color extraction from simplified background scenes by applying operations to the color channels.
In general, in one aspect, the invention features processing a digital image that includes first and second regions by estimating an intrinsic color of a given pixel located in an area of interest that is adjacent to at least one of the first and second regions. The estimating includes extrapolating from colors of multiple pixels in one of the first and second regions and multiple pixels in the other of the two regions.
Implementations of the invention may include one or more of the following features. The original color of the given pixel relates to the original colors of pixels in both the first and second regions. The estimated intrinsic color of the given pixel relates to original colors in only one or the other of the first and second regions. The area of interest includes one of the first and second regions; or is adjacent to both of the first and second regions. The first region is a foreground object and the second region is a background.
The first and second regions have any arbitrary degree of color variation in the visible spectrum over a spatial scale that is on the same order of magnitude or smaller than the minimum span of the area of interest. The estimating includes analyzing both the color and spatial proximity of pixels in the first and second regions.
The estimating includes extrapolating from the closest pixels in the first and second regions; or flowing colors into the area of interest from one or both of the first and second regions. The flowing of colors includes averaging of color values for each of a set of pixels in the first region and a set of pixels in the second region. The digital image includes layers of pixel information and the estimating is based on pixel information in only one of the layers; or in other implementations on pixel information in a composition of all the layers.
An opacity value is determined for the given pixel, indicative of the extent to which the intrinsic color of the given pixel relates to original colors in the first and second regions, based on a result of the estimating of the intrinsic color. The given pixel includes original opacity information, and the opacity value is also based on the original opacity information. In some implementations the opacity determination includes use of a neural network trained on the image original colors and estimated intrinsic colors. The opacity values are used to composite one of the first and second regions with another digital image.
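The compositing step mentioned above can be illustrated with a minimal sketch. It assumes the conventional per-channel blend in which the result is the opacity times the intrinsic foreground color plus one minus the opacity times the new background color; the function name `composite` and the tuple representation of colors are illustrative assumptions, not part of the described implementations.

```python
def composite(foreground, new_background, alpha):
    """Blend one intrinsic foreground color over a new background color.

    foreground, new_background: color tuples with matching channel counts.
    alpha: opacity in [0, 1]; 1 keeps the foreground, 0 keeps the background.
    (Assumed standard blend; the text only states that opacities are used
    for compositing.)
    """
    return tuple(alpha * f + (1.0 - alpha) * b
                 for f, b in zip(foreground, new_background))


if __name__ == "__main__":
    # A half-transparent orange pixel placed over a blue background.
    print(composite((200, 100, 0), (0, 100, 200), 0.5))
```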
The estimating includes extrapolating estimates of intrinsic colors of the first and second regions using searches in color space and image coordinate space. The estimating assumes a linear blending model. The estimating includes flowing colors from edges of the area of interest to fill the area of interest with estimates of the colors of the first and second regions.
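As an illustration of the linear blending model, the following sketch assumes that an observed color c is modeled as c = alpha * f + (1 - alpha) * b, where f and b are the estimated intrinsic foreground and background colors. Solving for alpha by projecting c onto the line from b to f is one plausible least-squares choice and is an assumption here; the name `estimate_opacity` is hypothetical.

```python
def estimate_opacity(c, f, b):
    """Estimate opacity under an assumed linear blend c = alpha*f + (1-alpha)*b.

    c: observed color; f, b: estimated intrinsic foreground/background colors.
    Returns the least-squares alpha, clamped to [0, 1].
    """
    d = [fi - bi for fi, bi in zip(f, b)]   # direction from b to f
    denom = sum(di * di for di in d)
    if denom == 0:
        # f and b coincide, so alpha is undetermined; returning 1.0 is an
        # arbitrary choice made for this sketch.
        return 1.0
    num = sum((ci - bi) * di for ci, bi, di in zip(c, b, d))
    return min(1.0, max(0.0, num / denom))


if __name__ == "__main__":
    print(estimate_opacity((100, 100, 100), (200, 100, 0), (0, 100, 200)))
```

Clamping keeps the result a valid opacity when the observed color falls outside the segment between the two estimated colors.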
Estimating the intrinsic color includes determining two color sample sets for the given pixel, each of the color sample sets being associated with one of the first and second regions, and estimating the intrinsic color based on the two color sample sets. The original color of the given pixel is compared with colors in the color sample sets. A single color is selected from each of the color sample sets based on an error minimization technique.
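The selection of a single color from each sample set can be sketched as follows. The sketch assumes a linear blend, an exhaustive search over all foreground/background pairs, and a squared color-distance error; these specifics and the name `best_pair` are assumptions for illustration, since the text states only that an error minimization technique is used.

```python
from itertools import product


def best_pair(c, fg_samples, bg_samples):
    """Pick one foreground and one background sample that best explain c.

    For each candidate pair, the best-fitting opacity under an assumed
    linear blend is computed, and the pair with the smallest squared
    residual is returned as (foreground, background, alpha).
    """
    best = None
    for f, b in product(fg_samples, bg_samples):
        d = [fi - bi for fi, bi in zip(f, b)]
        denom = sum(di * di for di in d)
        if denom == 0:
            alpha = 1.0  # degenerate pair; arbitrary choice for this sketch
        else:
            num = sum((ci - bi) * di for ci, bi, di in zip(c, b, d))
            alpha = min(1.0, max(0.0, num / denom))
        blend = [alpha * fi + (1.0 - alpha) * bi for fi, bi in zip(f, b)]
        err = sum((ci - xi) ** 2 for ci, xi in zip(c, blend))
        if best is None or err < best[0]:
            best = (err, f, b, alpha)
    return best[1], best[2], best[3]


if __name__ == "__main__":
    fg = [(200, 100, 0), (255, 255, 0)]
    bg = [(0, 100, 200), (0, 0, 0)]
    print(best_pair((100, 100, 100), fg, bg))
```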
In general, in another aspect, the invention features enabling a user to paint an area of the digital image to identify at least an area of interest adjacent to at least one of a first region and a second region. After the user has defined the area of interest, the intrinsic colors of pixels in the area of interest are estimated based on color information for pixels in the first region and the second region.
Implementations of the invention may include one or more of the following features. The painting is done with a brush tool that can be configured by the user. The painted area can be built up by repeated painting steps and portions of the painted area can be erased by the user interactively. The user paints the area of interest and separately identifies a location that is in one of the first and second regions. Or the user paints at least one of the first and second regions and the area of interest and separately identifies a color associated with one of the first and second regions. The user designates one of the first and second regions by identifying a pixel location in that region. The user identifies the color by applying an eyedropper tool to one pixel or a set of pixels in the one region. One of the regions is flood filled based on the identified pixel location to designate that region as a foreground. The painted area may be modified by a user interactively and repeatedly. The user is enabled to paint additional areas of interest between other pairs of first and second regions.
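The flood fill from a user-identified pixel location can be sketched as below. The sketch assumes a label grid in which painted (uncertain) pixels carry a distinct label and therefore act as a barrier, and it uses 4-connected breadth-first filling; the label values and the name `flood_fill` are illustrative assumptions.

```python
from collections import deque


def flood_fill(labels, seed, fill_value):
    """Fill the 4-connected region containing seed with fill_value, in place.

    labels: list of lists of integer labels (assumed encoding, e.g.
            0 = unassigned, 2 = painted area of interest).
    seed:   (row, col) of the pixel identified by the user.
    """
    rows, cols = len(labels), len(labels[0])
    r0, c0 = seed
    target = labels[r0][c0]
    if target == fill_value:
        return labels
    labels[r0][c0] = fill_value
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols and labels[nr][nc] == target:
                labels[nr][nc] = fill_value
                queue.append((nr, nc))
    return labels


if __name__ == "__main__":
    grid = [
        [0, 0, 0, 0, 0],
        [0, 2, 2, 2, 0],
        [0, 2, 0, 2, 0],
        [0, 2, 2, 2, 0],
        [0, 0, 0, 0, 0],
    ]
    # Clicking inside the painted ring marks only the enclosed pixel as
    # foreground (label 1); pixels outside the ring are left unassigned.
    for row in flood_fill(grid, (2, 2), 1):
        print(row)
```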
In general, in another aspect, the invention features receiving a mask associated with an area of interest in a digital image, the mask including values representing opacities of pixels in the region of interest with respect to an adjacent region of interest. Intrinsic colors for the pixels are estimated based on the mask.
In general, in another aspect, the invention features enabling a user to control an original extraction by manipulating a brush on a display of the image, enabling the user to control a touch up extraction following the original extraction, and considering a pixel identified for touch up extraction only if the pixel was of uncertain color in the original extraction.
Implementations of the invention may include one or more of the following features. An intrinsic color is determined for each of the pixels that were of uncertain color based on a forced foreground or background color. The forced color is selected by the user or is determined automatically from the original colors within the foreground region.
In general, in another aspect, the invention features determining, for each pixel in an area of interest in a digital image, the nearest pixel in a first region of the image that is adjacent to the area of interest and the nearest pixel in a second region of the image that is adjacent to the area of interest. A processing area is defined that is smaller than the image. A pixel window is defined that is smaller than the defined processing area. The processing area is scanned at a succession of overlapping positions that together span the image. At each overlapping position of the processing area, the pixel window is scanned across the processing area. At each position of scanning of the pixel window, stored information for pixels in the window is updated, the stored information relating to nearest pixels in the first and second regions.
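The idea of updating stored nearest-pixel information while scanning a small window can be illustrated with a simplified sketch. It operates on the whole image at once, without the memory-saving processing area, and uses one forward and one backward raster pass with a 3 by 3 neighborhood; this is a common approximate propagation scheme assumed here for illustration, and the name `nearest_region_pixels` is hypothetical. The same routine would be run once for each of the two regions.

```python
def nearest_region_pixels(region_mask):
    """Approximate, for every pixel, the nearest pixel belonging to a region.

    region_mask: list of lists of 0/1 flags marking the region's pixels.
    Returns a grid of (row, col) coordinates, or None where no region pixel
    was reachable. Results are approximate, as with other two-pass
    propagation schemes.
    """
    rows, cols = len(region_mask), len(region_mask[0])
    nearest = [[(r, c) if region_mask[r][c] else None for c in range(cols)]
               for r in range(rows)]

    def dist2(r, c, p):
        return (r - p[0]) ** 2 + (c - p[1]) ** 2

    def relax(r, c, offsets):
        # Adopt a neighbor's stored nearest pixel if it is closer than ours.
        for dr, dc in offsets:
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                cand = nearest[nr][nc]
                cur = nearest[r][c]
                if cand is not None and (
                        cur is None or dist2(r, c, cand) < dist2(r, c, cur)):
                    nearest[r][c] = cand

    forward = ((-1, -1), (-1, 0), (-1, 1), (0, -1))
    backward = ((1, 1), (1, 0), (1, -1), (0, 1))
    for r in range(rows):                      # forward pass
        for c in range(cols):
            relax(r, c, forward)
    for r in reversed(range(rows)):            # backward pass
        for c in reversed(range(cols)):
            relax(r, c, backward)
    return nearest


if __name__ == "__main__":
    print(nearest_region_pixels([[1, 0, 0, 0, 1]]))
```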
Implementations of the invention may include one or more of the following features. The processing area includes a rectangle twice as long as it is high, and in each of the succession of positions the processing area is offset from the prior position by half the length of the rectangle. The pixel window includes a square. The scanning of the processing area and the scanning of the pixel window occur in both forward and backward passes that span the image.
In general, in another aspect, the invention features a method for a user to extract an object from a background in an image. The image is displayed. A painting tool is selected and its characteristics adjusted. The painting tool is used to paint a swath around the object. The swath includes pixels whose membership in the object or the background is uncertain and also includes pixels that belong with certainty to the object or to the background. At least one pixel is marked that is known to belong to the object or the background. A program is invoked to perform the extraction. The quality of the extraction is observed. Depending on the observation, a painting tool is used to control a touch-up extraction.
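The succession of overlapping processing-area positions can be sketched as follows, assuming that the long side of the rectangle runs along the image width and that positions advance from left to right; the orientation, the function name `processing_area_offsets`, and the handling of the final position are assumptions made for illustration.

```python
def processing_area_offsets(image_width, area_height):
    """List the left-edge offsets of successive processing-area positions.

    The area is a rectangle twice as long as it is high, and each position
    is offset from the prior one by half the rectangle's length, so that
    consecutive positions overlap by half. The last area may extend past
    the image edge and would be clipped by the caller (assumed behavior).
    """
    area_length = 2 * area_height
    step = area_length // 2
    offsets = []
    x = 0
    while True:
        offsets.append(x)
        if x + area_length >= image_width:
            break
        x += step
    return offsets


if __name__ == "__main__":
    print(processing_area_offsets(100, 16))
```

With half-length offsets, every column of the image other than those at the extreme edges is covered by two consecutive positions, which lets information propagated in one position carry into the next.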
Complex objects in complex scenes can be accurately extracted, dropping out the background pixels to zero opacity (totally transparent). Objects with complex topologies (lots of holes) can be extracted. A simple user interface allows the user to select all of the regions that are to be designated as foreground by an intuitive process of clicking the mouse over each region, obtaining immediate visual feedback of the selected regions. Only a small fraction of the memory needed to store the image is required to be resident in the computer's random access memory (RAM) at any given time. This is a key advantage over more obvious approaches to solving this problem, which require storing and processing data whose size is comparable to multiple copies of the image. For example, a 5000 by 5000 pixel RGB image with transparency information contains approximately 100 megabytes (MB) of data. More obvious implementations of the methods might require storing several hundred MB in RAM at once. The preferred embodiment of this invention requires less than 2 MB, and this requirement can be decreased even further in alternative embodiments. The method achieves an effective balance between speed of operation and memory requirements. More obvious implementations are either much slower (and scale poorly as the image size is increased) or require much more RAM. The user has the flexibility to highlight the object in one step as well as the ease of modifying the outline by erasing or by adding additional paint. In some implementations, the user need not preselect the foreground and background colors. The masking and extracting of objects from digital images is achieved with high accuracy. Multiple objects can be extracted from an image in a single step.
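The image-size figure quoted above can be checked with a short calculation. The sketch assumes four channels (red, green, blue, and transparency) at one byte per channel, which the text does not state explicitly; the function name `image_bytes` is illustrative.

```python
def image_bytes(width, height, channels=4, bytes_per_channel=1):
    """Return the uncompressed size of an image in bytes.

    Defaults assume RGB plus a transparency channel at 8 bits each
    (an assumption; the text gives only the approximate total).
    """
    return width * height * channels * bytes_per_channel


if __name__ == "__main__":
    size = image_bytes(5000, 5000)
    print(size, "bytes, or", size / 1_000_000, "MB")
```

Under these assumptions the image occupies 100,000,000 bytes, so holding even a few full-size working copies in RAM would reach several hundred MB.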