1. Field of the Invention
The present invention generally relates to the field of graphics and image processing and, more particularly, to controlled and meaningful shape modifications. This shape modification has potential applications in the field of user assisted image manipulation and in the framework of information search and retrieval with emphasis on retrieving images which contain objects of shape similar to the query object shape.
2. Background Description
One of the classical dilemmas of content-based image retrieval schemes concerns the choice between what the system thinks the user is interested in and what the user is actually interested in. It is the intent of search procedures to elicit responses from the user that reduce this dilemma. The computer usually stores the features of the image in some numerical form, and the search is performed according to some measure of similarity between the query image, with which the user starts the search, and the images stored in a database. The difference between the user's subjective notion of similarity and the objective similarity function used by the computer gives rise to this inconsistency.
There is a need to provide users with choices that they may not have realized were available. For example, if a user expresses an interest in a graphical object, how should a search engine broaden the scope of the search? One possibility is to devise a query scheme eliciting further responses to related stimuli, both graphical and otherwise.
In our view, a better approach is to modify the original graphical stimulus in multiple directions and learn from the user's responses to such modifications. Our approach is thus bottom up, instead of top down. In the process, we can develop a tool by which human visual perception can be learned both in user-free (common architectures for image retrieval) and user-driven (learning algorithms from user perceptions) dimensions. This composite view of learning from images is what the present invention is about.
Many systems implement ways of retrieving images based on the similarity of the shape of an object of interest in the query image. The images in the database are segmented, and each segment is assumed to correspond to an object. The shape of an object is given by the region which encloses the object and is stored in the database as the points on the boundary. Sometimes features of interest are derived from these points, joining which we can get the polygonal approximation to the shape. These points could be the high-curvature points on the boundary. In similar-shape retrieval systems, the search is performed by starting with an image, segmenting it and choosing a particular segment from that image, or with a sketch which the user draws. We shall call this the query shape. A general overview of similar-shape retrieval systems is given in "Similar-Shape Retrieval in Shape Data Management", by Rajiv Mehrotra and James E. Gary, in IEEE Computer Magazine, 28(9), September 1995.
The shape of the object is represented in some numerical form. Most commonly used shape representations include region-based descriptors like moment invariants, or boundary-based Fourier descriptors. Details and references to the methods of moment invariants and Fourier descriptors for the shape can be found in the book Digital Image Processing by R. C. Gonzalez and R. E. Woods, Addison-Wesley. Early references to Fourier descriptors can be found in "Fourier descriptors for plane closed curves", by C. Zahn and R. Roskies, IEEE Trans. on Comput., vol. C-21, no. 3, March 1972. Given a shape representation scheme, two shapes are compared for closeness using some similarity measure. The similarity measure inputs two shape representative numerical vectors and outputs one number, which gives the closeness of the two shapes.
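As an illustration of such a similarity measure, the following minimal sketch uses the Euclidean distance between two feature vectors. The choice of Euclidean distance, the function name shape_distance and the sample descriptor values are assumptions made for illustration only; the text above does not prescribe a particular measure.

```python
import math

def shape_distance(u, v):
    """Euclidean distance between two equal-length shape feature vectors.

    Smaller values mean the two shapes are closer under this (assumed) measure.
    """
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical descriptor vectors for a query shape and a database shape.
query = [0.20, 0.05, 0.01]
candidate = [0.23, 0.01, 0.01]
print(shape_distance(query, candidate))
```

A retrieval system built this way would rank database shapes by increasing distance from the query shape.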
We first describe the invention in the specific context of modifying a single shape. For the technical purposes of this description, a shape is the boundary of the two-dimensional image of an object. A shape is coded as an ordered sequence of two-dimensional points. We can find these points of interest on the boundary of the object by taking, for example, the high-curvature points, joining which we can get a polygonal approximation of the object shape. In our invention we shall assume that the original shape has N interest points and this ordered set of points is written as complex numbers (x1, x2, . . . , xN). We shall suppose that we have at our disposal an algorithm that can extract a shape that the user finds interesting and compute this set.
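The coding of a shape as an ordered sequence of complex numbers can be sketched as follows. The turning-angle test used here to pick high-curvature points is only one possible stand-in for the interest-point algorithm that the description assumes to be available; the function names and the threshold are hypothetical, and the sketch assumes that no two consecutive points coincide.

```python
import cmath

def to_complex(points):
    """Encode ordered (x, y) boundary points as complex numbers x + iy."""
    return [complex(x, y) for x, y in points]

def turning_angles(z):
    """Absolute turning angle (radians) at each vertex of a closed polygon."""
    n = len(z)
    angles = []
    for i in range(n):
        incoming = z[i] - z[i - 1]          # edge arriving at vertex i
        outgoing = z[(i + 1) % n] - z[i]    # edge leaving vertex i
        angles.append(abs(cmath.phase(outgoing / incoming)))
    return angles

def interest_points(z, min_angle):
    """Keep only the vertices whose turning angle is at least min_angle."""
    return [p for p, a in zip(z, turning_angles(z)) if a >= min_angle]

# A square with one extra point in the middle of its bottom edge.
boundary = to_complex([(0, 0), (1, 0), (2, 0), (2, 2), (0, 2)])
corners = interest_points(boundary, min_angle=0.1)
```

In this example the mid-edge point has zero turning angle and is discarded, leaving the four corners as the polygonal approximation.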
At the center of our invention is the distinction made between two kinds of features this shape may have.
1. Macrofeatures: We give this name to features that characterize the entire shape. For example, a tree has a trunk and branches as macrofeatures. Transformations of the shape that transform these macrofeatures are termed global deformations.
2. Microfeatures: We give this name to features that describe only a region of the shape. In the example of the tree, the arrangement of leaves on a branch is a microfeature and differs from branch to branch. Such features are transformed by local deformations.
We have devised a strategy to distinguish between these two classes of features and deformations.
Step 1: Describe or represent the shape in a suitable coordinate system. The choice of this coordinate system (or basis) is central to the invention and will be further discussed for a specific implementation.
Step 2: In the new coordinate system, the macrofeatures are captured by the first few coordinates, with the remaining coordinates representing microfeatures. Precisely how many coordinates represent macrofeatures can be user-driven or defined by some class that the shape belongs to.
Step 3: Once we have the shape feature projections onto the macrofeature dimensions and the microfeature dimensions, we deform the macrofeatures. These deformations are global deformations. Then we combine the microfeatures with the deformed macrofeatures to get the deformed object. This is the principal contribution of the invention.
Steps 1 to 3 are described mathematically as follows:
The coordinates of the points of the shape can be written in terms of basis vectors of another coordinate system as follows,
\[
\begin{bmatrix} x_i \\ y_i \end{bmatrix}
= \sum_{j=1}^{n}
\begin{bmatrix} a_{ij}^{xx} & a_{ij}^{xy} \\ a_{ij}^{yx} & a_{ij}^{yy} \end{bmatrix}
\begin{bmatrix} b_j^{x} \\ b_j^{y} \end{bmatrix},
\quad i = 1, \ldots, n,
\]
where the basis vectors are
\[
b_j = \begin{bmatrix} b_j^{x} \\ b_j^{y} \end{bmatrix},
\quad j = 1, \ldots, n.
\]
Macrofeatures (global) are features of the shape, which are given by
\[
\begin{bmatrix} x_i^{(g)} \\ y_i^{(g)} \end{bmatrix}
= \sum_{j=1}^{m}
\begin{bmatrix} a_{ij}^{xx} & a_{ij}^{xy} \\ a_{ij}^{yx} & a_{ij}^{yy} \end{bmatrix}
\begin{bmatrix} b_j^{x} \\ b_j^{y} \end{bmatrix},
\]
where m (< n) indicates the number of macrofeatures.
Microfeatures (local) are features of the shape, which are given by
\[
\begin{bmatrix} x_i^{(l)} \\ y_i^{(l)} \end{bmatrix}
= \sum_{j=m+1}^{n}
\begin{bmatrix} a_{ij}^{xx} & a_{ij}^{xy} \\ a_{ij}^{yx} & a_{ij}^{yy} \end{bmatrix}
\begin{bmatrix} b_j^{x} \\ b_j^{y} \end{bmatrix},
\quad i = 1, \ldots, n.
\]
The macrofeatures are deformed to
\[
\begin{bmatrix} \tilde{x}_i^{(g)} \\ \tilde{y}_i^{(g)} \end{bmatrix}
= \sum_{j=1}^{m}
\begin{bmatrix} \tilde{a}_{ij}^{xx} & \tilde{a}_{ij}^{xy} \\ \tilde{a}_{ij}^{yx} & \tilde{a}_{ij}^{yy} \end{bmatrix}
\begin{bmatrix} b_j^{x} \\ b_j^{y} \end{bmatrix},
\quad i = 1, \ldots, n.
\]
The shape then is deformed to
\[
\begin{bmatrix} \tilde{x}_i \\ \tilde{y}_i \end{bmatrix}
= \begin{bmatrix} \tilde{x}_i^{(g)} \\ \tilde{y}_i^{(g)} \end{bmatrix}
+ \begin{bmatrix} x_i^{(l)} \\ y_i^{(l)} \end{bmatrix},
\quad i = 1, \ldots, n.
\]
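As a consistency check on this decomposition, consider the case in which the macrofeature coefficients are left undeformed. Writing A_{ij} for the 2×2 coefficient matrix with entries a_{ij}^{xx}, a_{ij}^{xy}, a_{ij}^{yx}, a_{ij}^{yy} (a shorthand introduced here only for brevity), and setting \tilde{A}_{ij} = A_{ij} for j = 1, ..., m, the macrofeature and microfeature sums recombine into the original shape:
\[
\begin{bmatrix} \tilde{x}_i \\ \tilde{y}_i \end{bmatrix}
= \sum_{j=1}^{m} A_{ij}\, b_j + \sum_{j=m+1}^{n} A_{ij}\, b_j
= \sum_{j=1}^{n} A_{ij}\, b_j
= \begin{bmatrix} x_i \\ y_i \end{bmatrix},
\quad i = 1, \ldots, n.
\]
More generally, subtracting the original shape from the deformed one gives
\[
\begin{bmatrix} \tilde{x}_i - x_i \\ \tilde{y}_i - y_i \end{bmatrix}
= \sum_{j=1}^{m} \left( \tilde{A}_{ij} - A_{ij} \right) b_j ,
\]
so the change in the shape involves only the first m basis vectors, and the microfeature terms j = m+1, ..., n are carried over unchanged.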
A prime candidate for a coordinate system is the Fourier basis. Use of this basis in a visual system is suitable when visual symmetry is of prime concern. The representation of the shape in the Fourier basis can use techniques similar to the ones used in extracting the Fourier descriptors of the shape. A method to find Fourier descriptors of a polygon given by an ordered set of complex numbers (x1, x2, . . . , xN) is given in "Application of Affine-Invariant Fourier Descriptors to Recognition of 3-D Objects", by Klaus Arbter, Wesley E. Snyder, Hans Burkhardt and Gerd Hirzinger, IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 12, No. 7, July 1990.
We first represent the shape in terms of its Fourier coefficients. Once we have a Fourier representation of the shape, we can characterize macrofeatures by a few of the Fourier coefficients. If we consider only one Fourier coefficient, which represents the center of the object, the macrofeature represents only the position. If we additionally consider two more Fourier coefficients, then the macrofeatures are all captured by a transformed circle. In such a case, area, orientation and aspect ratio also become macrofeatures. More detail can be added by considering more pairs of Fourier coefficients. FIG. 8 illustrates the macrofeatures with increasing number of Fourier coefficients. Applying a template filter, which is a low-pass filter in this case, we divide the set of Fourier coefficients into two parts. The first, which passes through the low-pass filter, represents the macrofeatures, and the second, the "residual", represents the microfeatures of the shape. The bandwidth of the low-pass filter is, in our system, a user-tunable or system-specified parameter and defines the appropriate macrofeature-microfeature separation for the shape.
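The low-pass separation can be sketched as follows. This sketch swaps in a plain discrete Fourier transform of the boundary points in place of the polygon Fourier descriptor method cited above, and the function names, the bandwidth convention (highest frequency magnitude kept) and the sample shape are all assumptions made for illustration.

```python
import cmath

def dft(z):
    """Discrete Fourier coefficients c_0 .. c_{N-1} of the complex points z."""
    n = len(z)
    return [sum(z[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)) / n
            for k in range(n)]

def idft(c):
    """Rebuild the points from their Fourier coefficients (inverse of dft)."""
    n = len(c)
    return [sum(c[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n))
            for t in range(n)]

def split_coefficients(c, bandwidth):
    """Low-pass template: frequencies up to bandwidth are macro, the rest micro."""
    n = len(c)
    macro, micro = [], []
    for k in range(n):
        freq = min(k, n - k)  # indices k and n - k carry frequencies +f and -f
        keep = freq <= bandwidth
        macro.append(c[k] if keep else 0j)
        micro.append(0j if keep else c[k])
    return macro, micro

# Hypothetical eight-point shape: a diamond with slightly indented edges.
shape = [2 + 0j, 0.8 + 0.8j, 2j, -0.8 + 0.8j, -2 + 0j, -0.8 - 0.8j, -2j, 0.8 - 0.8j]
macro, micro = split_coefficients(dft(shape), bandwidth=1)
macro_shape = idft(macro)   # centre plus "transformed circle"
micro_shape = idft(micro)   # residual detail
```

With bandwidth 0 only the coefficient for the centre of the object survives in the macro part; with bandwidth 1 the two additional coefficients give the transformed circle described above. Adding macro_shape and micro_shape point by point recovers the original shape.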
Once we have the shape represented by only macrofeatures, we can apply deformations to them so as to capture the user's perception of changes to the global aspects of the shape. These deformations can be taken from a library of currently available deformations. For example, if we take three Fourier coefficients, the deformations act on the position, area, orientation and aspect ratio of the object. We can also, for example, use finite element methods that deform using a physical structure for the background of the shape, as described in "Modal matching for correspondence and recognition", by S. Sclaroff and A. Pentland, in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 17, pp. 545-561, June 1995.
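To make the three-coefficient case concrete, the sketch below reads position, area, orientation and aspect ratio off the coefficients of frequencies 0, +1 and -1. It assumes the macrofeature curve has the form z(t) = c0 + c_plus e^{it} + c_minus e^{-it}, which traces an ellipse; the function and parameter names are hypothetical.

```python
import cmath
import math

def ellipse_from_coefficients(c0, c_plus, c_minus):
    """Geometry of the curve z(t) = c0 + c_plus*e^{it} + c_minus*e^{-it}."""
    a = abs(c_plus) + abs(c_minus)        # semi-major axis
    b = abs(abs(c_plus) - abs(c_minus))   # semi-minor axis
    return {
        "center": c0,
        "area": math.pi * a * b,
        # Direction of the major axis, defined modulo pi.
        "orientation": (cmath.phase(c_plus) + cmath.phase(c_minus)) / 2,
        "aspect_ratio": a / b if b > 0 else math.inf,
    }

def rotate(c_plus, c_minus, angle):
    """Global deformation: rotate the ellipse about its centre by angle."""
    r = cmath.exp(1j * angle)
    return c_plus * r, c_minus * r

before = ellipse_from_coefficients(1 + 1j, 2, 1)
after = ellipse_from_coefficients(1 + 1j, *rotate(2, 1, math.pi / 4))
```

Here the rotation changes only the orientation entry; position, area and aspect ratio are unchanged, which is one simple example of deforming a single global aspect of the shape.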
An alternative way of describing our invention involves letting the macrofeatures define the template filter. Transformations and deformations are applied to this template only and not to the microfeatures. This view of our invention also allows us to use partial or sub-sampled templates. The advantage we gain from this strategy is twofold:
(i) we enlarge the class of templates without additional visual input, and
(ii) we can enforce visual and other symmetries even if they are not built into the deformations being used.
The sub-sampling scheme can be automated by a natural use of multiresolutions if a dyadic (power of two) number of points define the shape.
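One way such a dyadic sub-sampling could be automated is sketched below. The scheme shown, repeatedly keeping every other point, is an assumption chosen for simplicity; the text does not specify which multiresolution construction is used, and the function name is hypothetical.

```python
def dyadic_levels(points):
    """Successively halve a shape whose point count is a power of two.

    Returns a list of levels, from the full shape down to a single point,
    each level keeping every other point of the previous one.
    """
    n = len(points)
    if n == 0 or n & (n - 1):
        raise ValueError("number of points must be a power of two")
    levels = [list(points)]
    while len(levels[-1]) > 1:
        levels.append(levels[-1][::2])
    return levels

# Point indices stand in for boundary points in this example.
levels = dyadic_levels(list(range(8)))
```

Each level is a coarser template of the same shape, so a deformation chosen at a coarse level can be examined before it is applied at full resolution.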
Once the global deformations are made to the macrofeatures, the microfeatures are added back in to restore the shape. If required, we allow for some "smoothing" by local deformations, but the principal purpose of this phase of the invention is to leave local features intact. For example, deforming macrofeatures of a tree can simulate the effects of a windy day. Retaining microfeatures ensures that the leaves are still associated with their branches as in the undisturbed tree and not themselves deformed out of shape.
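The complete sequence (separate, deform the macrofeatures, add the microfeatures back) can be sketched as follows. As assumptions for illustration, the sketch uses a plain discrete Fourier transform of the boundary points and a single complex factor that scales and rotates the non-constant macrofeature coefficients; this is the simplest stand-in for a deformation taken from a library, and all names are hypothetical.

```python
import cmath

def dft(z):
    """Discrete Fourier coefficients of the complex boundary points z."""
    n = len(z)
    return [sum(z[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)) / n
            for k in range(n)]

def idft(c):
    """Rebuild boundary points from Fourier coefficients (inverse of dft)."""
    n = len(c)
    return [sum(c[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n))
            for t in range(n)]

def deform_shape(z, bandwidth, factor):
    """Deform macrofeatures only, then add the untouched microfeatures back.

    Coefficients with frequency 1..bandwidth are multiplied by the complex
    factor (scale and rotation). The centre (frequency 0) and every
    coefficient above the bandwidth are left as they are.
    """
    n = len(z)
    c = dft(z)
    out = []
    for k in range(n):
        freq = min(k, n - k)  # indices k and n - k carry frequencies +f and -f
        if 0 < freq <= bandwidth:
            out.append(c[k] * factor)   # global deformation of a macrofeature
        else:
            out.append(c[k])            # position and microfeatures kept
    return idft(out)

# Hypothetical eight-point shape with some fine detail on its boundary.
shape = [2 + 0j, 0.8 + 0.8j, 2j, -0.8 + 0.8j, -2 + 0j, -0.8 - 0.8j, -2j, 0.8 - 0.8j]
deformed = deform_shape(shape, bandwidth=1, factor=1.5 * cmath.exp(0.3j))
```

Because the high-frequency coefficients pass through unchanged, the fine detail of the boundary is preserved while the overall outline is enlarged and turned.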