1. Field of the Invention
The present invention relates in general to a method and apparatus for retrieving multimedia data by extracting a feature of shape information of an image using eigen vectors of a shape information covariance matrix and calculating a similarity on the basis of the extracted feature, and more particularly to a method and apparatus for retrieving multimedia data, in which the multimedia data can rapidly and accurately be retrieved by using multilayer eigen vectors capable of expressing complex shape information in detail and having a consistency against even rotation, scaling and translation of an image.
2. Description of the Prior Art
Until now, language-based search has mostly been used for data retrieval. Recently, however, with the development of the Internet and multimedia, massive amounts of multimedia data composed of movies, synthetic images, still images, voice, moving images, music and other content, as well as characters, have become available on the Internet and in multimedia databases, creating a need to retrieve such multimedia data. This calls for an effective retrieval method capable of readily retrieving the multimedia data desired by the user from bulky data on the Internet or in a multimedia database.
Multimedia data is much larger in size than data composed of only characters, and it is a combination of various types of information such as images, sounds, characters, etc. As a result, it is next to impossible to retrieve desired multimedia data by comparing the multimedia data itself. For this reason, in order to retrieve multimedia data from a multimedia database, features capable of expressing the respective multimedia data are extracted in advance through a preprocessing procedure and then compared with the information in the multimedia database. For example, in the case of retrieving video with a mixture of images, voice and audio, the respective features of the images, voice and audio are extracted, and their similarity to the information in the multimedia database to be searched is calculated. The desired information can then be retrieved in accordance with the similarity calculation. In this regard, the key points in multimedia data retrieval are the types of features of the multimedia data to be considered, how to express the features and how to compare the features with one another. Herein, a data model expressive of each feature is called a descriptor.
Among multimedia data retrieval techniques, still image and moving image retrieval methods are currently the most studied. In such a retrieval method, features of an image, such as color, texture, shape, etc., are extracted and their similarity is then measured. For example, a color histogram, correlogram, etc. may be used as descriptors expressive of the color feature [see: J. Huang, S. R. Kumar, M. Mitra, W. J. Zhu, and R. Zabih, Image indexing using color correlation, Proc. 16th IEEE Conf. on Computer Vision and Pattern Recognition, pp. 762-768, 1997]. Further, a wavelet coefficient, DFT coefficient, etc. may be used as descriptors expressive of the texture feature. In other words, various descriptors may be used to express one feature, and each has both merits and demerits. In this connection, the performance of a retriever may be greatly influenced by the descriptor employed.
A shape information retrieval method is one of the useful methods for image retrieval. Herein, shape information of an object signifies information indicating which pixels of an arbitrary image belong to the object and which pixels of the arbitrary image belong to the background. For effective shape information retrieval, it is necessary to define a descriptor capable of appropriately expressing the shape information of an object and to compare the similarity of the shape information on the basis of the defined descriptor. Existing descriptors used for shape information retrieval may generally be classified into two types: geometric feature-based descriptors and moment feature-based descriptors. The geometric feature-based descriptors may generally be a perimeter, area, maximum radius, minimum radius, corner, roundness, etc., and the moment feature-based descriptors may be a center of mass, orientation, bounding rectangle, best-fit ellipse, eigen vector, etc. For the purpose of accurately and rapidly retrieving image data, the above shape information descriptors should be consistent regardless of any variation of an image such as rotation, scaling, translation, etc. of an object.
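The notion of shape information described above can be illustrated with a minimal sketch. It assumes the shape is given as a binary mask (1 for object pixels, 0 for background); the mask contents and the names MASK, object_pixels, area and bounding_rectangle are hypothetical and chosen only for illustration.

```python
# Hypothetical binary mask: 1 marks a pixel of the object, 0 marks background.
MASK = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
]


def object_pixels(mask):
    """Return the (x, y) coordinates of every pixel that belongs to the object."""
    return [(x, y)
            for y, row in enumerate(mask)
            for x, value in enumerate(row)
            if value]


def area(pixels):
    """Geometric feature: the number of object pixels."""
    return len(pixels)


def bounding_rectangle(pixels):
    """Smallest axis-aligned rectangle (x_min, y_min, x_max, y_max) enclosing the object."""
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    return min(xs), min(ys), max(xs), max(ys)


pixels = object_pixels(MASK)
print(area(pixels))                # 10
print(bounding_rectangle(pixels))  # (1, 1, 4, 3)
```

Note that both of these simple features change when the object is scaled, and the bounding rectangle also changes under translation, which is why consistency under such variations is a requirement for a practical descriptor.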
At present, multimedia data retrieval is still at an early stage. One of the existing shape information feature extraction methods is to use eigen vectors of a covariance matrix of shape information. As shown in FIG. 1, the eigen vectors of the covariance matrix are composed of two vectors capable of expressing a distribution of the shape information. The directions of the two eigen vectors signify two axes (i.e., major and minor axes) indicative of the distribution directions of the shape information, and their magnitudes represent the distribution degrees of the shape information along those axes. Here, the major axis represents the main distribution direction of the shape information, and the minor axis represents the minimum distribution direction of the shape information.
Defining a covariance matrix C as in Equation 1 below, eigen vectors of the covariance matrix can be calculated in the following manner:

\[
C = \begin{bmatrix} c_{xx} & c_{xy} \\ c_{yx} & c_{yy} \end{bmatrix} \qquad \text{[Equation 1]}
\]
A center of mass (m_x, m_y) of the shape information can be expressed as in Equation 2 below:

\[
m_x = \frac{1}{N} \sum_{i=0}^{N} x_i, \qquad m_y = \frac{1}{N} \sum_{i=0}^{N} y_i \qquad \text{[Equation 2]}
\]
In the above Equation 2, "N" indicates the total number of pixels in the shape information, and "x_i" and "y_i" indicate the position of the i-th pixel. Once the center of mass has been calculated as in the above Equation 2, the respective components Cxx, Cyy, Cxy and Cyx of the covariance matrix can be expressed by the following Equation 3:

\[
\begin{aligned}
c_{xx} &= \frac{1}{N} \sum_{i=0}^{N} (x_i - m_x)(x_i - m_x) \\
c_{xy} &= \frac{1}{N} \sum_{i=0}^{N} (x_i - m_x)(y_i - m_y) \\
c_{yx} &= \frac{1}{N} \sum_{i=0}^{N} (y_i - m_y)(x_i - m_x) \\
c_{yy} &= \frac{1}{N} \sum_{i=0}^{N} (y_i - m_y)(y_i - m_y)
\end{aligned}
\qquad \text{[Equation 3]}
\]
In the above equation 3, the components Cxx and Cyy of the covariance matrix indicate x-axis and y-axis distribution degrees of the shape information, respectively, and the components Cxy and Cyx of the covariance matrix indicate a correlation between x and y coordinates.
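The center of mass and the covariance components defined above can be sketched as follows. This is a minimal illustration, not a definitive implementation: it assumes the shape is supplied as a list of (x, y) pixel coordinates and that the sums are taken once over each of the N shape pixels (an assumption about the intended index range). The function names and the rectangle RECT are hypothetical.

```python
def center_of_mass(pixels):
    """Equation 2: mean x and mean y over all N shape pixels."""
    n = len(pixels)
    mx = sum(x for x, _ in pixels) / n
    my = sum(y for _, y in pixels) / n
    return mx, my


def covariance_matrix(pixels):
    """Equation 3: the 2x2 covariance matrix [[cxx, cxy], [cyx, cyy]]."""
    n = len(pixels)
    mx, my = center_of_mass(pixels)
    cxx = sum((x - mx) * (x - mx) for x, _ in pixels) / n
    cyy = sum((y - my) * (y - my) for _, y in pixels) / n
    cxy = sum((x - mx) * (y - my) for x, y in pixels) / n
    # cyx is the same sum with its factors swapped, so it equals cxy.
    return [[cxx, cxy], [cxy, cyy]]


# Hypothetical shape: a filled rectangle 4 pixels wide and 2 pixels tall.
RECT = [(x, y) for y in range(2) for x in range(4)]

print(center_of_mass(RECT))     # (1.5, 0.5)
print(covariance_matrix(RECT))  # [[1.25, 0.0], [0.0, 0.25]]
```

For this axis-aligned rectangle the off-diagonal components vanish, and the larger diagonal component lies along the longer side, in line with the interpretation of Cxx and Cyy as distribution degrees along the x and y axes.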
Defining eigen vectors of the covariance matrix C obtained in the above manner respectively as A1 and A2 and eigen values of the covariance matrix C respectively as r1 and r2, the following Equation 4 is established therebetween:

\[
\begin{aligned}
\begin{bmatrix} c_{xx} & c_{xy} \\ c_{yx} & c_{yy} \end{bmatrix} A_1 &= r_1 A_1 \\
\begin{bmatrix} c_{xx} & c_{xy} \\ c_{yx} & c_{yy} \end{bmatrix} A_2 &= r_2 A_2
\end{aligned}
\qquad \text{[Equation 4]}
\]
Consequently, the eigen vectors A1 and A2 and eigen values r1 and r2 of the covariance matrix C can be obtained by solving the above equation 4. As mentioned above, the eigen vectors A1 and A2 of the covariance matrix C represent the main and minimum distribution directions of the shape information, respectively, and the eigen values r1 and r2 of the covariance matrix C represent the distribution degrees of the shape information in the main and minimum distribution directions, respectively.
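One way of solving Equation 4 is sketched below. It relies on the fact that Cxy equals Cyx, so the covariance matrix is symmetric and its eigen values and eigen vectors have a closed form. The closed-form approach and the function name eigen_2x2_symmetric are illustrative assumptions; the source does not state which solution method is used.

```python
import math


def eigen_2x2_symmetric(cxx, cxy, cyy):
    """Return ((r1, A1), (r2, A2)) for the matrix [[cxx, cxy], [cxy, cyy]].

    r1 >= r2; A1 is a unit vector along the major axis and A2 a unit
    vector along the minor axis (perpendicular to A1).
    """
    mean = (cxx + cyy) / 2.0
    spread = math.hypot((cxx - cyy) / 2.0, cxy)
    r1 = mean + spread  # larger eigen value (major axis)
    r2 = mean - spread  # smaller eigen value (minor axis)
    # Orientation of the major axis measured from the x axis.
    theta = 0.5 * math.atan2(2.0 * cxy, cxx - cyy)
    a1 = (math.cos(theta), math.sin(theta))
    a2 = (-math.sin(theta), math.cos(theta))
    return (r1, a1), (r2, a2)


# Hypothetical covariance components of a shape elongated along the diagonal.
(r1, a1), (r2, a2) = eigen_2x2_symmetric(2.0, 1.0, 2.0)
print(r1, a1)
print(r2, a2)
```

In this sketch the eigen vectors are returned with unit length and the distribution degrees are carried by r1 and r2; scaling each eigen vector by its eigen value would give vectors whose magnitudes represent the distribution degrees directly.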
The above-mentioned shape information feature extraction method using the eigen vectors of the covariance matrix is able to express an approximate distribution of the shape information with a small amount of data. It thus has the advantages of a small calculation amount, a simple calculation algorithm and consistency under translation of the shape information. However, the method is disadvantageous in that it is limited in how accurately it can express shape information, because it must express the entire shape information using only two eigen vectors in a single layer. In other words, the eigen vectors calculated for different shapes may often have the same value, so that the shapes cannot be distinguished from one another. Further, the eigen vectors are not consistent under scaling or rotation of the shape information. As a result, the eigen vectors alone are insufficient to define a descriptor for expressing the shape information, leading to a reduction in the accuracy of the associated multimedia data retrieval method.
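The behavior under translation, scaling and rotation described above can be checked with a small sketch. It uses exact rational arithmetic so that the comparisons are exact. The shape and all names are hypothetical, and scaling is modeled simply by multiplying the pixel coordinates by 2, which is a simplification of scaling an actual image.

```python
from fractions import Fraction


def covariance(pixels):
    """Return (cxx, cxy, cyy) of a list of integer (x, y) pixels as exact fractions."""
    n = len(pixels)
    mx = Fraction(sum(x for x, _ in pixels), n)
    my = Fraction(sum(y for _, y in pixels), n)
    cxx = sum((x - mx) ** 2 for x, _ in pixels) / n
    cyy = sum((y - my) ** 2 for _, y in pixels) / n
    cxy = sum((x - mx) * (y - my) for x, y in pixels) / n
    return cxx, cxy, cyy


# Hypothetical shape made of six pixels.
shape = [(0, 0), (1, 0), (2, 0), (3, 0), (0, 1), (1, 1)]
translated = [(x + 7, y - 3) for x, y in shape]  # shifted by (7, -3)
scaled = [(2 * x, 2 * y) for x, y in shape]      # coordinates doubled
rotated = [(-y, x) for x, y in shape]            # rotated by 90 degrees

cxx, cxy, cyy = covariance(shape)
print(covariance(translated) == (cxx, cxy, cyy))          # True: unchanged
print(covariance(scaled) == (4 * cxx, 4 * cxy, 4 * cyy))  # True: grows by 4
print(covariance(rotated) == (cyy, -cxy, cxx))            # True: axes swap
```

Translation cancels out because every coordinate is measured from the center of mass, whereas scaling multiplies the covariance components (and hence the eigen values) by the square of the scale factor, and rotation turns the eigen vectors along with the shape.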
Therefore, the present invention has been made in view of the above problems, and it is an object of the present invention to provide a method and apparatus for retrieving multimedia data, in which the multimedia data can accurately and rapidly be retrieved by adopting a shape information feature extraction method and apparatus capable of expressing complex shape information in detail using multilayer eigen vectors of a covariance matrix for the definition of a descriptor of the shape information.
It is another object of the present invention to provide a method and apparatus for retrieving multimedia data, in which the multimedia data can accurately and rapidly be retrieved by defining and using a shape information descriptor with a consistency against rotation, scaling and translation of an object.
In accordance with one aspect of the present invention, the above and other objects can be accomplished by the provision of a method for retrieving multimedia data using shape information, comprising the first step of receiving shape information of a query image and extracting a feature of the received shape information using a shape information descriptor based on eigen vectors of a multilayer covariance matrix; the second step of extracting a feature of each image data in the same manner as in the above first step; the third step of creating a multimedia database on the basis of the features extracted at the above second step; the fourth step of comparing the feature of the query image with the features of the image data in the multimedia database to calculate similarities therebetween; and the fifth step of outputting the results calculated at the above fourth step.
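The five steps above can be sketched as a simple pipeline. The multilayer descriptor itself is not defined in this section, so the sketch substitutes a stand-in feature (the two eigen values of a single-layer covariance matrix) purely as a placeholder. The Euclidean distance used as the similarity measure, the example images and all function names are likewise assumptions made for illustration.

```python
import math


def extract_feature(pixels):
    """Stand-in descriptor (assumption): eigen values (r1, r2) of the covariance matrix."""
    n = len(pixels)
    mx = sum(x for x, _ in pixels) / n
    my = sum(y for _, y in pixels) / n
    cxx = sum((x - mx) ** 2 for x, _ in pixels) / n
    cyy = sum((y - my) ** 2 for _, y in pixels) / n
    cxy = sum((x - mx) * (y - my) for x, y in pixels) / n
    mean = (cxx + cyy) / 2.0
    spread = math.hypot((cxx - cyy) / 2.0, cxy)
    return (mean + spread, mean - spread)


def build_database(images):
    """Steps 2 and 3: extract a feature from each image and store it by name."""
    return {name: extract_feature(pixels) for name, pixels in images.items()}


def retrieve(query_pixels, database):
    """Steps 1, 4 and 5: extract the query feature, compare, output a ranking."""
    query = extract_feature(query_pixels)
    # Assumption: a smaller Euclidean distance means a higher similarity.
    scored = [(math.dist(query, feature), name)
              for name, feature in database.items()]
    return [name for _, name in sorted(scored)]


def rectangle(width, height):
    """Hypothetical test shape: a filled width x height rectangle of pixels."""
    return [(x, y) for y in range(height) for x in range(width)]


IMAGES = {"bar": rectangle(8, 1), "square": rectangle(3, 3), "slab": rectangle(6, 2)}
database = build_database(IMAGES)

# Query: the bar shape shifted to another position in the image.
query = [(x + 5, y + 2) for x, y in rectangle(8, 1)]
ranking = retrieve(query, database)
print(ranking)  # ['bar', 'slab', 'square']
```

Keeping the feature extraction in a single function means the query image and the database images are described in the same manner, as the second step requires, and a different descriptor could be swapped in without changing the rest of the pipeline.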
In accordance with another aspect of the present invention, there is provided an apparatus for retrieving multimedia data using shape information, which is capable of embodying the above multimedia data retrieval method.