1. Field of the Invention
The present invention relates to a multimedia data retrieval device located between a server which stores a plurality of contents representing images, sounds, and the like and a client who desires to retrieve content, for searching the contents to retrieve the content desired by the client and providing the retrieved content to the client, and a retrieval method for such a retrieval device.
2. Description of the Related Art
A conventional system for searching multimedia contents produces miniature images representing outlines of the respective contents. Together with such miniature images, data representing the features of the contents, such as image size and dominant color information, are created as feature data. Such feature data is directly designated to retrieve a content corresponding to the designated feature data.
FIG. 17 is a view illustrating the construction of a conventional multimedia content retrieval system. Referring to FIG. 17, multimedia contents are stored in a disk 103 mounted on a disk drive 101. The contents are read from the disk 103 under control of a file server 102, transmitted to the client side via a communication line 106, and displayed on a display 104 of a computer 105.
The client inputs a feature keyword for a desired content, as shown in FIG. 18, to facilitate retrieval of the desired content. Feature data representing the features of a plurality of contents stored in the disk 103 are stored in advance in the disk 103 in the form of a table as shown in FIG. 18. The computer 105 compares the feature keyword input by the client with the feature data stored in the disk 103, selects a certain number of feature data which approximate the feature keyword, in order from the most approximate to the least approximate, and displays miniature images of the contents having the selected feature data on the display 104. The client selects an appropriate content by referring to the displayed miniature images, thereby obtaining the desired content.
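By way of illustration only, the comparison and ranking performed by the computer 105 might be sketched as follows. The table entries, the field names, the distance measure, and the function names `distance` and `rank_contents` are assumptions made for this sketch and are not taken from FIG. 18 or from the cited patent.

```python
from math import sqrt

# Hypothetical feature table: one row of feature data per stored content.
FEATURE_TABLE = {
    "sunset.jpg": {"width": 640, "height": 480, "color": (230, 120, 40)},
    "forest.jpg": {"width": 640, "height": 480, "color": (30, 140, 50)},
    "ocean.jpg": {"width": 320, "height": 240, "color": (20, 80, 200)},
}


def distance(query, features):
    """Assumed dissimilarity: size difference plus dominant-color distance."""
    size_term = (abs(query["width"] - features["width"])
                 + abs(query["height"] - features["height"]))
    color_term = sqrt(sum((a - b) ** 2
                          for a, b in zip(query["color"], features["color"])))
    return size_term + color_term


def rank_contents(query, table, top_n):
    """Return the names of the top_n contents, most approximate first."""
    ranked = sorted(table, key=lambda name: distance(query, table[name]))
    return ranked[:top_n]


query = {"width": 640, "height": 480, "color": (220, 110, 50)}
candidates = rank_contents(query, FEATURE_TABLE, top_n=2)
```

The miniature images of the contents named in `candidates` would then be displayed for the client to choose from.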
The above retrieval technique is disclosed, for example, in U.S. Pat. No. 5,761,655 titled "Image File Storage and Retrieval System".
The above conventional retrieval technique has the disadvantage that, when contents are compressed by a coding method before being stored, the compressed contents must first be decompressed to produce non-compressed contents, and feature data must then be created from the non-compressed contents. Another disadvantage is that high-speed retrieval is not possible if feature data has not been created in advance.
In the above conventional retrieval technique, the client is required to express a feature of a desired content by a low-level keyword such as the color, width, and height. It is not possible for the client to use a high-level expression, such as "a scene where a person is running in the evening sun", for example, when high-level retrieval is desired.
The multimedia data retrieval device of this invention includes: a content storage section for storing a plurality of compressed contents; a client terminal for inputting feature data; a feature data storage section for reading feature data extracted from at least one of the compressed contents from the content storage section and storing the feature data of the at least one compressed contents; and a content retrieval section for selecting feature data approximate to the feature data input via the client terminal among the feature data stored in the feature data storage section, and retrieving a content having the selected feature data from the content storage section.
In one embodiment of the invention, each of the compressed contents includes a plurality of macro blocks representing an image shape, the image shape represented by the macro blocks is converted into a value consisting of at least one bit, and the bit is used as feature data of a shape represented by the content.
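As a minimal sketch of such a conversion, assume that each macro block carries a shape mode of "transparent", "opaque", or "boundary", and that one bit is assigned per macro block. The mode names, the one-bit-per-block mapping, and the Hamming comparison are assumptions for illustration; the embodiment requires only that the shape be converted into a value of at least one bit.

```python
def shape_bits(block_modes):
    """Flatten a grid of macro-block shape modes into a list of bits.

    A block lying entirely outside the object ("transparent") maps to 0;
    any block that contains part of the object maps to 1.
    """
    return [0 if mode == "transparent" else 1
            for row in block_modes
            for mode in row]


def hamming(bits_a, bits_b):
    """Count the positions at which two equal-length bit lists differ."""
    if len(bits_a) != len(bits_b):
        raise ValueError("shape features must have the same length")
    return sum(a != b for a, b in zip(bits_a, bits_b))


grid = [
    ["transparent", "boundary"],
    ["opaque", "transparent"],
]
feature = shape_bits(grid)
```

A smaller Hamming distance between two such bit lists would indicate more closely matching shapes under these assumptions.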
In another embodiment of the invention, each of the compressed contents includes mesh-coded data representing an image shape, and the mesh-coded data is used as feature data of a shape represented by the content.
In still another embodiment of the invention, each of the compressed contents includes a plurality of macro blocks representing an image shape, an average of DC components of a luminance component (Y) and a DC component of each of chrominance components (Pb, Pr) are obtained for each macro block, and the average and the DC components are used as feature data of color information and brightness information represented by the content.
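The per-macro-block computation can be sketched as follows, assuming that each macro block carries several luminance blocks (four in this example) and one block for each chrominance component, and that the DC components have already been read from the compressed stream. The dictionary layout and the name `color_feature` are hypothetical.

```python
def color_feature(macro_block):
    """Return (average Y DC, Pb DC, Pr DC) for one macro block."""
    y_dc = macro_block["y_dc"]
    y_average = sum(y_dc) / len(y_dc)
    return (y_average, macro_block["pb_dc"], macro_block["pr_dc"])


# Hypothetical DC values read directly from one compressed macro block.
block = {"y_dc": [100, 110, 90, 100], "pb_dc": -12, "pr_dc": 30}
feature = color_feature(block)
```

Because only DC components are read, no inverse transform of the compressed content is needed in this sketch.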
In still another embodiment of the invention, each of the compressed contents includes a plurality of macro blocks representing an image shape, motions of an object represented by macro block motion information are read to obtain an average of the motions of the object, and the average is used as feature data of motion information of the object represented by the content.
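For illustration, the averaging of macro block motion information might look like the following, assuming that each macro block contributes one motion vector expressed as a (dx, dy) pair. The function name `average_motion` and the choice to return a zero vector for an empty input are assumptions of this sketch.

```python
def average_motion(vectors):
    """Average a list of (dx, dy) motion vectors into one (dx, dy) pair."""
    if not vectors:
        return (0.0, 0.0)  # assumed convention: no vectors means no motion
    count = len(vectors)
    mean_dx = sum(dx for dx, _ in vectors) / count
    mean_dy = sum(dy for _, dy in vectors) / count
    return (mean_dx, mean_dy)


# Hypothetical motion vectors read from three macro blocks of one object.
motion_feature = average_motion([(2, 0), (4, -2), (0, 2)])
```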
In still another embodiment of the invention, each of the compressed contents includes a plurality of macro blocks representing an image shape, DC components and AC components of a luminance component and DC components and AC components of a chrominance component of an object represented by the macro blocks are read, and averages of the respective components are obtained and used as feature data of texture information of the object represented by the content.
In still another embodiment of the invention, each of the compressed contents includes frames representing sound, LPC coefficients recorded for each frame are read, and an average of the LPC coefficients is obtained and used as feature data of tone information represented by the multimedia content.
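A minimal sketch of the averaging step is given below, assuming that the LPC coefficients of each frame have already been read from the compressed stream as a list of equal length. The function name `average_lpc` and the sample values are hypothetical.

```python
def average_lpc(frames):
    """Average LPC coefficients across frames, one average per coefficient.

    Every frame is assumed to hold the same number of coefficients.
    """
    if not frames:
        raise ValueError("at least one frame is required")
    order = len(frames[0])
    return [sum(frame[i] for frame in frames) / len(frames)
            for i in range(order)]


# Hypothetical second-order LPC coefficients for two frames.
tone_feature = average_lpc([[1.0, -0.5], [0.5, -0.25]])
```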
In still another embodiment of the invention, each of the compressed contents includes frames representing sound, spectrum normalization coefficients recorded for each frame are read, and an average of the spectrum normalization coefficients is obtained for each predetermined time period and used as feature data of tone information.
In still another embodiment of the invention, each of the compressed contents includes frames representing sound, a prediction residual recorded for each frame is read, and the prediction residual is used as feature data of rhythm information.
In still another embodiment of the invention, each of the compressed contents includes frames representing sound, a frequency component after spectrum normalization performed for each frame is read, and the frequency component is used as feature data of rhythm information.
In still another embodiment of the invention, each of the compressed contents includes frames representing sound, LPC coefficients recorded for each frame are read, and a temporal change of the LPC coefficients is used as feature data of melody information.
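One simple way to express such a temporal change, offered here only as an assumption, is the frame-to-frame difference of each LPC coefficient. The function name `lpc_deltas` and the sample values are hypothetical, and other measures of temporal change could equally be used.

```python
def lpc_deltas(frames):
    """Return the coefficient-wise difference between consecutive frames."""
    return [[current - previous
             for previous, current in zip(prev_frame, cur_frame)]
            for prev_frame, cur_frame in zip(frames, frames[1:])]


# Hypothetical second-order LPC coefficients for three consecutive frames.
melody_feature = lpc_deltas([[1.0, 0.5], [0.5, 0.5], [0.75, 0.0]])
```

A sequence of N frames yields N - 1 difference vectors in this sketch.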
In still another embodiment of the invention, each of the compressed contents includes frames representing sound, spectrum normalization coefficients recorded for each frame are read, and a temporal change of the spectrum normalization coefficients is used as feature data of melody information.
In still another embodiment of the invention, each of the compressed contents includes a plurality of objects, an object description recorded for each object is read, and a frequency of appearance of a word, as well as a frequency of appearance of a combination of a word and a preceding or following word, used in the object description are used as feature data of word information.
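The two frequencies can be sketched as follows, assuming that an object description is plain text and that words are separated by whitespace. The tokenization, the lower-casing, and the name `word_features` are assumptions of this sketch; only pairs of a word and its following word are counted, which covers the same adjacent pairs as counting a word and its preceding word.

```python
from collections import Counter


def word_features(description):
    """Count single words and adjacent word pairs in an object description."""
    words = description.lower().split()
    word_counts = Counter(words)
    pair_counts = Counter(zip(words, words[1:]))
    return word_counts, pair_counts


# Hypothetical object description.
word_counts, pair_counts = word_features("a red car and a red bus")
```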
According to another aspect of the invention, a multimedia data retrieval method is provided. The method includes the steps of: storing a plurality of compressed contents; inputting feature data via a client terminal; reading feature data extracted from the compressed contents and storing the feature data of the compressed contents; and selecting feature data approximate to the feature data input via the client terminal among the stored feature data, and retrieving a content having the selected feature data from the stored contents.
Alternatively, the multimedia data retrieval device of this invention includes: a content storage section for storing a plurality of contents; a client terminal for inputting a feature description text; a feature data storage section for reading feature data of the contents from the content storage section and storing the feature data of the contents; and a content retrieval section for extracting a keyword from the feature description text input via the client terminal, converting the keyword into feature data, selecting feature data approximate to the feature data of the keyword among the feature data stored in the feature data storage section, and retrieving a content having the selected feature data from the content storage section.
In one embodiment of the invention, the content retrieval section includes a keyword dictionary for converting a keyword into feature data, and the keyword extracted from the feature description text is converted into the feature data using the keyword dictionary.
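The role of the keyword dictionary can be illustrated with the following sketch. The dictionary entries, the feature values, the punctuation handling, and the name `text_to_features` are all hypothetical; an actual dictionary would be built to match the feature data stored in the feature data storage section.

```python
# Hypothetical keyword dictionary: keyword -> feature data it stands for.
KEYWORD_DICTIONARY = {
    "evening": {"color": (230, 120, 40)},
    "running": {"motion": (6.0, 0.0)},
    "person": {"shape": "upright"},
}


def text_to_features(text):
    """Convert a feature description text into feature data.

    Words found in the keyword dictionary are treated as keywords;
    all other words are ignored.
    """
    features = {}
    for word in text.lower().split():
        token = word.strip(".,!?\"'")
        if token in KEYWORD_DICTIONARY:
            features.update(KEYWORD_DICTIONARY[token])
    return features


features = text_to_features(
    "A scene where a person is running in the evening sun.")
```

The resulting feature data would then be compared with the stored feature data to select approximate contents.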
In another embodiment of the invention, the content retrieval section extracts a major part of speech from the feature description text to be used as a keyword.
In still another embodiment of the invention, the content retrieval section uses shape information of a content as the feature data.
In still another embodiment of the invention, the content retrieval section uses color information and brightness information of a content as the feature data.
In still another embodiment of the invention, the content retrieval section uses motion information of a content as the feature data.
In still another embodiment of the invention, the content retrieval section uses texture information of a compressed content as the feature data.
Alternatively, the multimedia data retrieval method of this invention includes the steps of: storing a plurality of contents; inputting a feature description text via a client terminal; reading feature data of the contents and storing the feature data; and extracting a keyword from the feature description text input via the client terminal, converting the keyword into feature data, selecting feature data approximate to the feature data of the keyword among the stored feature data, and retrieving a content having the selected feature data from the stored contents.
Thus, the invention described herein makes possible the advantages of (1) providing a multimedia data retrieval device capable of retrieving a content at high speed using high-level expression, and (2) providing a retrieval method for such a device.
These and other advantages of the present invention will become apparent to those skilled in the art upon reading and understanding the following detailed description with reference to the accompanying figures.