1. Field of the Invention
The present invention relates to image and video storage and retrieval systems.
Portions of the disclosure of this patent document contain material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office file or records, but otherwise reserves all copyright rights whatsoever.
2. Background Art
Computer systems are used to store large amounts of information and data. To be useful, the data must be organized and searchable so that information can be found easily. Text data is relatively easy to search by looking for key words that might appear in the text of stored documents. Thus, the stored data itself can be used as part of the searching effort. Images are harder to search on computer systems because of the way that they are stored. In one example, an image is stored as a series of pixels, each of which indicates a particular color. Nothing about an individual pixel lets a searcher know whether it is part of a picture of a car or a bird. Thus, the image data itself has not been easily usable as part of the searching effort.
One method for making it easier to search for images is the use of captions or text descriptions that are associated with the image and are themselves searchable. For example, a picture of a car on a bridge could have a caption describing the scene, with the car, bridge, background, etc. all described in text. When a person searches for an image, the person enters words that are then used to search through the image captions. This scheme requires that each image be viewed and described by a human operator, which is time consuming. The descriptions also add to the amount of data that must be stored with each image, so the scheme is space consuming as well. This type of system is called a content-based retrieval system.
Another type of image and video storage and retrieval system uses a compressed domain approach. The compressed domain approach derives the image or video features from the transform coefficients, thus requiring decompression.
The problems associated with image indexing and retrieval systems can be better understood by a review of content-based retrieval systems and compressed domain systems.
Content Based Retrieval—Keyword Approach
One type of content based retrieval system uses keywords. Typically, keywords describing each image are recorded in text and associated with the image. (This additional data, which in part describes the image, is often referred to as “meta-data”). When a user wishes to retrieve the image, a keyword is typed and all of the images having that associated keyword are retrieved. This approach has two drawbacks. First, it requires great human effort to create the meta-data that enables visual queries, and the text descriptions do not completely or consistently characterize the content of the images and videos. Second, the relatively large data sizes of images and videos compared to the communication channel bandwidth prohibit the user from browsing or perusing all but a small portion of the archive at a time. Therefore, the ability to find desired images and videos depends primarily on the capabilities of the query tools provided by the system.
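The keyword lookup described above can be illustrated with a minimal sketch. The function names, file names, and keywords below are hypothetical and chosen only for illustration; the sketch assumes the meta-data is available as a mapping from image identifier to a list of keywords, which is one possible arrangement among many.

```python
from collections import defaultdict


def build_keyword_index(metadata):
    """Map each keyword (lower-cased) to the set of image IDs annotated with it."""
    index = defaultdict(set)
    for image_id, keywords in metadata.items():
        for keyword in keywords:
            index[keyword.lower()].add(image_id)
    return index


def search(index, keyword):
    """Return the sorted IDs of all images associated with the typed keyword."""
    return sorted(index.get(keyword.lower(), set()))


# Hypothetical, hand-entered meta-data: every entry requires a human annotator.
metadata = {
    "img001.jpg": ["car", "bridge", "river"],
    "img002.jpg": ["bird", "tree"],
    "img003.jpg": ["car", "garage"],
}
index = build_keyword_index(metadata)
car_images = search(index, "car")
```

In this sketch an image can be found only through the words the annotator happened to record, and the index is extra data stored alongside the images, which corresponds to the drawbacks noted above.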
Content Based Retrieval—Query Approach
Using a content-based query, the user provides a description of some of the prominent visual features of an image or video. The computer then searches the archive and returns the images and videos that best match the description. Typically, research on content-based queries has focused on the visual features of color, texture and shape. For example, the IBM Query By Image Content (QBIC) project proposes and utilizes feature sets that capture the color, texture and shape of image objects that have been segmented manually. Texture and color features that describe the global features of images are also utilized.
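A query on a global color feature can be sketched as follows. This is a generic color histogram comparison offered only as an illustration; it is not the QBIC feature set, and the function names, the choice of four quantization levels per channel, and the L1 distance are all assumptions. Images are assumed to be non-empty lists of 8-bit (R, G, B) tuples.

```python
def color_histogram(pixels, levels=4):
    """Normalized histogram over levels**3 quantized RGB bins (pixels must be non-empty)."""
    hist = [0.0] * (levels ** 3)
    step = 256 // levels
    for r, g, b in pixels:
        idx = (r // step) * levels * levels + (g // step) * levels + (b // step)
        hist[idx] += 1.0
    total = len(pixels)
    return [count / total for count in hist]


def l1_distance(h1, h2):
    """Sum of absolute bin differences; 0.0 means identical histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


def rank_by_color(query_pixels, archive):
    """Return archive names ordered from best to worst color match."""
    query_hist = color_histogram(query_pixels)
    scored = [(l1_distance(query_hist, color_histogram(pixels)), name)
              for name, pixels in archive.items()]
    return [name for _, name in sorted(scored)]


# Hypothetical archive of tiny synthetic "images".
red, blue = (250, 10, 10), (10, 10, 250)
archive = {
    "mostly_red": [red] * 9 + [blue],
    "mostly_blue": [blue] * 10,
    "mixed": [red] * 5 + [blue] * 5,
}
ranking = rank_by_color([(240, 20, 20)] * 10, archive)
```

The histograms in this sketch are computed from decoded pixels and would normally be stored in addition to the imagery, so the approach still incurs the storage overhead discussed in this section.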
The keyword based and query based approaches to content based retrieval store the keywords or visual features in addition to the compressed imagery. This produces a data expansion, which is disadvantageous.
Compressed Domain Retrieval
The advent of compression standards has led to the proliferation of indexing techniques in the compressed domain. Many images and videos in a networked multimedia database are stored in compressed form. Compressed domain techniques seek to identify and retrieve the images by processing data in the compressed representation of the images. The main advantage of compressed domain processing is the reduction of computational complexity that results from the smaller size of the compressed data file.
Compressed domain techniques, however, derive the features of the images or videos from their transform coefficients. This requires the decompression of the bit-stream up to an inverse transformation step, which is disadvantageous. There is currently no approach that minimizes the data expansion associated with content based retrieval and also minimizes the decompression associated with compressed domain approaches.
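The derivation of features from transform coefficients mentioned above can be illustrated with a minimal sketch. It assumes an 8×8 block discrete cosine transform (DCT), which is only one example of a transform used by compression standards, and it uses the lowest-frequency (DC) coefficient of each block as the feature. The function names are hypothetical. The forward transform is computed here only to produce stand-in coefficients; in an actual compressed domain system the coefficients would be obtained by decoding the bit-stream up to, but not including, the inverse transformation step.

```python
import math


def dct2(block):
    """Naive orthonormal 2-D DCT-II of a square block given as a list of lists."""
    n = len(block)

    def alpha(k):
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    coeffs = [[0.0] * n for _ in range(n)]
    for u in range(n):
        for v in range(n):
            total = 0.0
            for x in range(n):
                for y in range(n):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
            coeffs[u][v] = alpha(u) * alpha(v) * total
    return coeffs


def dc_feature(coeff_blocks):
    """Feature vector made of the DC coefficient of each coefficient block."""
    return [block[0][0] for block in coeff_blocks]


# Stand-ins for coefficient blocks recovered from a compressed stream (assumption).
flat = [[100] * 8 for _ in range(8)]
ramp = [[x * 8 + y for y in range(8)] for x in range(8)]
coeff_blocks = [dct2(flat), dct2(ramp)]
features = dc_feature(coeff_blocks)
```

With this orthonormal scaling, the DC coefficient of an 8×8 block equals eight times the mean pixel value of the block, so the feature summarizes block brightness without reconstructing any pixels.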