The present invention relates to video data processing, and more particularly to a coarse representation of a visual object's shape for search/query/filtering applications.
With the success of the Internet and of picture and video coding standards such as JPEG, MPEG-1 and MPEG-2, more and more audio-visual information is available in digital form. Before one can use any such information, however, it first has to be located. Searching for textual information is an established technology, and many text-based search engines are available on the World Wide Web to search text documents. Searching is not yet possible for audio-visual content, since no generally recognized description of this material exists. MPEG-7 is intended to standardize the description of such content. This description is intended to be useful in performing searches at a very high level or at a low level. At a high level the search may be to locate “a person wearing a white shirt walking behind a person wearing a red sweater”. At lower levels, for still images, one may use characteristics such as color, texture and information about the shape of objects in the picture. The high-level queries may be mapped to the low-level primitive queries to perform the search.
Visual object searches are useful in content creation, such as to locate from an archive the footage of a particular event, e.g. a tanker on fire, or clips containing a particular public figure. Also, the number of digital broadcast channels is increasing every day. One search/filtering application is to select the broadcast channel (radio or TV) that is potentially interesting.
What is desired is a descriptor that may be automatically or semi-automatically extracted from still images/key images of video and used in searches.