1. Field of the Invention
This invention relates to the cataloguing of multimedia data, including storage and retrieval mechanisms.
2. Background
Increasingly, computer systems are being used to present multimedia material. Such material is usually in the form of text, graphics, video, animation, and sound. Two or more of these data types are usually combined to form the multimedia data presented by the computer system. A computer system that is used to present multimedia material is called a multimedia system. A problem with prior art multimedia systems is an inability to search and retrieve multimedia data.
One prior art multimedia system uses a disk operating system that includes a file system for storing and retrieving files containing multimedia data. The file system catalogues the files based on the names given to the files. The file system can be used to retrieve a file that contains multimedia data based on the file's name. Other than the extent to which the file name identifies content, the file system does not provide the ability to retrieve multimedia information based on the content of the data. The search system provided by a file system is therefore inadequate to search with greater detail than that provided in a file name. A number of prior art multimedia systems are described below.
A search system is described in U.S. Pat. No. 5,241,671, Reed et al., issued on Aug. 31, 1993. The patent relates to a multimedia system that includes a database comprising words, phrases, numbers, letters, maps, charts, pictures, moving images, animations, and audio information. A search capability provides a series of entry paths for locating information in the database. An entry path allows a user to enter a search request that consists of a set of valid terms or stop terms. A stop term is a term that exists on a stop term list and may be the words "the" or "a", for example. Valid terms are linked to related terms using a stem index. A stem index contains a root term and a set of stems for each term that is related to the root term. For example, the word "leaf" is linked to the terms "leaves" and "leafing".
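The stop-term and stem-index behavior described above can be illustrated with a minimal Python sketch. The names STOP_TERMS, STEM_INDEX, and expand_query are hypothetical and are not taken from the Reed et al. patent; only the example terms ("the", "a", "leaf", "leaves", "leafing") come from the description above.

```python
# Hypothetical data structures; the patent text does not specify them.
STOP_TERMS = {"the", "a"}

# Stem index: each root term maps to the set of stems related to it.
STEM_INDEX = {"leaf": {"leaves", "leafing"}}


def build_reverse_index(stem_index):
    """Map every root and stem back to its root term."""
    reverse = {}
    for root, stems in stem_index.items():
        reverse[root] = root
        for stem in stems:
            reverse[stem] = root
    return reverse


def expand_query(query, stop_terms=STOP_TERMS, stem_index=STEM_INDEX):
    """Drop stop terms, then link each valid term to its related terms."""
    reverse = build_reverse_index(stem_index)
    expanded = set()
    for term in query.lower().split():
        if term in stop_terms:
            continue  # stop terms do not contribute to the search
        root = reverse.get(term, term)  # unknown terms act as their own root
        expanded.add(root)
        expanded.update(stem_index.get(root, set()))
    return expanded
```

In this sketch, a request for "the leaves" would be searched as the linked terms "leaf", "leaves", and "leafing", while a term with no stem index entry is searched as entered.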
The creating and displaying of navigators for locating and accessing visual/image information is described in U.S. Pat. No. 5,123,088, Kasahara et al., issued on Jun. 16, 1992. Image information is categorized and linked in a circular list and ordered according to their attributes. When an image unit is displayed, the linked image units can be displayed as reduced images, or navigators. A user can navigate through a network of linked image units by selecting the desired navigator.
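One way to picture a circular list of image units ordered by attribute is the following Python sketch. The ImageUnit class, the single sortable attribute, and the function names are assumptions made for illustration; the Kasahara et al. patent may organize its lists differently.

```python
class ImageUnit:
    """Hypothetical image unit with one sortable attribute."""

    def __init__(self, name, attribute):
        self.name = name
        self.attribute = attribute
        self.next = self  # a lone unit forms a circle of one


def link_circular(units):
    """Order units by attribute and link them into a circular list."""
    ordered = sorted(units, key=lambda u: u.attribute)
    for i, unit in enumerate(ordered):
        unit.next = ordered[(i + 1) % len(ordered)]
    return ordered[0] if ordered else None


def navigators(displayed):
    """Names of the units linked to the displayed unit, in list order."""
    names = []
    node = displayed.next
    while node is not displayed:
        names.append(node.name)
        node = node.next
    return names
```

Here the navigators for a displayed unit are simply every other unit reached by following the links once around the circle; selecting one of them would make it the newly displayed unit.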
A system for database retrieval wherein entries in different databases are retrieved by a process of matching key words of the databases is described in U.S. Pat. No. 5,210,868, Shimada et al., issued on May 11, 1993. Examples of two such databases are a mapping database and a customer attribute database. A dictionary is used to separate a keyword from a first database into common and proper noun subparts. Common and proper noun synonyms are inferred according to a set of rules. The synonyms are combined using a combination rule and then compared with keywords in a second database to generate a final matching result.
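The matching process described above can be sketched in Python as follows. The dictionary contents, the synonym tables, and the combination rule (proper noun part followed by common noun part) are all hypothetical stand-ins; the Shimada et al. patent defines its own rules, which are not reproduced here.

```python
from itertools import product

# Hypothetical dictionary and synonym tables, for illustration only.
COMMON_NOUNS = {"station", "bank"}
COMMON_SYNONYMS = {"station": {"station", "sta."}}
PROPER_SYNONYMS = {"tokyo": {"tokyo", "tky"}}


def split_keyword(keyword):
    """Separate a keyword into proper noun and common noun subparts."""
    words = keyword.lower().split()
    proper = " ".join(w for w in words if w not in COMMON_NOUNS)
    common = " ".join(w for w in words if w in COMMON_NOUNS)
    return proper, common


def candidate_keywords(keyword):
    """Combine synonyms of each subpart (assumed rule: proper + common)."""
    proper, common = split_keyword(keyword)
    proper_syns = PROPER_SYNONYMS.get(proper, {proper})
    common_syns = COMMON_SYNONYMS.get(common, {common})
    return {f"{p} {c}".strip() for p, c in product(proper_syns, common_syns)}


def match(keyword, second_db_keywords):
    """Return keywords of the second database that match any candidate."""
    candidates = candidate_keywords(keyword)
    return sorted(k for k in second_db_keywords if k.lower() in candidates)
```

Under these assumed tables, a first-database keyword such as "Tokyo Station" would match second-database keywords written as "TKY Sta." or "Tokyo station", but not "Osaka Station".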
A system for handling multimedia using entity and relation objects is described in U.S. Pat. No. 5,278,946, Shimada et al., issued on Jan. 11, 1994. An entity object defines methods and properties for entities such as a building, road, railroad, and boundary. A relation object defines methods and properties for relationships between entities. A user model and system model can be coupled to generate a digest of multimedia data.
A system for storing and retrieving digital images is described in U.S. Pat. No. 5,493,677, Balogh et al., issued on Feb. 20, 1996. A caption or other metadata can be associated with a digital image. A natural language capability removes ambiguities from the metadata input by a user prior to its storage. The natural language capability determines matches between a user query and the stored metadata. The system allows a user to select an image, review licensing terms for the selected image, and order the image.
A repetitive analysis event system that accesses data using a time-based number is described in U.S. Pat. No. 5,414,644, Seaman et al., issued on May 9, 1995. The system uses an information library that consists of visual data storage and a textual database for storing written descriptions of the visual data and a glossary of keywords that identify repetitive events or behavior. A behavioral label is used to define a behavioral activity. A series of images or video clips is associated with the behavioral label. A user can retrieve images by identifying a subject, a behavioral activity, or another type of descriptive text. A chronological timeline is used to control the order in which the images are displayed. That is, the images are displayed in sequential order using the time-based number.
A knowledge based information retrieval system is described in U.S. Pat. No. 5,404,506, Fujisawa et al., issued on Apr. 4, 1995. The system provides a visual interface for local searching and a natural language interpreter for global search. The natural language interpreter is used to infer the meaning of a noun phrase or a nominal phrase. The inferred meaning is used to retrieve information.
The search capabilities in the patents identified above do not provide an ability to catalogue multimedia data such that it is available for use across systems or applications. There is no ability to create a general catalogue and index for searching a catalogue that can be used for the storage and retrieval of multimedia data by multiple applications. The storage approach used in the prior art is designed to accommodate a particular system's needs. A number of other approaches are described, but in these too, the index capabilities are designed for use with a particular system.
An indexing capability designed for use with a hypertext nodal network is described in U.S. Pat. No. 5,408,655, Oren et al., issued on Apr. 18, 1995. The set of indexing terms generated using the hypertext nodal network is compared with each of the nodes in the database to identify a set of index terms for each node (i.e., document index terms). A set of index terms is associated with an option or criterion (i.e., option index terms) that can be user-selected from a menu. A hypertext nodal network is needed to use the indexing capability in this case.
An index is described in U.S. Pat. No. 5,121,470, Trautman, issued on Jun. 9, 1992, which describes an interactive record system that automatically indexes data obtained from multiple sources. The invention has application in the medical care field. The data is indexed along one or more dimensions (e.g., time). Data events are identified for the indexed data by distinguishing sets of data into given intervals. Data objects are associated with the data events. The events and associated data objects are displayed. Actuators are associated with the data objects to allow the objects to be manipulated. Data events and dimensional criteria are needed to use this indexing scheme.
A system for identifying and displaying an image that is selected based on user input is described in U.S. Pat. No. 5,010,500, Makkuni et al., issued on Apr. 23, 1991. Gestures input using a mouse are used to identify an image having features that resemble the input. Multimedia data associated with a portion of the image can be activated by selecting the image portion. When a selection is made, a menu can be displayed for user selection. Data is indexed based on actual portions of images.
A system that creates an index for frame sequences in a motion image is described in U.S. Pat. No. 5,428,774, Takahashi et al., issued on Jun. 27, 1995. Each record in the index has an associated retrieval key. The initial and final positions of a frame sequence are designated in an index record. Records are retrieved from the index file based on the retrieval key. The retrieved records are arranged along a time axis based on the initial and final positions. Data (i.e., frame sequences of a motion picture) is indexed based on a time sequence of frames of the data.
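The index records described above can be pictured with a short Python sketch. The IndexRecord fields and the retrieve function are hypothetical names chosen for illustration, and frame positions are assumed to be integer frame numbers; the Takahashi et al. patent does not necessarily store its records this way.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexRecord:
    """Hypothetical index record for one frame sequence."""
    retrieval_key: str
    first_frame: int  # initial position of the frame sequence
    last_frame: int   # final position of the frame sequence


def retrieve(index, key):
    """Select records by retrieval key and arrange them along the time axis."""
    hits = [record for record in index if record.retrieval_key == key]
    return sorted(hits, key=lambda r: (r.first_frame, r.last_frame))
```

Retrieval in this sketch depends only on the key and the frame positions, which reflects the point made above that the data is indexed by a time sequence of frames rather than by a description of its content.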
A system that uses keywords to locate and retrieve higher level records is described in Kuga et al., U.S. Pat. No. 5,280,573, issued on Jan. 18, 1994. Each of a plurality of higher level records contains a different type of information associated with a keyword. Such higher level records may contain usage, synonym, and meaning information associated with a keyword, for example.
A system for storing images and audio that can be used to create an audio-visual presentation is described in Beitel et al., U.S. Pat. Nos. 5,119,474 and 5,274,758, issued on Jun. 2, 1992 and Dec. 28, 1993, respectively. The system includes the following software components: library editor, image editor, digitize editor, convert editor, audio editor, and story editor. The image editor can edit an image (i.e., add text and graphics to an image). The digitize and audio editors convert analog data to digital data. The convert editor is used to convert images to a form that is usable by the system. Images and audio data are assembled into a presentation using the story editor. The library editor manages the storage, retrieval, and processing of system objects (an object is an image, audio file, or audio/visual presentation). The library editor maintains a library of files, each of which contains an object.