1. Field of the Invention
The present invention relates to the field of data entry and retrieval and, more particularly, to a method and system for indexing annotations made for a variety of heterogeneous data objects.
2. Description of the Related Art
An annotation system is used to create, store, and retrieve descriptive information about objects. Virtually any identifiable type of object may be annotated, such as a matrix of data (e.g., a spreadsheet or database table), a text document, or an image. Further, subportions of objects (sub-objects) may be annotated, such as a cell, row, or column in a database table or a section, paragraph, or word in a text document. Some annotation systems store annotations separately, without modifying the annotated data objects themselves. For example, annotations are often contained in annotation records stored in a separate annotation store, typically a database. The annotation records typically contain information about the annotations contained therein, such as the creation date and author of the annotation, and an identification of the annotated data object, typically in the form of an index.
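By way of illustration only, an annotation record held apart from the annotated data object might be sketched as follows. The class name `AnnotationRecord`, its field names, and the sample values are hypothetical and are not taken from any particular annotation system; a simple list stands in for the annotation store.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class AnnotationRecord:
    """Hypothetical annotation record; field names are illustrative."""
    index: str      # identifies the annotated data object
    author: str     # who created the annotation
    created: date   # when the annotation was created
    text: str       # the descriptive information itself


# The annotated object (here, a short text document) is never modified.
document = "Quarterly results exceeded the forecast."
original = document

# A list stands in for the separate annotation store (typically a database).
store = []
store.append(
    AnnotationRecord(
        index="reports/q3.txt",
        author="alice",
        created=date(2004, 1, 15),
        text="Check this figure against the ledger.",
    )
)
```

Because the record carries only an index to the document, the document itself is left unchanged when annotations are added or removed.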
An indexing scheme is typically used to map each annotation to the annotated data object or sub-object, based on the index. Therefore, the index must provide enough specificity to allow the indexing scheme to locate the annotated data object (or sub-object). Further, the indexing scheme must work both ways: given an index, the indexing scheme must be able to locate the annotated data object and, given an object, the indexing scheme must be able to calculate the index for use in classification, comparison, and searching (e.g., to search for annotations for a given data object).
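The two directions described above may be sketched, purely as an example, for a single object type. The sketch assumes an in-memory table and a three-part index of table name, row, and column; the names `compute_index`, `locate`, and `annotations_for` are hypothetical.

```python
# Hypothetical in-memory data source: one table holding two rows of two cells.
tables = {"results": [["a", "b"], ["c", "d"]]}


def compute_index(table, row, column):
    """Object to index: derive the index for a given table cell."""
    return (table, row, column)


def locate(index):
    """Index to object: find the table cell that an index refers to."""
    table, row, column = index
    return tables[table][row][column]


# Annotation store keyed by index (a dict stands in for a database).
annotations = {("results", 1, 0): ["Value looks low."]}


def annotations_for(table, row, column):
    """Search the store for annotations made on a given cell."""
    return annotations.get(compute_index(table, row, column), [])
```

Searching for the annotations on a cell relies on recomputing the same index from the cell, which is why the mapping must work in both directions.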
Databases are typically used as the annotation store for performance reasons, so that annotation records can be efficiently stored and retrieved. Therefore, the annotation indexing scheme should be designed so that the annotation records can be efficiently indexed, for example, by taking advantage of existing indexing technology that utilizes database keys for indexing. A database key is a unique identifier for an entity in a database table (e.g., a social security number is often used as a database key). To enable searching, sorting, and comparisons (necessary to query an annotation database), it is generally a requirement that database keys have a homogeneous content and attribute count (i.e., the same number and type of parameters).
However, a problem arises when the annotations must reference a variety of different (i.e., heterogeneous) types of objects, which is fairly common in modern business enterprises. For example, an annotation system for a biomedical enterprise may need to annotate documents, experimental data, genomic data, images, and the like. The problem arises because each of these different types of objects has a different way of identifying itself, and may also have a different number and type of sub-objects, resulting in different types of indexes for each. For example, a database table may be indexed using four parameters (location, table, row, and column), while a text document may be indexed using five parameters (location, file, section name, paragraph, and word). Ideally, therefore, each type of object would be allowed its own indexing method.
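The mismatch between the two example indexes can be made concrete with a short sketch. The parameter names follow the example above; the class names, the Python types chosen for each parameter, and the sample values are assumptions made only for illustration.

```python
from typing import NamedTuple


class TableIndex(NamedTuple):
    """Four-parameter index for a cell of a database table."""
    location: str
    table: str
    row: int
    column: str


class TextIndex(NamedTuple):
    """Five-parameter index for a word of a text document."""
    location: str
    file: str
    section_name: str
    paragraph: int
    word: int


def attribute_types(index):
    """Return the type name of each parameter, in order."""
    return tuple(type(value).__name__ for value in index)


# Hypothetical sample values.
cell = TableIndex("db1", "results", 12, "dosage")
word = TextIndex("fileserver", "report.txt", "Methods", 3, 7)

# The two indexes differ in both attribute count and attribute types,
# so neither fits a key layout designed for the other.
same_shape = (
    len(cell) == len(word) and attribute_types(cell) == attribute_types(word)
)
```

Here `same_shape` evaluates to false, which is the condition that a database key with a homogeneous content and attribute count cannot accommodate directly.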
Given that an annotation system may need to index a variety of different data objects having a variety of different identifying parameters, the requirement to be able to index and search the annotation store appears to conflict with the conventional database indexing requirement that database keys used for indexing have a homogeneous content and attribute count. Accordingly, there is a need for an improved method for indexing annotations, preferably one that allows for the flexible identification of a variety of different types of annotated data objects.