1. Field of the Invention
The present invention relates to the field of data entry and retrieval. More particularly, the present invention relates to methods and systems for anchoring annotations to a data element described by an annotation, independently from a data source in which a particular data element appears.
2. Description of the Related Art
Software applications such as relational databases, text documents, flat files, and the like are well known tools for capturing and storing data and information. Generally, software applications store explicit knowledge, i.e., the actual data values obtained from scientific experimentation or the contents of a document such as a letter, research paper, spreadsheet, database entries, and the like. Often, such data is analyzed by various parties, e.g., experts, technicians, managers, scientists, researchers and the like, resulting in rich interpretive information, commonly referred to as tacit knowledge. Oftentimes, however, tacit knowledge is only temporarily captured, e.g., as cryptic notes in a lab notebook, discussions among various parties, presentations, instant-messaging exchanges, emails and the like. Thus, tacit knowledge is often lost, because the application environment in which the related data is viewed and analyzed only captures the explicit knowledge.
One approach to capturing tacit knowledge more permanently is to create annotations containing descriptive information about data elements that appear in a data source. Practically any identifiable type of data may be annotated, including spreadsheet content or database tables, a text document, or multimedia files. Further, sub-portions of data may be annotated, such as a cell, row, or column in a database table or a section, paragraph, or word in a text document. An indexing scheme is typically used to map each annotation to the annotated data, based on identifying information stored in an index. An index of annotations should be specific enough to locate, within a particular data source, the data element corresponding to an annotation listed in the index. Further, to be effective, the indexing scheme should function both ways: given an index, the scheme must be able to locate the annotated data and, given a discrete data element, the scheme must be able to calculate an index value used to classify, compare, and search the annotations.
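The two-way requirement described above can be illustrated with a minimal sketch. It is not taken from any particular annotation system. The names `IndexKey`, `AnnotationIndex`, `compute_key`, `annotate`, `locate`, and `annotations_for` are hypothetical, as are the string formats used for data sources and locations.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexKey:
    """Hypothetical index value: a data source plus a location inside it."""
    source: str    # assumed format, e.g. "results.db/expression"
    location: str  # assumed format, e.g. "row=3,col=2" or "para=5"


class AnnotationIndex:
    """Illustrative two-way index between annotations and annotated data."""

    def __init__(self):
        self._annotations = defaultdict(list)  # IndexKey -> annotation texts
        self._elements = {}                    # IndexKey -> annotated value

    def compute_key(self, source, location):
        # Direction 1: given a discrete data element, calculate its index value.
        return IndexKey(source, location)

    def annotate(self, source, location, value, text):
        key = self.compute_key(source, location)
        self._elements[key] = value
        self._annotations[key].append(text)
        return key

    def locate(self, key):
        # Direction 2: given an index value, locate the annotated data.
        return self._elements[key]

    def annotations_for(self, source, location):
        key = self.compute_key(source, location)
        return list(self._annotations.get(key, []))
```

Because the index value in this sketch embeds the data source, an annotation recorded against one source is not found when the same data element appears in a different source.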
Typically, an index references the particular data source where the data element corresponding to the annotation appears, e.g., a text document, spreadsheet, database table, and the like. Thus, using the index, an annotation may be retrieved using the application used to manipulate the data source and map the annotation to the annotated content therein. Oftentimes, however, an interesting (and therefore likely to be annotated) data element may appear in a variety of application programs. For example, in a biomedical enterprise, a single data element, such as a gene name or locus, may appear in text documents (manipulated by a word processor/text editor), experimental data (manipulated by a database or spreadsheet application), genomic data (manipulated by a specialized application), images (manipulated by an image viewing application), and others. In many cases, an annotation made for the data element may be useful to users viewing the data regardless of the application being used. In such cases, it is desirable to allow the annotation to be “anchored” to the data element such that it is capable of being retrieved and viewed from any application used to view data that includes the data element.
Managing annotations created for a data element (referred to herein as “global annotations”), as opposed to annotations created for an instance of the data element in a specific data source (referred to herein as “document-based annotations”), creates several challenges for an annotation management system. First, although a data element may appear the same to users across a variety of applications, because the data element is meant to impart the same substantive information, different applications use different methods to represent data internally. Thus, to support global annotations, an annotation system should be configured to identify data elements independently from an underlying application type. Also, an annotation system may anchor some annotations to a particular data source. Accordingly, the annotation system (or application) needs to distinguish between annotation types when creating, viewing, accessing, and retrieving global annotations and document-based annotations.
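The distinction between the two annotation types can be sketched as follows. This is an illustrative assumption rather than a described implementation. The names `element_key`, `Annotation`, and `visible_annotations` are hypothetical, and the normalization rule (collapsing whitespace and folding case) is merely one simple stand-in for application-independent identification of a data element.

```python
from dataclasses import dataclass
from typing import Optional


def element_key(value: str) -> str:
    """Assumed normalization so the same element matches across applications."""
    return " ".join(value.split()).casefold()


@dataclass
class Annotation:
    text: str
    element: str                  # normalized key of the annotated data element
    source: Optional[str] = None  # None marks a global annotation

    @property
    def is_global(self) -> bool:
        return self.source is None


def visible_annotations(annotations, value, source):
    """Return global annotations for the element, plus document-based
    annotations anchored to the data source currently being viewed."""
    key = element_key(value)
    return [
        a for a in annotations
        if a.element == key and (a.is_global or a.source == source)
    ]
```

In this sketch, a global annotation is returned for the data element in any data source, while a document-based annotation is returned only when the named data source is the one being viewed.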
Challenges also arise in displaying and processing global annotations. First, annotations created for a particular data element may lose some contextual sensitivity. That is, a data element might be commonplace in one data source and interesting in another. For example, the term “DNA” is likely to be mere background in a paper describing an aspect of gene translation and transcription, but of central importance in a different paper describing biological computing techniques using synthesized DNA sequences to encode data. Also, one data source may contain a significant number of annotated data elements, not all of which may be relevant in context. Displaying annotations, or an indication thereof, for all of the data elements appearing in a particular data source may, therefore, reduce the overall usefulness of the annotation system. By analogy, highlighting an entire page of text subverts the purpose of highlighting, which is to call attention to interesting sections on the page. Thus, if overused, the presentation may overwhelm the message. Finally, the additional computational overhead required to manage these challenges may decrease the performance of the annotation system below many users' tolerance levels, discouraging them from using the annotation system to capture and retrieve tacit knowledge in the form of annotations.
Accordingly, there is a need for methods and systems of creating and managing annotations that are anchored to the data elements they describe, such that a contextually meaningful annotation may be retrieved and viewed from any application displaying the corresponding data element. Additionally, such methods and systems should provide for an efficient and scalable annotation system, in order to gain adoption by end users.