Field
The present specification generally relates to annotating electronic text documents and, more particularly, to discerning the proper meaning of ambiguous terms and annotating those ambiguous terms with the proper meaning by use of a controlled vocabulary structure.
Technical Background
Electronic text documents may be annotated with information. Annotations may be provided in metadata, for example. Markup languages, such as XML, may be utilized to provide additional information regarding an electronic text document beyond the original text. In some cases, an electronic text document is annotated with information regarding the subject matter discussed within the electronic text document.
Words and phrases within electronic text documents are often annotated with meanings as defined by a controlled vocabulary, such as a thesaurus. Such annotations may assist in classifying the electronic text document or otherwise grouping the electronic text document by subject matter. However, the meaning of many terms may be ambiguous because a single term may have several different meanings. For example, the term “Hampshire” may mean a breed of swine or a county in England, among others. Annotating a term in an electronic text document with a meaning of the controlled vocabulary that is not the meaning intended by the electronic text document is problematic because the electronic text document may then be classified or grouped under the wrong subject matter. It may be difficult to automatically determine when and how to annotate ambiguous terms within an electronic text document.
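By way of illustration only, the ambiguity described above may be sketched as a lookup in which a single term maps to more than one meaning. The following is a minimal sketch and not a definitive implementation; the dictionary layout, the names CONTROLLED_VOCABULARY, candidate_meanings, and is_ambiguous, and the entry for “swine” are assumptions introduced for this example.

```python
# Hypothetical controlled vocabulary: each term maps to its candidate meanings.
# The structure and the "swine" entry are illustrative assumptions.
CONTROLLED_VOCABULARY = {
    "hampshire": ["breed of swine", "county in England"],
    "swine": ["domesticated pig"],
}


def candidate_meanings(term):
    """Return every meaning the vocabulary lists for the term (case-insensitive)."""
    return CONTROLLED_VOCABULARY.get(term.lower(), [])


def is_ambiguous(term):
    """A term is ambiguous when the vocabulary lists more than one meaning for it."""
    return len(candidate_meanings(term)) > 1


if __name__ == "__main__":
    for term in ("Hampshire", "swine"):
        print(term, candidate_meanings(term), is_ambiguous(term))
```

A lookup of this kind may only indicate that a term is ambiguous; it does not indicate which of the candidate meanings is the one intended by the electronic text document.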
Accordingly, a need exists for alternative computer-program products and methods for discerning the proper meaning of ambiguous terms and annotating those ambiguous terms with the proper meaning.