In recent years, the fantastic growth of the Internet and other computer networks has fueled an equally fantastic growth in the data accessible via these networks. One of the seminal modes for interacting with this data is the use of hyperlinks within electronic documents.
Hyperlinks are user-selectable elements, such as highlighted text or icons, that link one portion of an electronic document to another portion of the same document or to other documents in a database or computer network. With proper computer equipment and network access, a user can select or invoke a hyperlink and almost instantaneously view the other document, which can be located almost anywhere in the world. Moreover, the other document itself can include hyperlinks to yet other documents that include hyperlinks, allowing the user to “hop” around the world from document to document to document seeking relevant information at will.
More recently, there has been interest in hyperlinking documents to other documents based on the names of people in the documents. For example, to facilitate legal research, West Publishing Company of St. Paul, Minn. provides thousands of electronic judicial opinions that hyperlink the names of attorneys and judges to their online biographical entries in the West Legal Directory, a proprietary directory of approximately 1,000,000 U.S. attorneys and 20,000 judges. These hyperlinks allow users accessing judicial opinions to quickly obtain contact and other specific information about lawyers and judges named in the opinions.
The hyperlinks in these judicial opinions are generated automatically, using a system that treats first, middle, and last names; law firm name, city, and state; and court information as clues to link named attorneys and judges to their corresponding entries in the professional directory. See Christopher Dozier and Robert Haschart, “Automatic Extraction and Linking of Person Names in Legal Text” (Proceedings of RIAO 2000: Content Based Multimedia Information Access. Paris, France. pp. 1305-1321. April 2000), which is incorporated herein by reference.
Although the automated system is highly effective, the present inventor recognized that it suffers from at least two limitations. First, the system exploits structural (organizational) features of judicial opinions, such as case headers, that are not common to other documents, which limits its general application to other types of names and documents. Second, the system treats all names as equally ambiguous, or equally common, when, in fact, some names are more or less ambiguous than others. For example, the name David Smith is more common than the name Seven Drake and thus more ambiguous, that is, more likely to identify more than one person.
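The second limitation can be illustrated with a minimal sketch. It is not the method of the cited system or of the present inventor. It assumes a small, made-up directory of names and assumes that first and last names occur independently, and it estimates how many directory entries would be expected to share a given name. The directory contents and the function name `name_ambiguity` are hypothetical.

```python
from collections import Counter

# Hypothetical toy directory of (first name, last name) pairs.
# The entries are illustrative only and come from no real directory.
DIRECTORY = [
    ("David", "Smith"),
    ("David", "Smith"),
    ("David", "Jones"),
    ("Mary", "Smith"),
    ("Seven", "Drake"),
    ("Anna", "Lee"),
]


def name_ambiguity(first, last, directory):
    """Estimate how many directory entries are expected to share a name.

    Assumption: first and last names are statistically independent, so
    the probability of a full name is the product of the two relative
    frequencies. A higher score means a more common, more ambiguous name.
    """
    n = len(directory)
    first_counts = Counter(f for f, _ in directory)
    last_counts = Counter(l for _, l in directory)
    p_first = first_counts[first] / n
    p_last = last_counts[last] / n
    return n * p_first * p_last


common = name_ambiguity("David", "Smith", DIRECTORY)
rare = name_ambiguity("Seven", "Drake", DIRECTORY)
print(f"David Smith: {common:.2f}, Seven Drake: {rare:.2f}")
```

Under these assumptions the common name receives the higher score, so a match on David Smith alone is weaker evidence of identity than a match on Seven Drake alone.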
Accordingly, the present inventor has identified a need for other methods of generating hyperlinks for names or, more generally, of associating data that include names.