1. Field
The present specification generally relates to methods for linking documents and, more particularly, to methods for the creation of metadata for the storage of information relating to semantically-linked documents of a document corpus.
2. Technical Background
Documents within a corpus are often linked together by citations. For example, legal documents and scientific articles often cite previous works to support a particular rule, proposition, or finding. In the legal corpus context, an author of a judicial opinion often cites previous cases in support of his or her own legal statement or rule. In turn, these cited cases have themselves cited, and/or been cited by, other cases in support of the proposition in question (and so on). Therefore, selected documents within the corpus are intrinsically linked together around particular issues, and these links can be manifested in the form of citation networks.
Researchers often search the corpus for documents that discuss a particular issue or topic. They will use the citations to move forward and backward within the corpus to find additional relevant documents. However, documents, such as legal documents, may discuss many different topics or legal issues. Further, a document may cite another document for many different reasons, and two citations pointing to the same document may cite it for different reasons. Currently, the researcher cannot determine from the citation alone the particular issue or topic for which a citing document cites a cited document. The researcher must therefore sift through the many different cited documents.
Accordingly, a need exists for alternative methods of linking documents within a corpus of documents such that the documents are linked semantically at the passage level and information regarding the semantic links may be easily and efficiently stored and accessed.