1. Field
The present specification generally relates to methods for identifying specific issues discussed within a corpus of documents and, more particularly, to methods for extracting such issues from the document corpus and organizing them into a structured issue library.
2. Technical Background
Documents within a corpus are often linked together by citations. For example, legal documents and scientific articles often cite to previous works to support a particular rule, proposition or finding. In the legal corpus context, an author of a judicial opinion often cites previous cases in support of his or her own legal statement or rule. In turn, these cited cases have themselves also cited and/or been cited by other cases in support of the proposition-in-question (and so on). Therefore, selected documents within the corpus are intrinsically linked together around particular issues, and these links can be manifested in the form of citation networks.
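As an illustration only, the citation network described above can be sketched as a directed graph in which each citation is an edge from the citing document to the cited document. The class name `CitationNetwork`, its methods, and the case labels below are hypothetical and are not taken from the specification; this is a minimal sketch of one possible representation, not a definitive implementation.

```python
from collections import defaultdict


class CitationNetwork:
    """Hypothetical directed graph of citations within a document corpus."""

    def __init__(self):
        # cites[d]: documents that d cites (earlier works).
        self.cites = defaultdict(set)
        # cited_by[d]: documents that cite d (later works).
        self.cited_by = defaultdict(set)

    def add_citation(self, citing, cited):
        """Record that `citing` cites `cited`, indexing both directions."""
        self.cites[citing].add(cited)
        self.cited_by[cited].add(citing)

    def linked_documents(self, start):
        """Return every document connected to `start` by a chain of
        citations, followed in either direction."""
        seen = {start}
        stack = [start]
        while stack:
            doc = stack.pop()
            for neighbor in self.cites[doc] | self.cited_by[doc]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    stack.append(neighbor)
        seen.discard(start)
        return seen


# Illustrative data: the case names are placeholders, not real documents.
network = CitationNetwork()
network.add_citation("Case C", "Case B")
network.add_citation("Case B", "Case A")
network.add_citation("Case D", "Case A")
network.add_citation("Case F", "Case E")

linked = network.linked_documents("Case B")
```

In this sketch, Cases A, C and D are linked to Case B through chains of citations, while Cases E and F form a separate cluster. The edges carry no information about the issue for which each citation was made.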
Researchers often search the corpus for documents that discuss a particular issue or topic. They will use the citations to move forward and backward within the corpus to find additional relevant documents. However, documents, such as legal documents, may discuss many different topics or legal issues. Further, a document may be cited for many different reasons, and two citations pointing to the same document may cite it for different reasons. Currently, the researcher cannot determine from the citation alone the particular issue or topic for which a citing document cites a cited document. The researcher must therefore sift through the many different cited documents.
Additionally, for any research project, a researcher may typically be interested only in documents pertaining to a certain issue or issues. The research process is therefore frequently impeded when a researcher is presented with documents that are unrelated to the immediate issue(s)-in-question. Further, manually searching backward and forward through citations may be a very time-consuming endeavor when researching these selected issues within the corpus, and the process leaves many users uncertain whether they have explored the full range of networked connections in sufficient depth.
Accordingly, a need exists for alternative methods of extracting and organizing issues within a corpus of documents into an issue library that may be accessed to enhance document searching capabilities.