1. Field of Art
The disclosure generally relates to the field of natural language processing, and in particular, to identifying and extracting information from documents.
2. Description of the Related Art
A contract is a document that defines legally enforceable agreements between one or more parties. During the negotiation process, parties to the contract often agree to make multiple amendments or addendums, and these amendments or addendums can be stored in arbitrary formats and in different locations.
Frequent changes in contracts often present challenges to conventional approaches for finding contracts and amendments, as conventional approaches typically focus only on the unstructured text and are unable to extract relevant and important information correctly. In addition, conventional approaches find information only at the document level and attempt to locate actual text, sentences, or sentence boundaries to allow for indexing. These approaches do not facilitate recognition, extraction, and grouping of clause types.
Moreover, conventional approaches cannot identify standard clauses, which are the distinctive clauses or terminologies used in each contract. Furthermore, conventional approaches cannot identify non-standard clauses, which are unusual variations of the standard clauses that no longer reflect the meaning of the standard clauses. For example, a contract and its amendments may include standard clauses that contain wording such as “net 30 days,” “within 30 days,” “30 days' notice,” and “2% penalty.” On the other hand, one of the amendments may include non-standard clauses such as “5 working days” with “60% penalty.” Without the ability to discover the non-standard clauses, any party that does not keep track of the amendments or addendums faces a significant risk of overlooking unusual contractual terminologies.
Accordingly, there is a need for a system that identifies and searches for non-standard and standard clauses used in contractual documents.