The generation of legal, commercial, business, and other documents is often based on standardized forms or on previously drafted “model” or “precedent” documents. The degree of reliance on such model or precedent documents ranges from the use of pure “fill-in-the-blank” type forms to more free-form text that is based on a precedent document but edited in a word processor. In either case, there are two principal techniques used to generate a commercial or legal document. The first is to “mark up” a precedent document taken from a similar transaction or relationship. The second is to begin with a standard form and customize the form to the particular circumstances. Both techniques allow words, phrases, sentences, paragraphs, or whole sections to be copied and pasted from other precedents or forms, and to be modified by insertions into or deletions from the precedent or form.
The use of standardized forms and precedents is desirable because considerations of completeness, accuracy, legal certainty, and efficiency demand the highest degree of standardization consistent with the particular circumstances. The use of precedent documents not only renders drafting, reviewing, and proofreading less expensive, but also allows readers familiar with the precedent to interpret the documents more quickly and efficiently. While many different expressions of the same substantive terms are possible, use of a form or precedent allows the drafter to take advantage of the thought and effort embodied in the previously drafted form or precedent to avoid errors, omissions, and irrelevancies.
Whether the drafter uses a standard form or a precedent document, the state of the art may have drawbacks. The precedent method allows the drafter to select a document from a prior similar transaction, reducing some instances of omissions and irrelevant or incorrect inclusion of text. However, the use of such a document may lead the drafter to duplicate drafting errors in the prior document, to include text that was appropriate only to the particular circumstances of the prior transaction, and to inadvertently omit text that was appropriate to omit only for the prior transaction. Even documents that are prepared by expert drafters may contain errors, irrelevancies, and omissions that could be ascertained if the standardized form on which the document is based were known. Furthermore, the choice of a precedent document may be uncertain without an objective measure of its degree of standardization and conformity to industry practice.
The use of a truly standardized form, on the other hand, may allow the drafter some measure of confidence (depending on the source of the form) that the language used is reasonably representative of a standard document. Drawbacks of a standardized form may include the resources and expertise necessary to develop the form, its inflexibility in the context of varied circumstances, and the lack of guidance as to customization for particular circumstances. Notes or instructions to a standard form can provide some guidance as to a few major alternative provisions but, for all but the simplest of forms, may not describe all of the relevant considerations. Further, even the largest organizations may have the resources to develop only a limited number of standard forms, and many of those forms may require ongoing updating, correction, and revision as requirements or industry customs change.
The wide variety of possible document forms and the individual deviations from those forms may also pose difficult problems for both manual and automated analysis and interpretation of documents. From the perspective of third parties who need accurate, reliable, and timely analysis of business and legal documents, the automated analysis of documents may be a considerable advantage. By way of illustration, the Securities and Exchange Commission's Electronic Data Gathering, Analysis, and Retrieval system (EDGAR), which either directly or indirectly is the primary financial disclosure reference source for financial and legal professionals, contains millions of individual files, many of which themselves contain a plurality of individual documents. On an average day, the EDGAR system receives over 1,000 additional filings, which can make manual analysis very costly.
In order to extract certain desired pieces of information from a complex document, the document must often be read manually. The document could instead be parsed and the relevant information extracted automatically if there were one standardized form for each type of document. In reality, however, there may be many such forms, and the forms themselves may be fluid, drifting from their original text as they are copied, adapted, recopied for another situation, and adapted again. As a result, in order to automatically extract desired information from particular locations in a document, tens of thousands of possible forms may need to be manually classified and parsed, and computer routines unique to each individual form may need to be written. Such a technique may be economical only for a few of the most frequently used forms, if any.
Therefore, the generation, classification, searching, and analysis of commercial, legal, and business documents may suffer from the absence of a means to identify standardized document text. In the context of both document generation and document analysis, the prior art may be insufficient to meet the foregoing needs. The prior art includes a number of systems designed to automate the generation of documents that adhere to standardized formats. U.S. Pat. No. 5,446,653 discloses a rule-based document generation system for constructing insurance policies in response to coverage information input by a user. U.S. Pat. No. 6,366,892 discloses a method for generating customized loan documents from a given database of standard provisions and optional provisions.
The prior art technologies allow the generation of a document on the basis of user-defined rules and templates. None of the prior art technologies known to applicant, however, appears to extract those rules and templates from prior documents. In addition, none of these prior art technologies appears to use these rules and templates for analyzing, as opposed to generating, customized documents. The foregoing and other problems may be solved by the system and method disclosed herein.