The invention disclosed herein relates generally to natural language data processing and, more particularly, to a computer-based method for extracting open-issue data from textual specifications.
Sophisticated techniques for archival of electrical signals representative of natural language data allow most business organizations and government agencies to store vast amounts of information in their computer systems. However, regardless of how sophisticated the archival techniques become, the stored information is virtually worthless unless such information can be extracted when requested by an individual user. As used herein, open-issue data refers to incorrectly or inadequately specified data in a textual specification. For example, an interrogatory sentence (i.e., a question) is generally not expected in a textual specification. If the textual specification does contain a question, there is a high probability that the question indicates an open issue which the user must resolve.
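By way of illustration only, the interrogatory-sentence example above can be sketched in Python as follows. The function name find_question_sentences, the sample text and the naive punctuation-based sentence splitting are all assumptions made for this sketch; they are not taken from the disclosed method, which relies on predetermined linguistic analysis rather than on this simple rule.

```python
import re


def find_question_sentences(text):
    """Return the sentences in `text` that end with a question mark.

    Hypothetical helper for illustration; not the disclosed method.
    """
    # Naive sentence split: break on whitespace that follows '.', '!' or '?'.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    # A sentence ending in '?' is treated as a candidate open issue.
    return [s for s in sentences if s.endswith("?")]


# Hypothetical excerpt from a textual specification.
spec = (
    "The system shall log every transaction. "
    "Should the log be encrypted at rest? "
    "The operator shall be able to export the log."
)

for sentence in find_question_sentences(spec):
    print("Possible open issue:", sentence)
```

Such a rule applies the same criterion on every run, so the same text always yields the same candidates. It would, however, miss open issues that are not phrased as questions.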
Typical manual techniques for extracting open-issue data incorporated in a natural language textual specification, such as a "Systems Requirements Document" or a "Proposal Request Document", are in general labor-intensive and error-prone. Such techniques generally involve a number of highly experienced users who must thoroughly read and understand the content and intent of the document, and ultimately reach a consensus with regard to the open-issue data contained in such textual specification. The problem is compounded because the selection of what is or is not an open issue is often highly subjective, and the criteria for selecting such open issues are likely to vary from user to user, or even for the same user over time.
Alternative techniques have been suggested which attempt to make it easier for the user to retrieve specific passages of text and then allow the user to declare that an open issue has been found. Unfortunately, such techniques at best only provide an interface between the user and the textual specification. They therefore do not resolve the problem of inconsistency, because the user must still make a subjective decision in order to declare that an open issue has been found.
It is therefore an object of the present invention to provide an improved method for extracting open-issue data which is not subject to the foregoing disadvantages of existing open-issue extracting techniques.
It is another object of the present invention to provide a computerized method for consistently extracting open-issue data so that such data can be easily reviewed, adjudicated and managed by the user, thereby improving productivity as well as quality in the extraction of open-issue data.
It is a further object of the invention to provide a method for automatically extracting open-issue data based on predetermined linguistic analysis performed upon a text corpus representative of a predetermined natural language.