A question answering system is a system that processes natural language statements and/or questions. The question answering system of the present invention uses fuzzy logic concepts. Fuzzy logic is described, for example, in Michael Smithson's Fuzzy Set Analysis for Behavioral and Social Sciences, published by Springer-Verlag New York Inc., 1987.
The proposed system relates in particular to database question answering problems that arise during the design of computer systems, and it is described in depth below. During the design, test, and release phases of the development of a computer system, a number of databases, in the form of libraries, are developed and maintained for a variety of purposes such as error tracking and bookkeeping. To understand and improve the development process, previously developed databases may be used at a later date as representative of the entire process or of part of it. We have analyzed such databases with the intent to develop algorithms and tools for future use. If a database has been developed with a particular purpose in mind, then it can be used in its entirety for future studies serving that purpose, because the objective of the database was specified a priori. For example, if a library has been developed to report the functional errors discovered during the hardware design of a system, then such a library can be used in its entirety at the end of the development for a study concerning functional errors in hardware design.
However, on a number of occasions, a developed database may need to be used for a purpose other than the one originally anticipated. Such a necessity may arise for a variety of reasons, including unanticipated studies. For example, a library that is created at the beginning of the development cycle for bookkeeping purposes contains information regarding the history of the logic design, and it may also contain information related to functional testing studies. If it is assumed that the database is accessed for routine bookkeeping functions, then when a functional error is discovered it would be desirable to be able to correct the error. Additionally, provided that it is applicable, such a database may be considered more representative for functional error studies than an error tracking library, if the latter was developed only at the integration phase of the system development.
When a database is suspected of containing information pertinent to a process, it is possible to presume that the entire database is pertinent to the intended application. An example of this can be found in An Application of Cyclomatic Complexity Metrics to Microcode by Timothy McNamara, TR 01.A517, IBM Corporation, Systems Product Division, Endicott, N.Y. However, this presumption may not be a good choice in most circumstances, because such a database was not developed to accommodate the application of interest, and the presumption may result in erroneous conclusions regarding the development process. In essence, it is advisable to investigate the suspected relevance of the database to an intended application. To assess whether a database contains information pertinent to a subject of interest, it would be desirable to have a tool that can assess the validity of that decision and, when the library is found to be pertinent to the subject of interest, exclude the irrelevant library entries. We know of no tool like the one we developed which provides the appropriate functions.
Since databases which emerge during the development of a computer system generally contain "comments", the database issues described above can be addressed for such systems by examining those comments.
A validation methodology could be developed using probability theory; however, this approach may not be the most appropriate for this type of application. The validation of the database must be carried out from the comments of the database, which are written in a natural language, through the use of some form of common reasoning. As indicated by G. Klir in his paper "Is There More to Uncertainty Than Some Probability Theorists Might Have Us Believe?", published in the International Journal of General Systems, Vol. 15, No. 4, April 1989:
An attempt to reduce the non-specificity or fuzziness inherent to the natural language descriptions may be unwarranted, and

"It is not clear to me how probability theory could effectively describe and manipulate the great variety of descriptions or rules that are possible in natural language."

A different approach to the validation of a commented database might be to develop a natural language question answering system. While it may be entirely possible to develop such a system, possibly with the use of fuzzy relations, such a solution may not be the most advantageous for a number of reasons, including the following:

A computer system may be developed in more than one country, and consequently the comments of the databases may be written in more than one natural language, implying that more than one system needs to be developed. An example of this is the IBM 9370 computer systems, which were developed in a number of countries having different natural languages, including the United States and West Germany.

Verifying the validity of a database is not the final purpose of the study. The development of natural language question answering systems may require a substantial effort and, if the database is not appropriate to the application, the database may need to be discarded. The concern here is not so much with the possible solution to the problem but rather with the implementation and development efforts, especially when it may be determined at a later phase of the project that the database does not pertain to the intended purpose of an application.

A natural language question answering system may not be applicable in its entirety to different databases and/or different types of investigations. Consequently, at least a portion of the question answering system needs to be modified and/or expanded to reflect the desired application.

As a consequence of our examination of the possibilities just outlined, we believe there is an unfulfilled need for a new tool that will expedite and facilitate the evaluation of a commented database with minimal effort. Such a tool should be applicable in a variety of circumstances with minimal additional development effort. Such a tool will allow more time to be spent on the analysis of the database rather than on the assessment of its applicability. Moreover, as we conceive of it, if it is assessed that the system is not accurate enough to guarantee a reasonable exclusion of comments, the system could then be used as an indicator of compliance of a database with a pre-specified application, and its capabilities may be extended with the use of a natural language question answering system to further investigate the relevance of the database.

With a tool having the capabilities of the one we have created, a natural language question answering system need be implemented only when a more accurate analysis is required. Consequently, the development of such a natural language question answering system will take place only when it is needed rather than a priori, as it would without our tool.

The solution which we provide is a question answering system based on fuzzy logic. The tool provides a quick assessment of the applicability of a database to a specified universe of discussion and the exclusion of the irrelevant comments of the database. For the purpose of background discussion in the detailed description which follows, we have drawn on a few publications, including those mentioned at the beginning of this application and the following:

1. "Fuzzy Techniques in Pattern Recognition" (book) by A. Kandel, published by John Wiley and Sons, New York, 1982;

2. "Fuzzy Set Analysis for Behavioral and Social Sciences" (book) by Michael Smithson, published by Springer-Verlag New York Inc., 1987;

3. "On the Precision of Adjectives which Denote Fuzzy Sets" by M. Kochen and A. Badre, published in the Journal of Cybernetics, Vol. 4, No. 1, January 1974;

4. "Modelling Membership Functions" by P. Zysno, published in Empirical Semantics, Vol. 1, January 1981.
In the following sections, we provide a brief description and an intuitive justification and reasoning for the development of the fuzzy question answering system. The concept of degree of confidence, as it relates to words and comments within a database, is then formally defined and formulated. In the subsequent discussion, the fuzzy evaluator algorithm is described, and its capabilities are discussed.
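As an informal illustration of the idea outlined above, the following Python sketch assigns each word a degree of confidence in [0, 1] with respect to a universe of discussion, derives a confidence for each comment, and excludes comments that fall below a threshold. It is only a sketch under stated assumptions and not the formulation defined in the sections that follow: the word values in WORD_CONFIDENCE, the use of the maximum as the aggregation rule (a standard fuzzy union), the mean as an applicability indicator, the 0.5 threshold, and all function names are hypothetical choices made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical degrees of confidence that a word pertains to the universe of
# discussion "functional errors". These values are illustrative assumptions.
WORD_CONFIDENCE: Dict[str, float] = {
    "error": 0.9,
    "bug": 0.9,
    "fix": 0.7,
    "timing": 0.4,
    "update": 0.2,
}


def comment_confidence(comment: str, word_confidence: Dict[str, float]) -> float:
    """Confidence of a comment, taken here as the maximum over its words.

    The maximum is an assumed aggregation rule; unknown words count as 0.0.
    """
    words = (w.strip(".,;:!?\"'()").lower() for w in comment.split())
    return max((word_confidence.get(w, 0.0) for w in words), default=0.0)


def assess(
    comments: List[str],
    word_confidence: Dict[str, float],
    threshold: float = 0.5,
) -> Tuple[List[str], float]:
    """Return the comments kept and a rough applicability indicator.

    Comments whose confidence is below the (assumed) threshold are excluded.
    The indicator is the mean comment confidence, also an assumption.
    """
    scored = [(c, comment_confidence(c, word_confidence)) for c in comments]
    kept = [c for c, score in scored if score >= threshold]
    applicability = sum(s for _, s in scored) / len(scored) if scored else 0.0
    return kept, applicability


if __name__ == "__main__":
    library = ["Fix parity error in ALU.", "Update copyright notice."]
    kept, applicability = assess(library, WORD_CONFIDENCE)
    print(kept, applicability)
```

Because the sketch relies only on a table of word confidences, the same code could in principle be applied to comments in another natural language by supplying a different table, which is the kind of low-effort reuse the discussion above calls for.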