1. Field of the Invention
The present invention relates to an information retrieval system. More particularly, it relates to an information retrieval system with a neuro-fuzzy approach for building a customizable thesaurus, tolerating an operator's input errors, performing fuzzy queries, and providing a quick select option by prioritizing the information retrieved by the system according to the operator's preference.
2. Description of the Related Art
The rapid growth of the Internet and the information industry has diversified the ways in which information is retrieved. It has become crucial for an operator to find the information of interest effectively. For example, a programmer who develops software with object-oriented technology may, upon receiving a demand for large-scale software development and acquiring the formal specification, hope to quickly assemble a prototype application directly from appropriate retrieved, previously developed software components and then test it. The same demands arise in public library query systems, World Wide Web search tools, and other information services (such as the management and query of massive news archives and legal documents), where operators hope to obtain information that meets their professional needs.
However, the exponential growth in the volume of information and in the number of software components included in software component libraries has introduced uncertainty into the entire process of retrieving information and software components. Retrieval efficiency normally depends on the operator's experience and expertise, that is, on whether the query words the operator inputs precisely describe the features of the desired information. Consequently, information retrieval systems have become a critical issue in information management and information recycling technology.
Conventional information retrieval systems are constructed based on Boolean logic. That is, the query words entered by an operator are matched with specific keywords in the information or software components to determine whether they meet the operator's needs. Such conventional information retrieval systems offer very limited flexibility and have the following drawbacks:
    1. A typical information retrieval system normally expects the query words an operator enters upon accessing query libraries to fully match the keywords provided by the system. Those keywords are formulated to describe the features of an information archive or software component when the system is built. In practice, however, it is nearly impossible to demand that operators use precisely the keywords stored in the system.
To tackle this problem, the systems of the prior art create a thesaurus for the keywords. That is, each query word in the system is defined by a group of synonyms in the thesaurus, so that an input word is taken to be a query word when one of its synonyms is entered. However, such a thesaurus generally lists the correspondence between query words and synonyms in a rigid, fixed form.
    2. Conventional information retrieval systems assume that the query words entered by the operator are accurate, i.e., they tolerate no erroneous input. Perfect accuracy cannot be guaranteed for routine input, and synonymous words with a different part of speech may be entered. For example, an operator may skip a character in typing a query word, or may enter the noun form of a query word instead of the verb form found in the thesaurus. Such situations prevent the operator from finding a corresponding keyword.
To tackle this problem, the prior art has created a replace table to make up for the deficiency. The replace table lists possible errors, declensions, and inflections of the query words in the thesaurus in order to correct potential input errors by the operator. However, this kind of replace table is quite rigid and basically unable to compensate for all possible errors.
    3. Information retrieval systems of the prior art fail to process indefinite information, that is, they fail to provide fuzzy query processing. The operator has to specify precisely the various descriptors or features of the query for information archives or software components in order to obtain the desired information. That is to say, the operator must have certain knowledge of the information documents or software components to be queried; otherwise, the result of a query based on erroneous information will not fulfill expectations.
    4. Lastly, a significant number of documents or software components can normally be retrieved for a specific keyword in conventional information retrieval systems. Advanced information retrieval systems may generally determine the correspondence between the documents or software components and the query words and output a ranking. Since the ranking-filtering functions are rigid, every operator has to follow the same ranking-filtering procedures. Consequently, no flexibility is provided in using the systems.
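By way of illustration only, the conventional scheme discussed above (Boolean keyword matching supplemented by a fixed thesaurus and a fixed replace table) may be sketched as follows. The sketch is not taken from any particular prior-art system; the component names, keywords, table entries, and function names are all hypothetical and are assumed solely to show why such fixed tables fail on inputs they did not anticipate.

```python
# Hypothetical illustration of a conventional Boolean retrieval scheme.
# All names and data below are invented for this sketch.

# Keywords assigned to each component when the (hypothetical) library is built.
COMPONENT_KEYWORDS = {
    "StackComponent": {"stack", "push", "pop"},
    "QueueComponent": {"queue", "insert", "remove"},
}

# Rigid thesaurus: each listed synonym maps to exactly one stored query word.
THESAURUS = {
    "add": "insert",
    "delete": "remove",
}

# Rigid replace table: a fixed list of anticipated errors and inflections.
REPLACE_TABLE = {
    "insertion": "insert",
    "remov": "remove",
}


def normalize(word):
    """Map an input word to a stored query word using only the two fixed tables."""
    word = word.lower()
    word = REPLACE_TABLE.get(word, word)
    return THESAURUS.get(word, word)


def boolean_query(words):
    """Return the components whose keyword sets contain every normalized word (AND)."""
    wanted = {normalize(w) for w in words}
    return sorted(
        name for name, keywords in COMPONENT_KEYWORDS.items() if wanted <= keywords
    )


print(boolean_query(["queue", "add"]))        # synonym listed in the thesaurus
print(boolean_query(["queue", "insertion"]))  # inflection listed in the replace table
print(boolean_query(["queue", "insrt"]))      # typing error not listed anywhere
print(boolean_query(["queue", "append"]))     # synonym not listed anywhere
```

In this sketch the first two queries retrieve the hypothetical QueueComponent, whereas the last two retrieve nothing, because neither the skipped character nor the unlisted synonym appears in either table. The match is also all-or-nothing: no degree of relevance is computed, and every operator receives the same result.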
Current information retrieval systems, as described above, can perform basic query and retrieval tasks; however, they are not convenient to use, and operators have difficulty finding appropriate files or software components with them. The present invention is intended to resolve this problem.