1. Field of the Invention
The present invention relates to information retrieval in a data processing system. The present invention further relates to a method for searching a document database such as the Internet and ranking the results obtained from such a search.
2. Description of Related Art
A computer's logic is both its strength and its weakness; it can only perform what it is told to do. If an alarm clock is set to go off at 6:00 PM, it will go off at exactly that time, even if it was obviously meant to go off at 6:00 AM. People in the real world can solve problems and make decisions relatively easily, but even the simplest decisions are often too difficult to be handled by a computer. Fuzzy logic query processing helps to bridge the gap.
Databases are strategic tools because they support business processes. In order for a database to be useful, data must be compiled into information using tools such as queries. Queries allow a user to specify what data to retrieve from a database, and in what form. Fuzzy querying provides a way to retrieve data that was intended to be retrieved, without requiring exact parameters to be defined.
Non-fuzzy query processing relies on Boolean logic, which limits results to true or false (1 or 0). Fuzzy query processing is a superset of Boolean logic that can handle partial truths. Instead of search results being limited to a value of true or false, the query returns values as x% true, or as x% a member of a subset.
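The contrast between a Boolean result and a partial truth can be illustrated with a minimal sketch. The "tall" predicate, the 180 cm cutoff, and the 160 cm to 190 cm ramp are hypothetical values chosen only for illustration; they do not come from the system described here.

```python
def is_tall_boolean(height_cm: float) -> bool:
    """Boolean predicate: a person either is tall or is not (assumed cutoff: 180 cm)."""
    return height_cm >= 180.0


def tall_membership(height_cm: float) -> float:
    """Fuzzy predicate: degree of membership in the set 'tall', from 0.0 to 1.0.

    Assumed shape: 0 at or below 160 cm, 1 at or above 190 cm, linear in between.
    """
    return min(1.0, max(0.0, (height_cm - 160.0) / 30.0))


if __name__ == "__main__":
    for height in (155.0, 175.0, 195.0):
        print(height, is_tall_boolean(height), f"{tall_membership(height):.0%} tall")
```

A person of 175 cm is simply "not tall" under the Boolean predicate, while the fuzzy predicate reports a partial degree of membership.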
Fuzzy queries rely on the use of fuzzy quantifiers. Dr. Lotfi A. Zadeh, the founder of fuzzy logic theory, defined two kinds of quantifiers: absolute and relative. Absolute quantifiers are represented as fuzzy subsets of the non-negative numbers and correspond to phrases such as “at least three” or “about five.” Relative quantifiers are represented as fuzzy subsets of the unit interval and correspond to phrases such as “most,” “at least half,” or “almost all.”
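The two kinds of quantifiers can be sketched as membership functions. The shapes below are assumptions made for illustration: "about five" is modeled as a triangle of half-width 2 centered on 5, and "most" as a ramp rising from a proportion of 0.3 to 0.8. The text above does not prescribe any particular shape.

```python
def about_five(count: float) -> float:
    """Absolute quantifier: a fuzzy subset of the non-negative numbers.

    Assumed shape: triangular, peaking at 5 and reaching 0 at 3 and 7.
    """
    if count < 0:
        raise ValueError("absolute quantifiers are defined on non-negative numbers")
    return max(0.0, 1.0 - abs(count - 5.0) / 2.0)


def most(proportion: float) -> float:
    """Relative quantifier: a fuzzy subset of the unit interval [0, 1].

    Assumed shape: 0 up to 0.3, 1 from 0.8 upward, linear in between.
    """
    if not 0.0 <= proportion <= 1.0:
        raise ValueError("relative quantifiers are defined on the unit interval")
    if proportion <= 0.3:
        return 0.0
    if proportion >= 0.8:
        return 1.0
    return (proportion - 0.3) / 0.5
```

An absolute quantifier takes a count as its argument, while a relative quantifier takes a proportion, which is why the two are defined over different domains.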
Fuzzy queries do not take the place of the more structured queries, but expand the alternatives available. Boolean systems use selection and then ordering as a mechanism, whereas a fuzzy system relies on a single mechanism of overall membership degree. A fuzzy system allows for compromise between the various criteria, whereas a Boolean system can only produce a subset of previously selected elements. There are times when Boolean logic is too rigid to be meaningful to a user. A fuzzy query allows a user to find elements that satisfy a criterion to some degree and ranks the results accordingly.
FIG. 1 illustrates the typical flow of a fuzzy database query. After identifying the need for a report 100, the user queries a database 110. The database returns a record, which is matched against predefined criteria 120 to determine the degree to which a match has occurred. The degree is then compared to a threshold value 125 to determine whether the record satisfies the user's query 130 or whether the record should be discarded 135. In general, a fuzzy database query differs from a non-fuzzy query by adding steps to match the data to predefined criteria and to compare the resulting degree to a threshold specified in the query.
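The flow of FIG. 1 can be sketched in a few lines. Everything named here is hypothetical: the `fuzzy_query` function, the house records, the "priced about 200,000" criterion, and the choice to keep records whose degree is greater than or equal to the threshold are assumptions for illustration, not details taken from the figure.

```python
from typing import Callable, Iterable


def fuzzy_query(
    records: Iterable[dict],
    degree_of_match: Callable[[dict], float],
    threshold: float,
) -> list[tuple[dict, float]]:
    """Match each record against the criteria, keep those meeting the threshold,
    and return them ranked by degree of match (highest first)."""
    accepted = []
    for record in records:                    # records returned by the database (110)
        degree = degree_of_match(record)      # match against predefined criteria (120)
        if degree >= threshold:               # compare to the threshold value (125)
            accepted.append((record, degree)) # record satisfies the query (130)
        # otherwise the record is discarded (135)
    accepted.sort(key=lambda pair: pair[1], reverse=True)
    return accepted


def priced_about_200k(record: dict) -> float:
    """Assumed criterion: full match at 200,000, falling to 0 at a distance of 50,000."""
    return max(0.0, 1.0 - abs(record["price"] - 200_000) / 50_000)


if __name__ == "__main__":
    houses = [
        {"id": "A", "price": 230_000},
        {"id": "B", "price": 210_000},
        {"id": "C", "price": 300_000},
        {"id": "D", "price": 195_000},
    ]
    for house, degree in fuzzy_query(houses, priced_about_200k, threshold=0.5):
        print(house["id"], round(degree, 2))
```

With a threshold of 0.5, houses D and B are returned in that order, while A and C are discarded. An exact-match query for a price of 200,000 would have returned nothing.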
An object can be a member of multiple sets, with a different degree of membership in each. The degree of membership is measured on a scale from zero to one. Complete membership has a value of one, and no membership has a value of zero. When running a fuzzy query in a control system, the output is calculated based on the degree of membership a given input has in the configured fuzzy sets. Each combination of sets is configured to have a specified output. The output is based on the weighted sum of the amount of membership in each set. The fuzzy models may be used in conjunction with probabilistic models to find a solution.
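As a minimal sketch of the weighted sum described above, consider a hypothetical heater controller. The three temperature sets, their shapes, and the heater power configured for each set are all assumptions chosen for illustration. Many practical controllers also divide by the total membership; that normalization is omitted here because these particular sets already sum to one between 10 and 30 degrees.

```python
def cold(temp_c: float) -> float:
    """Membership in 'cold': 1 at or below 10 C, falling to 0 at 20 C (assumed)."""
    return min(1.0, max(0.0, (20.0 - temp_c) / 10.0))


def warm(temp_c: float) -> float:
    """Membership in 'warm': peaks at 20 C, reaches 0 at 10 C and 30 C (assumed)."""
    return max(0.0, 1.0 - abs(temp_c - 20.0) / 10.0)


def hot(temp_c: float) -> float:
    """Membership in 'hot': 0 at or below 20 C, rising to 1 at 30 C (assumed)."""
    return min(1.0, max(0.0, (temp_c - 20.0) / 10.0))


# Hypothetical configuration: heater power (percent) assigned to each set.
SET_OUTPUTS = {cold: 100.0, warm: 40.0, hot: 0.0}


def heater_output(temp_c: float) -> float:
    """Weighted sum of the configured outputs, weighted by degree of membership."""
    return sum(member(temp_c) * output for member, output in SET_OUTPUTS.items())


if __name__ == "__main__":
    t = 17.5
    print(cold(t), warm(t), hot(t))  # one input, three different degrees of membership
    print(heater_output(t))
```

At 17.5 degrees the input is 0.25 cold, 0.75 warm, and 0.0 hot at the same time, so the output is 0.25 × 100 + 0.75 × 40 = 55.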
FIG. 2 shows the three transformations of the system inputs 200 to outputs 205 in a fuzzy system. The process of “fuzzification” 210 is a methodology to generalize any specific theory from a precise form to a continuous form. It decomposes a system input or output into one or more fuzzy sets. After the decomposition into fuzzy sets, fuzzy rule association 215 applies a set of rules to a combination of inputs. The rules determine the action and map the variable to a numeric value. Once the numeric value is determined, de-fuzzification 220 converts the fuzzy result into an exact output value.
For example, telling a driving student to apply the brakes 74 feet from the crosswalk is too precise to be followed. Vague wording like “apply the brakes soon”, however, can be interpreted and acted upon. The instruction is received in a fuzzy form, the person associates the message using past experiences, then defuzzifies the message in order to actually apply the brakes at the appropriate time. Fuzzy queries expand query capabilities by allowing for ambiguity and partial membership.
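The braking example can be sketched as the three transformations of FIG. 2. This is only an illustrative sketch: the two inputs (distance and speed), the membership ramps, the three rules, the use of the minimum for "and," and the weighted-average de-fuzzification are all assumptions, not details taken from the figure.

```python
def ramp_down(x: float, full: float, zero: float) -> float:
    """Membership of 1 at or below `full`, falling linearly to 0 at `zero`."""
    if x <= full:
        return 1.0
    if x >= zero:
        return 0.0
    return (zero - x) / (zero - full)


def fuzzify(distance_ft: float, speed_mph: float) -> dict[str, float]:
    """Fuzzification (210): decompose each crisp input into fuzzy sets."""
    near = ramp_down(distance_ft, 50.0, 150.0)
    slow = ramp_down(speed_mph, 10.0, 30.0)
    return {"near": near, "far": 1.0 - near, "slow": slow, "fast": 1.0 - slow}


def apply_rules(m: dict[str, float]) -> list[tuple[float, float]]:
    """Rule association (215): each rule yields (firing strength, brake pressure)."""
    return [
        (min(m["near"], m["fast"]), 1.0),  # near AND fast -> brake hard
        (min(m["near"], m["slow"]), 0.5),  # near AND slow -> brake moderately
        (m["far"], 0.0),                   # far           -> do not brake
    ]


def defuzzify(fired: list[tuple[float, float]]) -> float:
    """De-fuzzification (220): weighted average of the rule outputs."""
    total = sum(strength for strength, _ in fired)
    if total == 0.0:
        return 0.0
    return sum(strength * output for strength, output in fired) / total


def brake_pressure(distance_ft: float, speed_mph: float) -> float:
    """Crisp brake pressure in [0, 1] for a crisp distance and speed."""
    return defuzzify(apply_rules(fuzzify(distance_ft, speed_mph)))
```

For instance, `brake_pressure(100, 20)` fires all three rules equally and yields a moderate pressure of 0.5, while `brake_pressure(40, 35)` yields full pressure.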