Search systems currently in use on the World Wide Web (WWW) are of either the keyword search type or the full-text search type. Because such systems return a very large number of search results, a user must expend a great deal of effort to locate a target document. Various methods have been employed in attempts to resolve this problem. According to one of them, a search request is submitted in the form of a sentence, rather than as the logical product or the logical sum of several keywords, and a search is made for sentences that resemble the sentence used to request the search. Technically, this method can be broken down into the following sub-methods:
(1) A vector space model method
(2) A keyword location constraint matching method
(3) A sentence matching method

QT = TREE
TREE = (WORD) | (CHILD+ HEAD CHILD*) | (CHILD* HEAD CHILD+)
HEAD = (`HEAD` TREE)
CHILD = (FUNC TREE) | (TREE FUNC)
FUNC = `FN` WORD

"XXX sha no YYY sha heno teiso"
(((XXX sha)FN no)((YYY sha)FN heno)(HEAD(teiso)))

"lawsuit of XXXCo. to YYYCo."
(((HEAD(lawsuit))(FN of (XXXCo.))(FN to (YYYCo.)))

order constraint: The positional relationship between CHILD and HEAD must be maintained. For example, when HEAD is located after CHILD, the relationship should be described as CHILD → HEAD.

neighbor-order constraint: A HEAD word and an FN word in a NODE must be neighbors, while their positional relationship is maintained. Being neighbors means that these words are located within a distance delineated by a count of words that is equivalent to a numeral provided as a parameter. For example, when FNWORD is in the neighborhood that follows the NODE, the positional relationship is described by NODE → FNWORD.

XXX sha → teiso
YYY sha → teiso
XXX sha → no
YYY sha → heno

lawsuit → XXXCo.
lawsuit → YYYCo.
of → XXXCo.
to → YYYCo.

search request: A ... B ... C ... D ... E ... F
syntactic tree: ((FN fn_1 ((FN fn_2 (A))(HEAD(B))))
document 1: ... A ... B ... C ... D ... E ... F ...
document 2: ... A ... B ... C ... D ... E ... F ...
document 3: ... C ... D ... E ... A ... B ... F ...
document 4: ... D ... C ... E ... A ... B ... F ...
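The order constraint and the neighbor-order constraint described above can be illustrated with a minimal Python sketch. It is not the claimed implementation: the function names, the whitespace tokenization, and the assignment of each listed pair to one constraint type or the other are assumptions made for illustration, using the English example "lawsuit of XXXCo. to YYYCo."

```python
def satisfies_order(tokens, first, second):
    """Order constraint: some occurrence of `first` precedes some occurrence of `second`."""
    if first not in tokens or second not in tokens:
        return False
    first_pos = tokens.index(first)                           # earliest `first`
    last_pos = len(tokens) - 1 - tokens[::-1].index(second)   # latest `second`
    return first_pos < last_pos


def satisfies_neighbor_order(tokens, first, second, max_distance):
    """Neighbor-order constraint: `second` follows `first` within `max_distance` words."""
    first_positions = [i for i, t in enumerate(tokens) if t == first]
    second_positions = [i for i, t in enumerate(tokens) if t == second]
    return any(0 < j - i <= max_distance
               for i in first_positions for j in second_positions)


def satisfies_all(tokens, order_pairs, neighbor_pairs, max_distance):
    """True only if every constraint of both kinds holds for the token list."""
    return (all(satisfies_order(tokens, a, b) for a, b in order_pairs)
            and all(satisfies_neighbor_order(tokens, a, b, max_distance)
                    for a, b in neighbor_pairs))


# Assumed reading of the English example: pairs involving the HEAD word are
# order constraints, pairs involving an FN word are neighbor-order constraints.
order_pairs = [("lawsuit", "XXXCo."), ("lawsuit", "YYYCo.")]
neighbor_pairs = [("of", "XXXCo."), ("to", "YYYCo.")]

matching = "lawsuit of XXXCo. to YYYCo.".split()
reordered = "XXXCo. to YYYCo. of lawsuit".split()

print(satisfies_all(matching, order_pairs, neighbor_pairs, max_distance=1))
print(satisfies_all(reordered, order_pairs, neighbor_pairs, max_distance=1))
```

In this sketch the distance parameter is counted in whitespace-separated tokens, so a value of 1 requires the two words to be directly adjacent.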
The vector space model method (1) (Salton, G., "Automatic Text Processing: The Transformation, Analysis, and Retrieval of Information by Computer," Addison-Wesley Publishing, 1989) is a method whereby a document and a search request are each regarded as vectors, with their keywords acting as axes, and similarity is calculated by using the distance between the vectors. However, because this method merely assumes that each keyword in a search request appears independently, it cannot cope with a situation in which a keyword in the search request just happens to be included in a large document.
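As a rough illustration of method (1), the following Python sketch treats each distinct token as an axis and compares a request with documents. Cosine similarity is assumed here as the vector measure, and the tokenization and function names are hypothetical; the cited reference may define the measure differently.

```python
import math
from collections import Counter


def keyword_vector(text):
    """Map text to a keyword-count vector; each distinct token is one axis."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse keyword vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


request = keyword_vector("lawsuit of XXXCo. to YYYCo.")
doc_in_order = keyword_vector("lawsuit of XXXCo. to YYYCo.")
doc_shuffled = keyword_vector("YYYCo. of lawsuit to XXXCo.")

# Both documents contain the same keywords, so they receive the same score
# even though the shuffled one reverses who is suing whom.
print(cosine_similarity(request, doc_in_order))
print(cosine_similarity(request, doc_shuffled))
```

Because each keyword contributes independently, the score carries no information about where the keywords occur relative to one another.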
The keyword location constraint matching method (2) (Hideki Tanaka, "Fast Method for Obtaining a Similarity for a Long Japanese Expression," reference material for the Language Processing Research Group of the Information Processing Institute, NLWG121-10, 1997) extracts keywords from a search request and treats as a match those keywords that satisfy a total-order relationship with respect to the locations at which they appear. This method is superior to method (1), but is inferior to method (3) in that only the locations of the keywords are used as constraints.
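The total-order idea behind method (2) can be sketched as a subsequence test: the keywords of the request must all appear in the document in the same order, with any number of other words in between. This is a simplified reading for illustration only; the function name and the sample token lists are hypothetical, and the cited method computes a similarity rather than a yes/no answer.

```python
def matches_in_order(keywords, doc_tokens):
    """True if every keyword occurs in doc_tokens, in the given order."""
    remaining = iter(doc_tokens)
    # `k in remaining` consumes the iterator up to and including the first
    # match, so each later keyword must be found after the previous one.
    return all(k in remaining for k in keywords)


request_keywords = ["A", "B", "C", "D", "E", "F"]

ordered_doc = "x A x B C x D E x F".split()   # keywords in request order
permuted_doc = "C D E A B F".split()          # same keywords, different order

print(matches_in_order(request_keywords, ordered_doc))
print(matches_in_order(request_keywords, permuted_doc))
```

Unlike the vector comparison in method (1), this test rejects a document that contains all of the keywords in a different order, but it still knows nothing about the syntactic roles of the words.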
Method (3), the sentence matching method, analyzes a search request and a document and obtains a match at the syntactic tree level. Although this appears to be an ideal method, the accuracy and the speed attained by syntactic analysis are unsatisfactory, and the method is therefore not widely employed.
It is therefore one object of the present invention to provide a search method and system for maintaining a balance between the accuracy and the speed attained during syntactic analysis.
It is another object of the present invention to provide a method and a system for performing an efficient network search.
It is an additional object of the present invention to provide a search method and system that does not syntactically analyze a document to be retrieved.
It is a further object of the present invention to provide a search method and system that employ location constraint data in a search request sentence.