The present invention relates to computer software for database manipulation, and more particularly to a system and method for cascading search methodologies on selected sets of data from one or more electronic catalogs.
Searchable electronic catalogs are commonly used in support of various electronic commerce and purchasing functions. These catalogs must have a user interface for selectively retrieving data records. Engineers desire to make the user interfaces as simple as possible to operate, because complexity of the user interface can be a detriment to sales from the catalog. Simplicity becomes particularly important when the catalog is intended to be accessed by users with varying levels of skill or training. In particular, the results of the search should quickly and easily direct the user to the most desirable supplier or source for the requested goods.
User interfaces that are simple to operate should have the capability to handle almost any type of user input. In the case of an electronic catalog, if the user knows the exact part number of the desired product and enters the part number correctly into the user interface, then the database search engine will quickly identify the desired record from the database based on an exact match with the search string. In a more general case, the user may have only partial information about the desired product, or may incorrectly type the search string.
Similarly, the output of the search should be easy to understand. In an era when large accumulations of data are often available, there may be very large aggregations of catalog data in which to search and retrieve items. Ideally, a catalog search engine would have a mechanism for systematically searching through large electronic catalogs so that only the most relevant results are displayed to the user.
An over-abundance of catalog data can be problematic for at least two reasons: (1) the desired item may be available from many different suppliers, which creates a needlessly confusing array of output options for the user; and (2) computer system resources are expended to needlessly search for the desired item in the entire catalog database when a smaller, faster search would have uncovered the item from a preferred supplier. Managing the output options available to a user is particularly important in a corporate context in which individual employees are given the option of ordering their own supplies. In such a system, managers may wish to define a particular hierarchy of suppliers and enforce that hierarchy on users by only displaying the most desirable sources for items.
Previous systems have not adequately addressed the problems of searching large accumulations of catalog data and reporting the results in an efficient manner. Danish et al. in U.S. Pat. No. 5,715,444 disclose a process for identifying a single item from a family of items in a database. A feature screen and search process present the user with a guided nonclassification parametric search to identify matching items based upon user specified criteria and priorities. Also disclosed are a method and system appropriate in an Internet environment.
Cochran et al. in U.S. Pat. Nos. 4,879,648 and 5,206,949 disclose a method of variably displaying search terms in which two control inputs are used to select a plurality of terms for a plurality of categories. A term in a visible position on the screen becomes a search term or a qualifier for other records in the database. The search results are dynamically formed on the basis of selected search terms. The search results can also be grouped in fixed or static lists.
More recently, Aalbersberg in U.S. Pat. No. 5,946,678 discloses a user interface for document retrieval in which each query word is displayed by means of a distinctive representation. In a subsequent results window, each document header or title is accompanied by an indicator which employs the same distinctive representation to directly indicate to the user the relative contributions of the individual query words to each listed document. The distinctive representation can take several forms, such as by a different color or by means of hatching or shading or by displayed icons.
Efficiently searching through an electronic catalog has been the focus of much effort. Prior catalog search algorithms typically employ one of two search strategies. The first strategy is a keyword search for selecting database records based on matching text strings. The second strategy is a classification search for selecting database records based on lists of classifications from which to narrow and select the database records. Each of the two search strategies has disadvantages that can make it difficult for users to find their desired database records.
The keyword search strategy has the disadvantage that users must be familiar with the appropriate keyword terms that are likely to yield the desired data records. In addition, it is not always possible to quickly collect groups of logically related data records. If a close match is found, but it is not the desired exact match, it is not always possible to utilize the information in the close match to quickly identify all similar data records. A keyword search engine does not typically have a “more-like-this” function that operates on close matches to identify similar items within the database.
The classification search strategy can take advantage of a logical grouping of data records. This search strategy is best suited for finding data that break down logically into successively greater levels of detail. This search strategy is most effective when the data have been carefully edited and structured within a database. Finding a single relevant record can quickly lead to all other relevant records, as long as the grouping logic relates to the way in which the data are used. Thus, a “more-like-this” function can quickly identify all similarly classified records in the database.
The disadvantage of the classification search strategy is that users may not always anticipate the proper classification of certain records, and may search the wrong categories for their desired database record. The user is tied to the logical structure of the data, and must learn to navigate the predefined structure of the database in order to locate particular data records.
Whether a search is conducted by keyword or classification strategy, the focus is on finding a particular item. In some cases the item is available from more than one supplier. In other cases, there may be more than one kind of item, available from more than one supplier, that will satisfy the user's needs. In any case, it would be desirable to further refine the search methodology so that the most advantageous supplier is quickly identified to the user. It would also be desirable to avoid the computer processing time that would otherwise be needlessly expended on searching through less desirable supplier catalogs when the item has already been found.
It would be further desirable to have a simple user interface, both for inputting search terms and reviewing results. On the input side, the software should allow free-form text searching, with no prerequisites for format or content. Thus, it would be desirable to have a system capable of identifying the database records most likely to be the desired choice of the user, even when the user inputs a search string having misspelled terms, word fragments, or other characteristics of the item being sought. On the output side, the software should only display the most advantageous sources for items, especially when the items are available from many different sources.
In many commercial situations, it would be advantageous to be able to configure the search behavior for a variety of factors. In addition to providing a simple user interface, it would be desirable to segment a database of searchable items into multiple tiers. The combination of search strategy and database segmentation would enable the identification of items from the most economical sources. It would also enable system managers to adjust the results based upon changing factors. Finally, such a system would efficiently use computing resources. These, and other technical and business aspects of catalog search engines, are the motivating factors for the invention that is described herein.
The present invention is a system and method for cascading search methodologies on preselected segments, or sets, of data. Each data set is paired with one or more search strategies so that the overall effect is to supply the user with the most advantageous match to a keyword search. Search strategies may include one or more of the following: exact search, stem search, soundex search, and fuzzy logic search. Data sets may be preselected based on source, shipping availability, or any other business reason for choosing one supplier or source over another.
During a search, a user inputs one or more search terms to identify a desired item from an electronic catalog. The search engine of the present invention employs the designated search methodology upon its corresponding data set. The hierarchical order of the data sets is established by a system manager based on the desirability of procuring an item from a particular supplier or source. Once the item has been found, the search engine terminates its search, thereby saving the computing resources from needless searches through the remaining data sets.
In one embodiment of the invention, the system is configured to search first within a catalog (or data set) of items that are designated as in-house, and then to fail-over to a second tier catalog (or data set) of vendor-supplied items available for short-term delivery. If both searches fail to yield an acceptable result, the search engine may fail-over to special order suppliers with longer lead times for delivery. One advantage to the user is that the first search result will often be the most desirable option available.
The text searching can be improved through the use of sequential search algorithms that are designed to maximize the chances of identifying the desired data records. For example, several different search algorithms can be employed upon the most desirable data set to increase the chances of finding an appropriate item within that data set. For less desirable sources, it may be advantageous to only search for exact matches of the search term.
According to the present invention, a method of selecting data records in a catalog database comprises the following steps: inputting search terms to a user interface; testing the search terms against a sequence of data sets using search algorithms designated for each data set; and terminating the sequence of search algorithms when at least one database record satisfies the search criteria. In some embodiments of the invention, the algorithm may be expanded by compiling a unique list of classifications from each identified record to aid the user in further refining the search terms.
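By way of illustration only, the method steps above can be sketched in Python. The sketch assumes that each data set is a list of dictionaries with a "description" field, and that the matchers `exact_match` and `prefix_match` stand in for the designated search algorithms; these names, the `cascade_search` function, and the sample catalogs are hypothetical and are not taken from the specification.

```python
from typing import Callable, Dict, List, Optional, Tuple

Record = Dict[str, str]
Matcher = Callable[[str, Record], bool]
Tier = Tuple[str, List[Record], List[Matcher]]


def exact_match(term: str, record: Record) -> bool:
    """True when the term equals a whole word of the description."""
    return term.lower() in record["description"].lower().split()


def prefix_match(term: str, record: Record) -> bool:
    """True when any word of the description starts with the term."""
    t = term.lower()
    return any(w.startswith(t) for w in record["description"].lower().split())


def cascade_search(term: str, tiers: List[Tier]) -> Tuple[Optional[str], List[Record]]:
    """Test the term against each data set in priority order.

    Each tier carries its own list of matchers. The search stops at the
    first matcher that yields at least one record, so lower-priority
    data sets are never scanned once a match has been found.
    """
    for name, records, matchers in tiers:
        for matcher in matchers:
            hits = [r for r in records if matcher(term, r)]
            if hits:
                return name, hits
    return None, []


# Hypothetical sample data: three tiers in descending order of desirability.
in_house = [{"part": "A1", "description": "black ballpoint pen"}]
vendor = [
    {"part": "V7", "description": "heavy duty stapler"},
    {"part": "V9", "description": "black ballpoint pen"},
]
special_order = [{"part": "S3", "description": "ergonomic stapler"}]

tiers: List[Tier] = [
    ("in-house", in_house, [exact_match, prefix_match]),  # several algorithms
    ("vendor", vendor, [exact_match]),                    # exact matches only
    ("special-order", special_order, [exact_match]),
]
```

In this sketch, `cascade_search("pen", tiers)` returns the in-house record and never examines the vendor catalog, even though the vendor catalog also lists a matching pen; `cascade_search("stapler", tiers)` fails over to the vendor tier and stops there.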
The invention comprises a database along with a search engine. The database may consist of an aggregate of supplier catalogs, in which each data record further consists of category descriptions, manufacturer's name, manufacturer part number, short text description, and parametrically composed descriptions. Each of the items within the data record may be organized by fields.
The available search algorithms according to the present invention may comprise proximity searching, string matching, stemming, fuzzy logic, and soundex matching. In certain embodiments, multiple search algorithms may be performed on a data set. For example, if an exact match is found, the search halts when all exact matches have been identified, and there is no further recourse to other search algorithms. If no exact match is found, then the search terms are manipulated to identify strings with similar roots. If, again, no match is found, the search terms are tested further according to other algorithms, such as fuzzy logic and soundex, until a match is found or the search engine reaches its logical termination.
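The sequence of algorithms applied to a single data set might be sketched as follows. This is a minimal illustration under stated assumptions, not the claimed implementation: the suffix-stripping `stem` function is deliberately naive, `soundex` is a simplified form of the classic four-character code, the fuzzy stage substitutes a similarity ratio from Python's standard `difflib` module with an arbitrary 0.75 threshold, and proximity searching is omitted. All function names and the sample descriptions are hypothetical.

```python
import difflib
from typing import List, Optional, Tuple

# Consonant groups used by the classic Soundex code.
_CODES = {}
for _letters, _digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                         ("l", "4"), ("mn", "5"), ("r", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(word: str) -> str:
    """Simplified four-character Soundex code (first letter plus three digits)."""
    word = "".join(ch for ch in word.lower() if ch.isalpha())
    if not word:
        return ""
    encoded = word[0].upper()
    prev = _CODES.get(word[0], "")
    for ch in word[1:]:
        code = _CODES.get(ch, "")
        if code and code != prev:
            encoded += code
        if ch not in "hw":  # h and w do not separate repeated codes
            prev = code
    return (encoded + "000")[:4]


def stem(word: str) -> str:
    """Naive stemmer: strip one common suffix, keeping at least three letters."""
    for suffix in ("ing", "ers", "er", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def search_data_set(term: str, descriptions: List[str]) -> Tuple[Optional[str], List[str]]:
    """Try exact, stem, soundex, then fuzzy matching; stop at the first stage with hits."""
    term = term.lower()
    stages = [
        ("exact", lambda w: w == term),
        ("stem", lambda w: stem(w) == stem(term)),
        ("soundex", lambda w: soundex(w) == soundex(term)),
        ("fuzzy", lambda w: difflib.SequenceMatcher(None, w, term).ratio() >= 0.75),
    ]
    for name, matches in stages:
        hits = [d for d in descriptions
                if any(matches(w) for w in d.lower().split())]
        if hits:
            return name, hits
    return None, []


# Hypothetical sample descriptions.
descriptions = ["heavy duty stapler", "box of staples", "black ballpoint pen"]
```

With these sample descriptions, the term "stapler" is resolved at the exact stage, "staplers" at the stem stage, the misspelling "staplar" at the soundex stage, and "ztapler" only at the fuzzy stage. Later stages tend to admit looser matches (the naive stemmer also returns "box of staples" for "staplers"), which is one reason an embodiment might reserve them for the most desirable data sets.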
One of the aspects of the search strategy is that the searchable terms include the predefined classification terms as well as other attributes and parameters of each catalog entry. This means that the freeform text input will show text string matches against any classification name or parametric name. This feature enhances the possibility of finding the desired data record based on the keyword search engine.
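One possible representation of a catalog entry whose classification names and parametric names are searchable alongside its other fields is sketched below. The `CatalogRecord` class, its field names, and the `searchable_terms` method are assumptions made for illustration; the specification does not prescribe a particular data layout.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class CatalogRecord:
    """Hypothetical catalog entry organized by fields."""
    manufacturer_name: str
    manufacturer_part_number: str
    short_description: str
    categories: List[str] = field(default_factory=list)
    parameters: Dict[str, str] = field(default_factory=dict)

    def searchable_terms(self) -> Set[str]:
        """Collect lowercase words from every field, including
        classification names and parametric names and values."""
        parts = [
            self.manufacturer_name,
            self.manufacturer_part_number,
            self.short_description,
            *self.categories,
        ]
        for name, value in self.parameters.items():
            parts.extend([name, value])
        return {word for part in parts for word in part.lower().split()}


# Hypothetical sample entry.
record = CatalogRecord(
    manufacturer_name="Acme",
    manufacturer_part_number="AC-100",
    short_description="black ballpoint pen",
    categories=["Writing Instruments"],
    parameters={"ink color": "black"},
)
terms = record.searchable_terms()
```

In this sketch a free-form query for "instruments" or "ink" would match the sample entry even though neither word appears in its short description, because the classification name and the parametric name contribute to the searchable terms.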
Each catalog entry may have one or more associated classifications according to type, and a list of unified classifications may be compiled dynamically from the identified matches. Dynamic compilation refers to the process of continuously updating the list of classifications whenever new matches are identified within a data set. This ensures that the list continuously and accurately reflects the range of classifications of the identified matches. The list is unified in the sense that each classification is listed only once, even when the identified matches have multiple records with the same classification. The classification list is presented to the user along with the list of matches as an aid to the user for further refining the search methodology.
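The dynamic compilation of a unified classification list might be sketched as follows. The `ClassificationList` class and the "classifications" key are hypothetical; the sketch assumes each matched record carries its classifications as a list of strings.

```python
from typing import Dict, List


class ClassificationList:
    """Accumulates classifications as matches are identified.

    Each classification is stored once, in order of first appearance,
    no matter how many matched records share it.
    """

    def __init__(self) -> None:
        self._seen: Dict[str, None] = {}  # dict preserves insertion order

    def add_match(self, record: dict) -> None:
        """Update the list with the classifications of a newly found match."""
        for classification in record.get("classifications", []):
            self._seen.setdefault(classification, None)

    def as_list(self) -> List[str]:
        return list(self._seen)


# Hypothetical matches arriving one at a time during a search.
unified = ClassificationList()
unified.add_match({"part": "A1", "classifications": ["Pens", "Office Supplies"]})
unified.add_match({"part": "A2", "classifications": ["Pens"]})
unified.add_match({"part": "A3", "classifications": ["Markers", "Office Supplies"]})
result = unified.as_list()
```

Here `result` holds "Pens", "Office Supplies", and "Markers", each exactly once, and could be displayed beside the list of matches so the user can narrow the search by classification.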
The invention has the unique aspect of allowing dynamic searching of subsets or blocks of databases with a combination of any of multiple search methodologies supported by the software. A system manager can specify which data sets or catalogs are searched first and in which sequence they are searched. Each block or data set of a particular catalog can be searched with a different strategy. A particular combination of search methodologies can be assigned to a user of the invention by name and password. Data suppliers also have the ability to request special priority for the searching of their data type.
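A configuration of this kind could be expressed as a simple table that a system manager edits. The following sketch is one assumed layout only: the profile names, data set names, user name, and the `search_plan_for` function are all hypothetical, and authentication by name and password is assumed to occur elsewhere before the lookup.

```python
from typing import Dict, List, Tuple

# Ordered search plans: each entry pairs a data set with the
# search methodologies to apply to it, in sequence.
SearchPlan = List[Tuple[str, List[str]]]

SEARCH_PROFILES: Dict[str, SearchPlan] = {
    "default": [
        ("in_house", ["exact", "stem", "soundex", "fuzzy"]),
        ("vendor", ["exact", "stem"]),
        ("special_order", ["exact"]),
    ],
    "purchasing_agent": [
        ("vendor", ["exact", "stem", "soundex", "fuzzy"]),
        ("special_order", ["exact", "stem"]),
    ],
}

# Assignment of profiles to authenticated users (hypothetical user name).
USER_PROFILES: Dict[str, str] = {"jsmith": "purchasing_agent"}


def search_plan_for(user: str) -> SearchPlan:
    """Return the ordered data sets and strategies assigned to a user,
    falling back to the default profile for unlisted users."""
    return SEARCH_PROFILES[USER_PROFILES.get(user, "default")]
```

Keeping the sequence of data sets and their strategies in data rather than in code would let a system manager reorder catalogs or change a user's search behavior without modifying the search engine itself.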
Those skilled in the art will recognize the benefits and objects of this invention, which include but are not limited to the following: providing a database search engine that can quickly and easily lead users to a desired database record; combining the benefits of keyword searching with the benefits of classification searching; providing an interface that will process any type of user entry, including misspelled words and word fragments; increasing the efficiency of the search process by first searching in the most desirable data sets; and providing a search engine and database structure that maximizes the likelihood of finding the desired database records based on a simple user interface.