With the rapid spread of the Internet, which has grown exponentially over the last two decades, nearly every part of human life and the activities surrounding it is now conducted online. Business is no exception. Previously, when buying and selling items, people browsed through huge paperback catalogs containing thousands of records before making a decision. To search for a product of interest, a person first had to determine, from the index or contents page, the probable topics or categories in which a product matching that description might occur. He then had to browse through each entry on that page to find the product he needed, and had to repeat the procedure for new topics if the results were unsatisfactory.
To make catalog searching easier for the user, more and more companies are turning to electronic catalogs. The user can search through the catalogs quickly and place an order for the product immediately. This saves a lot of time and money.
However, the claim that electronic catalogs can be searched efficiently and quickly is not entirely correct. There may be thousands of categories in the whole category hierarchy, each holding catalogs of varying type, quality and manufacturer, amounting to millions of catalogs or data items in total. Out of all these, the user is interested in only a few specific records. Generally, the only interface provided to the user is a ‘keyword search’. In this type of search, the user types in certain keywords about the catalogs or categories, or keywords that describe the product. Based on these few keywords, the system is expected to return the catalogs, categories or data items most relevant to the user's interest. It is very difficult for such a system to retrieve only those items that are specific to the user's interest.
Often the user is not quite sure what his or her needs are. In that case the user first types in certain keywords, gets an initial idea of the catalogs, and then wishes to type in keywords more specific to the catalog needed. At that point the user wishes to search only within the results of the first search. Many existing search engines support this feature under the name ‘Search within Results’.
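The ‘Search within Results’ behaviour can be illustrated with a minimal Python sketch. The catalog entries, the `keyword_search` helper and the rule that every query term must appear in a record are assumptions made purely for illustration; they are not taken from any particular search engine.

```python
def keyword_search(records, query):
    """Return the records that contain every query term as a whole word."""
    terms = query.lower().split()
    matches = []
    for record in records:
        words = set(record.lower().split())
        if all(term in words for term in terms):
            matches.append(record)
    return matches

# Hypothetical catalog entries, used only for illustration.
catalog = [
    "blue ballpoint pen fine tip",
    "red ballpoint pen medium tip",
    "blue notebook ruled 200 pages",
]

# Broad initial query over the whole catalog.
first_pass = keyword_search(catalog, "pen")

# Refinement: the second query is run only over the first result set.
second_pass = keyword_search(first_pass, "blue")
```

Because the second call receives `first_pass` rather than `catalog`, the blue notebook cannot reappear in the refined results even though it matches the word "blue".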
Most present-day keyword search engines follow a very simple methodology: they search through the contents of the available records and retrieve those products whose description terms match the query terms. Consider the example where the user is looking for a pen and the product description describes only the color and quality of the pen but nowhere states that the product is a pen. In such cases most present-day keyword search engines fail to deliver the correct output to the user. In such a scenario the user would ideally like a system that conceptualizes the query terms, extracts their context, and matches this extracted context against the context of the products already available in the database.
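The pen example can be reproduced with a small Python sketch of purely literal term matching. The `naive_match` function and the product description are hypothetical and serve only to show why a description that never mentions the product name is missed.

```python
def naive_match(description, query):
    """Return True only if every query term appears as a word in the description."""
    words = set(description.lower().split())
    return all(term in words for term in query.lower().split())

# Hypothetical description of a pen that never uses the word "pen".
description = "smooth blue ink with fine tip and premium metal body"

# The literal matcher misses the product when the user asks for a pen...
found_by_name = naive_match(description, "pen")

# ...but finds it when the query happens to repeat a word from the description.
found_by_color = naive_match(description, "blue")
```

Here `found_by_name` is `False` and `found_by_color` is `True`: the matcher has no notion that the description is about a pen, which is the gap a context-based approach is meant to close.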
Also, a retrieval or search system is used by both advanced and novice users, who may or may not have a technical background in the search field. The people designing such a system are usually technically proficient, but they pay little attention to how comfortable it is for the end user. An intuitive interface is not always provided.
One major problem affecting most keyword search systems is that the keywords match more than one relevant category, so the system must return results for each of these categories. When several categories are shown, each relevant to the entered keywords to some extent, the user may be confused by the number of results, and the system needs to indicate the importance of each category with respect to the query keywords. In other words, the system needs to rank the categories according to some measure of relevance, but none of the presently available retrieval engines does so very accurately.
The user enters keywords to reach a certain catalog product. Usually the keywords contain the name or description of the catalog item and/or some properties of the desired product, such as its dimensions, color or some other attribute value. The attribute values of one product can match those of another; for example, different products can have the same color and dimensions. The user gets a response based on the keywords entered. All present-day search engines show the impact of each term only at the document level or the category level. None of them provides an impact value for the keywords on the overall search procedure, something that would help the end user gain an overall understanding of the terms in the corpus.
There are many varieties of the same product available, each described by a different catalog or data item; hence the same catalog item may appear with many different dimensions and colors. In such a case the user has to browse through many variants of the same item before he or she can go on to the next item. All present-day search and retrieval systems present to the end user every catalog or data item that has been retrieved, without paying any heed to the user's viewpoint. As a result, the user has to spend a lot of time browsing through useless items before reaching the catalog or item actually needed. The user has to repeatedly browse through similar products by pressing the ‘NEXT’ button on the browser.
A major problem faced in present-day search engines is that extracted attributes cannot be readily used by the system.
In present search engines, the user enters certain terms in the query. All the query terms are given equal weightage by the system when retrieving the relevant categories or the catalogs within them.
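The effect of equal weightage can be seen in a short Python sketch. The scoring rule (one point per matched query term) and the two product descriptions are assumptions chosen for illustration, not a description of any specific engine.

```python
def equal_weight_score(description, query):
    """Count the query terms found in the description; every term counts as 1."""
    words = set(description.lower().split())
    return sum(1 for term in query.lower().split() if term in words)

query = "red pen"

# Matches only the color term of the query.
score_notebook = equal_weight_score("red notebook", query)

# Matches only the product-name term of the query.
score_pen = equal_weight_score("green pen", query)
```

Both descriptions receive the same score of 1, so a user who cares far more about getting a pen than about its color cannot express that preference.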
Generally, the presently available systems are not smart enough to look for different tenses and other forms of the keywords entered.
While searching electronic catalogs using ‘Keyword Search’, by navigating among the categories, or by performing a parametric search where product classes have parameters, the user faces two limitations: the relevant data is usually not present among the initial matches found, and a variety of products with the same keyword or product description are found. Also, there is no graphical interface that lets the user easily understand the impact of each word in a multiple-keyword search.
Search engines are meant to be used by users with varying degrees of skill. The problem is that they are not designed to serve both an advanced, technical user and a novice. Also, the keywords and the attributes of the products are given equal weightage by the system. This is a limitation in that it prevents the user from looking for a product closer to his needs.
A significant amount of work has taken place in the last few years in the area of providing user-friendly search engines for electronic catalogs in various forms, and this is reflected in the existing web sites as well as the patents that exist in this field.
U.S. Pat. No. 6,012,053 provides a computer system for performing searches on a collection of information that includes a mechanism through which results from a search query are ranked according to user-specified relevance factors, allowing the user to control how the search results are presented. The relevance factors are applied to the results achieved for each query. That is, each item returned by the search has a set of attributes. Each of these attributes is assigned a weight according to the specified relevance factors. These weights are combined to provide a score for the item. Search results are provided to the user, ordered according to their scores. However, the invention has the limitation that the output is not presented in a very user-friendly manner. The user has to browse through the whole textual list in order to see the results of his search.
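The general idea of ranking by user-specified relevance factors, as summarized above, can be sketched in Python. This is not the patented implementation: the weighted-sum combination, the attribute names, the per-attribute match values and the helper names `score_item` and `rank_results` are all assumptions introduced for illustration.

```python
def score_item(attribute_matches, relevance_factors):
    """Combine per-attribute match values into one score (weighted sum, an assumption)."""
    return sum(
        relevance_factors.get(name, 0.0) * value
        for name, value in attribute_matches.items()
    )

def rank_results(results, relevance_factors):
    """Order search results by descending score under the given relevance factors."""
    return sorted(
        results,
        key=lambda item: score_item(item["matches"], relevance_factors),
        reverse=True,
    )

# Hypothetical search results: how well each attribute of an item matched the query.
results = [
    {"id": "A", "matches": {"title": 1.0, "color": 0.0}},
    {"id": "B", "matches": {"title": 0.0, "color": 1.0}},
]

# The same results ordered under two different user-specified weightings.
title_first = [r["id"] for r in rank_results(results, {"title": 0.8, "color": 0.2})]
color_first = [r["id"] for r in rank_results(results, {"title": 0.2, "color": 0.8})]
```

Changing only the relevance factors reverses the order of the two items, which is the kind of user control over presentation that the summary describes; the output remains a plain ordered list.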
In U.S. Pat. No. 6,275,229, a method and apparatus for efficiently analyzing information on a computer is provided. The information contains information items, each of which has a plurality of attributes. The information is re-organized based on the attributes and displayed in graphical form on a computer display screen. By viewing the information in graphical form, a user can quickly analyze it to determine trends or qualities and can quickly identify the information items most relevant to specific criteria. Here, however, the user cannot assign weightage to specific attributes in order to refine the search.
In U.S. Pat. No. 6,326,962, an enhanced graphical user interface that uses Venn diagrams to take the input query from the user is defined. However, it does not define any user-friendly manner of displaying the output.
U.S. Pat. No. 6,324,534 describes an electronic catalog search engine utilizing many search strategies. It also groups the products and allows the user to refine the search based on product attributes.
The disadvantage of the systems described above is that they are not very accurate, and even when they find the result, it is not displayed in a format that the user can easily understand.
Also, the user is not given much functionality or flexibility in defining his search query and performing subsequent searches based on the results of his primary query.
The present systems are restrictive in the sense that they are centered on a ‘keyword-based’ search strategy rather than a ‘context-based’ search strategy.