In a search process, and particularly in a search process relating to products, a product word entered by a user can appear in many different product word combinations. As a result, a search engine may return peripheral products that are not highly related to the product the user is looking for. For example, the number of product word combinations containing MP3 is very large: products such as MP3 download cables and MP3 speakers may be found, but these are different products from MP3s. Because traditional searching relies on keyword matching against product words, it can very easily return peripheral products that are only loosely related to the product corresponding to the query word string entered by the user. In the example above, if the user enters MP3 as a query word string, MP3 download cables and MP3 speakers receive very high weightings in the search results. In other words, a large number of product information entries having a low correlation to the product corresponding to the query word string appear near the top of the sorted product information entries returned by the search engine.
Two conventional technical methods exist to resolve interference by peripheral products having a low correlation to the product relating to the query word string entered by the user, as described above:
In a first technical method, categories are used to keep a large number of peripheral results out of the search results. The first technical method typically includes the following: first, based on log information, the click-through rates of the categories corresponding to the user's query word string are tabulated; then, the category tendencies of the query word string are determined from those click-through rates. Finally, the weightings of product information entries in the returned search results that do not belong to the relevant categories are lowered.
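The first technical method can be illustrated with a minimal sketch. The log format, the function names, the `min_ctr` threshold, and the `penalty` factor are all assumptions made for illustration; the conventional method does not specify them.

```python
from collections import defaultdict

def category_tendencies(click_log, query, min_ctr=0.05):
    """Tabulate per-category click-through rates for `query` from log rows
    of the form (query, category, clicked) and keep the categories whose
    rate reaches `min_ctr` (a hypothetical threshold)."""
    impressions = defaultdict(int)
    clicks = defaultdict(int)
    for logged_query, category, clicked in click_log:
        if logged_query != query:
            continue
        impressions[category] += 1
        if clicked:
            clicks[category] += 1
    tendencies = {}
    for category, shown in impressions.items():
        ctr = clicks[category] / shown
        if ctr >= min_ctr:
            tendencies[category] = ctr
    return tendencies

def demote_off_category(entries, relevant_categories, penalty=0.5):
    """Lower the weighting of entries outside the relevant categories,
    then sort by the adjusted weighting, highest first."""
    adjusted = []
    for entry in entries:
        weight = entry["weight"]
        if entry["category"] not in relevant_categories:
            weight *= penalty  # hypothetical demotion factor
        adjusted.append({**entry, "weight": weight})
    return sorted(adjusted, key=lambda e: e["weight"], reverse=True)

# Made-up log rows and product information entries, for illustration only.
click_log = [
    ("mp3", "players", True),
    ("mp3", "players", True),
    ("mp3", "players", False),
    ("mp3", "cables", False),
    ("mp3", "cables", False),
    ("mp3", "speakers", False),
]
entries = [
    {"title": "MP3 download cable", "category": "cables", "weight": 0.9},
    {"title": "MP3 player 8GB", "category": "players", "weight": 0.8},
]

relevant = category_tendencies(click_log, "mp3")
ranked = demote_off_category(entries, relevant)
```

In this made-up data only the players category passes the threshold, so the download cable is demoted below the player even though its original weighting was higher.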
This method has substantial accuracy problems. For example, if sellers who publish product information fraudulently place mobile telephone batteries in the mobile telephone category, peripheral products (mobile telephone batteries) will appear when mobile telephones are searched. Additionally, when a query word string is related to a plurality of categories and the click-through rate of one of those categories is very low, that category can easily be overlooked when the category tendencies of the query word string are analyzed. Accordingly, it is very difficult for the search engine to recall all categories related to the query word string, resulting in low search accuracy.
A second technical method relies on online manual review of search results. The manual review is used to determine a peripheral word set corresponding to each product word. In other words, if a product information entry in the search results contains a peripheral word for the queried product word, the method determines that the entry should not appear in the search results.
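Once the peripheral word sets have been compiled by reviewers, applying them can be sketched as follows. The dictionary contents, the function name, and the whitespace tokenization are illustrative assumptions, not part of the described method.

```python
# Hypothetical, manually curated peripheral word sets keyed by product word.
PERIPHERAL_WORDS = {
    "mp3": {"cable", "speaker", "case"},
}

def filter_peripherals(query, titles, peripheral_words=PERIPHERAL_WORDS):
    """Drop titles that contain any peripheral word listed for `query`.
    Matching is on exact lowercase tokens, so plural forms such as
    "speakers" would need to be listed separately."""
    blocked = peripheral_words.get(query.lower(), set())
    kept = []
    for title in titles:
        tokens = set(title.lower().split())
        if tokens & blocked:
            continue  # the entry contains a peripheral word: exclude it
        kept.append(title)
    return kept

# Made-up titles, for illustration only.
titles = ["MP3 player 8GB", "MP3 download cable", "MP3 speaker dock"]
kept = filter_peripherals("mp3", titles)
```

The filter itself is trivial; the cost of the approach lies in building and maintaining a word set for every product word by hand.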
Although the accuracy of the manual review method is very high, the review requires a large number of man-hours, resulting in high labor costs.