Online content providers are increasingly building World Wide Web sites that rely on dynamic, frequently updated content. More and more content is made available through online auction sites, stock market information sites, news and weather sites, and other sites whose information changes frequently, often daily.
Major search engines, however, which enable Internet users to search for information on the World Wide Web, build search databases on the assumption that pages are static rather than dynamic. To create these databases, a search engine “crawls” Web sites: it retrieves the content of a given Web page and stores it for later use. However, if the Web page is removed or changed even once in the several-week period between successive crawls, the search engine will display outdated or irrelevant information to the search engine user.
At a very high level, search engines download pages from Web sites and then build databases that index the content of these pages. These databases are then consulted when processing search requests to find the pages relevant to the query. There is a substantial delay between when a page is downloaded and when it is included in a database for processing queries. This delay is usually on the order of several days to several weeks. Many of the Web pages of an online auction site, e.g., eBay, consist of a set of listings (frequently auctions) combined with navigational links and other information. Some pages of online auction sites display listings that meet some search criteria (search pages); others show items that are in a given category (category pages) or that are available for purchase from one auction site user's store (store pages). By default, the listings that end soonest appear at the top of the list.
Many online auction site listings are auctions that last for three to ten days. In addition, some listings include an option that ends the listing immediately. It is not uncommon for a listings page to include only items that end within the next 24 hours. Because the delay between downloading a page and including it in a search database is often longer than the remaining life of these listings, online auction site pages included in a search engine usually contain titles from auctions that have already expired.
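The timing argument above can be illustrated with a short calculation. The following is only a minimal sketch under stated assumptions: the function name `expired_fraction`, the example dates, and the three-day indexing delay are hypothetical and are not taken from any search engine or auction site. It computes the share of listings captured at crawl time that have already ended when the crawled page becomes searchable.

```python
from datetime import datetime, timedelta


def expired_fraction(listing_end_times, crawl_time, index_delay):
    """Return the fraction of listings that have ended by the time
    the crawled page becomes available for processing queries.

    All names and parameters here are illustrative assumptions.
    """
    if not listing_end_times:
        return 0.0
    # The page only starts matching queries after the indexing delay.
    searchable_at = crawl_time + index_delay
    expired = sum(1 for end in listing_end_times if end <= searchable_at)
    return expired / len(listing_end_times)


# Hypothetical listings page sorted by ending time: every listing
# ends within 24 hours of the crawl.
crawl_time = datetime(2024, 1, 1, 12, 0)
end_times = [crawl_time + timedelta(hours=h) for h in (1, 6, 12, 23)]

# With an assumed three-day delay, every captured listing has expired.
fraction = expired_fraction(end_times, crawl_time, timedelta(days=3))
print(fraction)  # 1.0
```

Under these assumptions, any indexing delay longer than the longest remaining listing time yields a fraction of 1.0, meaning that every title captured from the page refers to a listing that is no longer available.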
Because indexed online auction site pages become out of date very quickly, search engines cannot accurately match queries against such pages. The text matched against a user's query is likely to be from auctions that are no longer available. Therefore, the page is not relevant to the user's query unless new similar listings are available when the user clicks on the search result.
Search engines have addressed these issues in several ways. One option has been to exclude pages containing dynamic content from the search results. Another has been to treat pages containing dynamic content like any other Web page, which results in the issues described above. Finally, in the case of auction sites, another option has been not to include auction titles in the text indexed in the search database.
Matching outdated listings pages, e.g., those of eBay, to search queries can erode search user trust. The relevance problem that occurs when a query matches the title of an expired listing contributes to this. Returning irrelevant Web pages in search results will cause search users to trust the results less and possibly switch to other search providers.
The problem is worse when a search engine displays titles of expired listings in a search result abstract. If such a listing title catches a searcher's interest, it is likely that he or she will click through (i.e., click on the search result to go to a content page referenced by the search result) and not find the listing that the search result abstract claimed was available. Thus, there is a need for a system and method that overcome one or more of the aforementioned drawbacks.