The following description includes information that may be useful in understanding the present invention. It is not an admission that any of the information provided herein is prior art or relevant to the presently claimed invention, or that any publication specifically or implicitly referenced is prior art.
Search engines, such as Google, Bing, and others, search and index vast quantities of information on the Internet. “Crawlers” (a.k.a. “spiders”) follow URLs obtained from a “queue” to obtain content, usually from web pages. The crawlers or other software store and index some of the content. Users can then search the indexed content, view results, and follow hyperlinks back to the original source or to the stored content (the stored content often being referred to as a “cache”). Computing resources to crawl and index, however, are not limitless. The URL queues are therefore commonly prioritized to direct crawler resources to web page servers which can accommodate the traffic, which do not block crawlers (such as according to “robots.txt” files commonly available from web page servers), which experience greater traffic from users, and which experience more change in content.
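By way of illustration only, the queue prioritization described above can be sketched as a priority queue keyed on a score. The sketch below is not taken from any particular search engine: the names `score` and `UrlQueue`, the weights, and the example URLs are all hypothetical, and the four inputs simply mirror the four factors named above (server capacity, crawler permission, user traffic, and rate of content change).

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class QueueEntry:
    # heapq is a min-heap, so the negated score is stored as the sort key.
    sort_key: float
    url: str = field(compare=False)


def score(can_handle_traffic, allows_crawlers, user_traffic, change_rate):
    """Hypothetical priority score in [0, 1]; higher means crawl sooner.

    user_traffic and change_rate are assumed to be normalized to [0, 1].
    The equal weights are illustrative assumptions, not measured values.
    """
    if not allows_crawlers:  # e.g. disallowed by robots.txt
        return 0.0
    capacity = 1.0 if can_handle_traffic else 0.25
    return capacity * (0.5 * user_traffic + 0.5 * change_rate)


class UrlQueue:
    """Minimal URL queue that returns the highest-scoring URL first."""

    def __init__(self):
        self._heap = []

    def push(self, url, priority):
        heapq.heappush(self._heap, QueueEntry(-priority, url))

    def pop(self):
        return heapq.heappop(self._heap).url

    def __len__(self):
        return len(self._heap)


if __name__ == "__main__":
    queue = UrlQueue()
    queue.push("https://example.com/quiet", score(True, True, 0.1, 0.1))
    queue.push("https://example.com/busy", score(True, True, 0.9, 0.8))
    while queue:
        print(queue.pop())
```

A real queue manager would likely weigh many more signals and would typically skip blocked URLs rather than enqueue them at a score of zero; the sketch shows only how the listed factors could feed a single ordering.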
Conventional search engines, however, are not focused on price and product information. If a price changes on a webpage but the rest of the webpage remains the same, traditional crawlers (or the queue manager) will not prioritize the webpage's position in the queue, generally because the price is a tiny fraction of the overall content and the change is not labeled as being significant. Conversely, if the webpage changes but the price and/or product information remains the same, the overall change in content may cause a traditional crawler to prioritize the webpage's position in the queue, notwithstanding that the price and product information remained the same.
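The mismatch described above can be illustrated with a minimal sketch that compares a whole-page change measure against a price-only comparison. The sketch is an assumption-laden illustration, not a description of how any actual crawler measures change: `page_change_ratio`, `price_changed`, the dollar-amount pattern `PRICE_RE`, and the sample page text are all hypothetical.

```python
import difflib
import re

# Hypothetical pattern: matches simple US-dollar amounts such as $19.99.
PRICE_RE = re.compile(r"\$\d+(?:\.\d{2})?")


def page_change_ratio(old, new):
    """Fraction of the page that differs, from 0.0 (identical) upward.

    Uses difflib's similarity ratio as a stand-in for whatever
    whole-page change measure a conventional crawler might apply.
    """
    matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
    return 1.0 - matcher.ratio()


def price_changed(old, new):
    """True if the sequence of prices found on the page differs."""
    return PRICE_RE.findall(old) != PRICE_RE.findall(new)


if __name__ == "__main__":
    body = "Acme Widget. " + "Lorem ipsum dolor sit amet. " * 20
    new_body = "Acme Widget. " + "Redesigned page with new marketing copy. " * 12

    old_page = body + "Price: $19.99"
    price_only = body + "Price: $24.99"    # only the price changed
    layout_only = new_body + "Price: $19.99"  # everything but the price changed

    for label, page in (("price only", price_only), ("layout only", layout_only)):
        print(label,
              round(page_change_ratio(old_page, page), 4),
              price_changed(old_page, page))
```

Under these assumptions, the price-only edit produces a whole-page change ratio close to zero even though `price_changed` reports a difference, while the layout-only edit produces a much larger ratio with no price difference, which is the behavior the preceding paragraph attributes to content-based prioritization.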