This disclosure relates to identifying transient data in web pages.
The World Wide Web includes an enormous volume of information. Search engines can facilitate access to this content by enabling users to search for various topics. Search engines can operate to receive search queries from users and to provide search results associated with those queries to the users. To do this, the search engine can use an index to identify web pages that are relevant to the terms included in the search query. The index can be built by examining known web pages and deriving keywords to be associated with those web pages. Many web pages include transient content (e.g., date, time, weather, etc.) that is not useful in determining the relevance of a web page to a search query. Transient content can also lead to improperly targeted advertisements, because the advertisements are matched to the transient content rather than to the non-transient content. However, it can be difficult to identify transient content on a large scale without extensive computation.
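By way of illustration only, the following minimal sketch shows how a simple keyword index could match a page on its transient content. It is not the method of this disclosure. The page URLs, the page text, and the function names `build_index` and `search` are hypothetical and are assumed solely for this example.

```python
from collections import defaultdict

PUNCTUATION = ".,:;!?"

def build_index(pages):
    """Map each lowercase word to the set of page URLs that contain it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word.strip(PUNCTUATION)].add(url)
    return index

def search(index, query):
    """Return the set of pages that contain every term in the query."""
    terms = [term.strip(PUNCTUATION) for term in query.lower().split()]
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results

# Hypothetical pages: the recipe page carries a transient date/weather banner.
pages = {
    "example.com/recipes": "Slow cooker chili recipe. Today: Monday, rain expected.",
    "example.com/forecast": "Weekly weather forecast: rain on Monday, sun on Tuesday.",
}

index = build_index(pages)
rain_hits = search(index, "rain monday")
chili_hits = search(index, "chili")
```

In this sketch, the query "rain monday" returns the recipe page as well as the forecast page, because the recipe page's transient banner was indexed along with its non-transient content.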