For example, to monitor competitors it is vital to observe the web sites that provide information on their business. Monitoring these web sites allows marketing analysts or strategic development officers to identify new products that competitors have released, recent events, and other facts that may be of value for adapting their own company's marketing strategy. It may be of interest, for instance, to know about road shows or large marketing campaigns of other market participants.
Conventional manual surveillance of web sites is generally not feasible because of the abundance of information stored in hundreds or thousands of content-bearing web pages. Manual handling of such large information resources is extremely time-consuming and error-prone when one tries to track changes over time. Conventional automatic analysis of web pages is also prone to errors because it assumes that a URL (Uniform Resource Locator), as a reference to an information-providing web page, is stable over time. Often, however, web pages are generated from databases, which leads to changing site-internal URLs that are therefore not stable. Session management protocols may also affect the site-internal URL structure of the web pages.
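The URL instability described above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the session identifier is carried in a query parameter; the parameter names in SESSION_KEYS, the helper strip_session_params, and the example.com URLs are hypothetical and are not taken from any particular web site or prior system.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Hypothetical query-parameter names that carry a session identifier.
SESSION_KEYS = {"sid", "sessionid", "phpsessid", "jsessionid"}

def strip_session_params(url):
    """Return the URL without session-related query parameters and fragment."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key.lower() not in SESSION_KEYS
    ]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(kept), ""))

# The same page as seen in two different sessions.
first_visit = "http://example.com/products?id=7&sid=a1b2c3"
second_visit = "http://example.com/products?id=7&sid=z9y8x7"

# A naive comparison of the raw URLs treats these as two different pages.
same_raw_url = first_visit == second_visit
same_stripped_url = strip_session_params(first_visit) == strip_session_params(second_visit)
```

Such stripping only helps when the session parameter names are known in advance. It does not help when database-generated URLs change in the path itself, which is why the URL alone is an unreliable page identifier.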
Conventional change monitoring systems provide graphical interfaces for highlighting changes that have been made on a web page with respect to a preceding inspection. However, the highlighting often relies on a user-specified labeling of relevant areas on a web page or in the underlying HTML code.
Therefore, it is desirable to obtain a method for tracking changes in the content of a web site that is easy to implement and robust against changes in the identifiers of the included web pages, such as the URLs, and against content variations due to embedded ads and dynamic content allocation.
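One way to picture the kind of robustness sought here is to identify a page by a fingerprint of its visible text rather than by its URL. The following Python sketch is an assumption made for illustration only and is not the method of this document; the names content_fingerprint and diff_snapshots and the example snapshots are hypothetical. It does not address embedded ads or other dynamic content, which would change the fingerprint.

```python
import hashlib
from html.parser import HTMLParser

class _TextExtractor(HTMLParser):
    """Collects text nodes, skipping the contents of script and style elements."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.chunks.append(data)

def content_fingerprint(html):
    """Hash the whitespace-normalized visible text of an HTML page."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    text = " ".join(" ".join(parser.chunks).split())
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

def diff_snapshots(old, new):
    """Compare two {url: html} snapshots by content fingerprint, not by URL.

    Returns (added_urls, removed_urls): URLs whose content is new in `new`,
    and URLs whose content no longer appears in `new`.
    """
    old_fp = {content_fingerprint(html): url for url, html in old.items()}
    new_fp = {content_fingerprint(html): url for url, html in new.items()}
    added = sorted(new_fp[fp] for fp in new_fp.keys() - old_fp.keys())
    removed = sorted(old_fp[fp] for fp in old_fp.keys() - new_fp.keys())
    return added, removed

# Hypothetical snapshots: the product page keeps its content but changes URL.
old_snapshot = {
    "http://example.com/p?sid=1": "<html><body><h1>Product A</h1><script>x=1</script></body></html>",
}
new_snapshot = {
    "http://example.com/p?sid=2": "<html><body><h1>Product  A</h1></body></html>",
    "http://example.com/q?sid=2": "<p>Road show in Berlin</p>",
}
added, removed = diff_snapshots(old_snapshot, new_snapshot)
```

In this sketch the product page is treated as unchanged even though its URL differs between snapshots, and only the page announcing the road show is reported as added. An exact hash is deliberately simple: any difference in the visible text, including a rotated advertisement, yields a different fingerprint.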