1. Field
The field generally relates to the Internet.
2. Background Art
The emergence and development of computer networks and protocols, including the Internet and the World Wide Web (or simply “web” or “Web”), have allowed many users to view and enjoy content served from remote locations over the web. When content, such as news content or media content, is distributed across the Internet or the Web, the content is published and republished by multiple content sources. When the content contains popular, “hot,” or interesting subject matter, the content is likely to be republished by multiple content sources. The content is also likely to be accessed by a greater number of people and for a longer period of time.
However, because numerous venues or content sources publish and republish content, conventional content providers cannot easily track their content across the Internet. Content providers and publishers can identify some content sources that have republished the content by using a brute-force approach that compares the original text of the content with text published at different content sources, but they cannot easily or meaningfully track and analyze the content as it is republished by multiple content sources. Content providers further lack insight into the flow characteristics of content as it spreads across the Internet, and they cannot gauge the popularity of content across the Internet or the rate and timing at which other content sources publish it.