This specification relates to identifying and presenting news events.
As the Internet has increased in popularity, the number of news articles available on the Internet has also increased. The large number of articles on a given topic, e.g., news stories about a particular news event, can make it difficult for a user to quickly gain an understanding of a history of the topic. Users must generally read through many articles, which often provide redundant information, before forming an understanding of a topic.
One way to help users sift through the large amount of information available to them is to group articles on a topic using keyword-based clustering, in which articles with similar terms are clustered together. However, articles on the same topic generally share many of the same keywords, and later articles on a new aspect of a topic will often recap events that happened earlier in the history of the topic. Thus, keyword-based clustering is not always an accurate, or useful, way to group articles that are related to the same general topic.
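By way of illustration only, the limitation of keyword-based clustering described above can be sketched as follows. The sketch is not taken from this specification: the stopword list, the Jaccard similarity measure, the greedy single-pass grouping, the 0.3 threshold, the function names, and the sample headlines are all assumptions chosen for the example. It shows a later article that recaps an earlier event being placed in the same cluster as the original report, even though it concerns a new aspect of the topic.

```python
import re

# Hypothetical stopword list; a real system would likely use a larger one.
STOPWORDS = {"a", "an", "the", "of", "in", "on", "and", "to", "after", "for"}


def keywords(text):
    """Return the set of lowercase, non-stopword terms in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    return {w for w in words if w not in STOPWORDS}


def jaccard(a, b):
    """Jaccard similarity of two keyword sets (0.0 if both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_by_keywords(articles, threshold=0.3):
    """Greedy single-pass clustering (an assumed, simplified scheme).

    Each article joins the first cluster whose seed (first) article is at
    least `threshold` similar by keyword overlap; otherwise it starts a
    new cluster. Clusters are lists of article indices.
    """
    clusters = []
    for i, text in enumerate(articles):
        kw = keywords(text)
        for cluster in clusters:
            seed_kw = keywords(articles[cluster[0]])
            if jaccard(kw, seed_kw) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


# Made-up headlines: the second covers a new aspect (rebuilding) but
# recaps the original event, so it shares most of the first one's keywords.
articles = [
    "Storm floods coastal city, thousands evacuated",
    "Coastal city rebuilds months after storm floods, thousands evacuated return",
    "Central bank raises interest rates",
]

print(cluster_by_keywords(articles))
```

In this sketch the first two headlines share six of nine distinct keywords, so they fall into one cluster although they report different events in the history of the topic, while the unrelated third headline is placed in a cluster of its own.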