1. Field of the Invention
The present invention relates to event mining and, more particularly, to detecting new events in a social stream.
2. Description of Related Art
The problem of text mining has been widely studied in the information retrieval community because text data is ubiquitous in a wide variety of settings, such as the web, social networks, and news feeds. Much of this text data arises in temporal applications, such as news feeds and social network streams, in which the text arrives as a continuous and massive stream of documents. Streaming applications present a special challenge for text mining because the data must often be processed in a single pass, and not all of the data can be stored on disk for re-processing.
An important problem in the context of temporal and streaming text data is that of online event detection, which is closely related to the problems of topic detection and tracking and of stream partitioning. Online event detection attempts to determine new topical trends in the text stream and to track their significant evolution. The idea is that important and newsworthy events in real life (such as the recent unrest in the Middle East) are often captured in the form of temporal bursts of closely related documents in a social stream. The problem can be posed in both unsupervised and supervised scenarios. In the unsupervised case, it is assumed that no training data is available to direct the event detection process. In the supervised case, prior data about events is available to guide the event detection process.