1. Field of the Invention
The present invention relates generally to data collection, organization and analysis, and more particularly to the collection, categorization and analysis of electronic discussion messages.
2. Background of the Invention
Electronic discussion forums have been used in the art to facilitate communications between two or more people. Such electronic discussion forums typically allow for exchange of information, ideas and opinions over an extended period of time. For example, a discussion about a particular topic may be initiated by an individual posting a message on day one, and subsequent discussion participants may receive, view or respond to the message at a later date. Such discussion forums allow even participants new to the forum to review past discussion messages and therefore to fully participate in the forum. Well-known examples of such electronic forums include Web-based and proprietary message boards (both public and private), USENET newsgroups, and electronic mailing lists. These electronic discussion forums support both synchronous and asynchronous discussions, i.e., one or more participants may inject communications into the discussion at the same time, or nearly the same time, without disrupting the flow of communications. This allows each individual electronic discussion forum to be rich with communications spanning a wide variety of topics and subjects.
Other electronic discussion forums, such as interactive chat sessions, facilitate more traditional synchronous-like communications. In these discussion forums, participants are typically online at the same time and are actively responding to messages posted by others. These discussion forums are similar to a traditional telephone discussion in that the information is exchanged in real time. However, a significant difference is that the electronic discussion forums are, by their nature, written or recorded message transmissions which may be saved for historical records or for analysis at a future date.
The widespread growth of the Internet has spurred numerous electronic communities, each providing numerous discussion forums dedicated to nearly any conceivable topic for discussion. The participants in a particular discussion may be geographically dispersed with worldwide representation or may be primarily localized, depending on the topic or distribution of the forum. For example, a mailing list devoted to planning for city parks in New York City may be only of interest to people having strong ties to the city or region, while a message board devoted to a particular programming language may have participants spanning the globe.
With so many different topics and subjects within each topic, and so many participants, a significant problem arises in attempting to capture and quantify the communications. Moreover, identifying trends and predicting future behavior in certain markets based on the communications has not been possible in the past because of the magnitude of the communications and the magnitude of topics and subjects. Further complicating any analysis of communications in electronic discussion forums is the fact that an individual may easily participate in multiple forums by posting the same message in several different discussion forums, and that individuals may use more than one identity when posting.