The field of the invention relates generally to information management, and more specifically, to methods and systems for classification and clustering of inbound communications to an organization or entity.
Organizations and businesses can receive a large number of messages from customers, potential customers, users and/or other people. For example, a business and/or organization can receive email messages, messages from online forums (e.g., support forums or message boards), and other types of messages. These messages can relate to a variety of different topics or issues. For example, a message can describe a problem experienced by a user and include a request for assistance in solving the problem. Oftentimes, these request messages are directed to a support center at the organization or business.
In addition, the Internet provides these organizations and businesses with access to a wide variety of resources, including web pages for particular topics, reviews of products and/or services, news articles, editorials and blogs. The authors of these resources can express their opinions and/or views related to a myriad of topics such as a product and/or service, politics, political candidates, fashion, design, etc. For example, an author can create a blog entry supporting a political candidate and praising the candidate's position regarding fiscal matters or social issues. As another example, an author can create a restaurant review on a blog or on an online review website and provide an opinion of the restaurant using a numerical rating (e.g., three out of five stars), a letter grade (e.g., A+) and/or a description of the dining experience to indicate the author's satisfaction with the restaurant.
Such a large volume of documents (i.e., different types of electronic documents including text files, emails, images, metadata files, audio files, presentations, etc.) can be very difficult for organizations and/or businesses to manage. Entities may try to use classification or clustering techniques to manage such a large volume of documents. Various algorithms can be used on a corpus of documents to produce different clusters of documents such that the documents within a given cluster share a common characteristic. Over time, new products or features offered by the organizations and businesses may cause documents to be generated that relate to customer support issues or sentiment directed to the new products or features. Current classification and clustering algorithms will group the new documents into existing classes or groups that may not be directly associated with the new customer support issues or sentiment. Consequently, the new customer support issues or sentiment may go unnoticed, or the importance of an unrelated topic may be erroneously accentuated by increased numbers of documents in groups associated with other topics.
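By way of illustration only, the grouping behavior described above can be sketched with a simple nearest-centroid assignment. This is a minimal sketch under stated assumptions and does not represent any particular algorithm referenced in this background: the cluster labels, centroid terms, sample message, and function names (bag_of_words, cosine, assign) are all hypothetical. The sketch shows that a document about a new product is still placed into one of the existing groups, even when its similarity to that group is low.

```python
import math
from collections import Counter


def bag_of_words(text):
    """Represent a document as lowercase token counts (hypothetical helper)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hypothetical existing clusters, each summarized by a centroid of typical terms.
centroids = {
    "billing": bag_of_words("invoice charge refund payment billing account"),
    "login": bag_of_words("password login reset account locked sign"),
}


def assign(text):
    """Return the closest existing cluster and its similarity score.

    Some existing cluster is always returned; there is no option to
    report that the document fits none of them.
    """
    vec = bag_of_words(text)
    scores = {label: cosine(vec, c) for label, c in centroids.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]


# A message about a new product (a smartwatch) that no existing cluster covers.
label, score = assign("new smartwatch will not sync after payment to my account")
print(label, round(score, 3))
```

In this sketch the smartwatch message is placed in the hypothetical "billing" group because it happens to share two terms with that centroid, although the similarity score is low. The new issue is therefore hidden, and the count of documents in the unrelated group is inflated.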