There are a number of types of directed electronic message streams in common use, such as emails, short message service (SMS) messages, instant messaging (IM), social media, blogs, faxes, and Really Simple Syndication (RSS) feeds. This list continues to grow as new message streams are developed and implemented. The effective analysis and/or categorization of the digital information contained within these message streams continues to be a problem for many companies and other organizations. In addition, the types and volume of the encoded payloads carried in these message streams are growing significantly, and the techniques for analyzing and/or categorizing the content have become problematic.
For example, the volume of email messages passing between multiple senders and recipients, as both one-to-one and one-to-many directed messages, continues to expand. These messages can contain textual information, metadata, and zero or more attachments in the form of encoded payloads. Encoded payloads typically consist of office documents or multimedia documents but may include other information such as URLs. Some examples of these payloads are: word processing (Word or similar) documents, presentation (PowerPoint or similar) documents, Adobe PDF documents, spreadsheet (Excel or similar) documents, images formatted as JPEG, GIF, PNG, or TIFF, videos formatted as AVI, ASF, MKV, or MPEG, audio formatted as MP3, AIFF, or WAV, and URLs or document IDs.
In certain environments, such as the workplace, the growth in electronic message information has resulted in several management issues or problems. For a number of reasons, companies and other organizations have a growing need to better understand the content of, and be able to categorize, the electronic message information that is being circulated. For example, valuable IT storage space is being used for things such as non-work-related videos, personal emails, etc. The percentage of non-business emails and attachments that are received and transferred around within a network continues to grow. In addition, the proliferation of potentially inappropriate inbound and outbound activity (such as pornography, cyber-bullying, and sensitive materials that could be stolen and emailed out of the company) has become a serious problem for many organizations. There is a growing need for organizations to analyze and/or categorize directed electronic message streams efficiently. Some methods have been developed to categorize electronic message streams; however, these have not proven sufficient to extract the actual context and content from the electronic message streams and use this information for categorizing the content.
Among the problems with known systems is that only very basic metadata and data/contents of electronic message streams are used for categorization. This approach ignores the context of different electronic message streams, which can often be important in achieving successful analysis and/or categorization.
A compelling need has been recognized in connection with providing efficient and effective analysis and/or categorization of the digital content of these electronic message streams. The present disclosure addresses these and other problems that exist in the art.