Conventional computer systems enable production and distribution of multimedia data, including video, audio and image data. Such production is increasing at a phenomenal rate due to the growing popularity of the Internet, the growing affordability of personal computers capable of efficiently processing multimedia data to provide a pleasing experience for users, and the enhanced viewing experience that multimedia provides over “text-only” content.
People now access and use multimedia data in numerous ways. One way that people access multimedia data is over a network such as the Internet. For example, people using web browsers on personal computers now routinely access multimedia data by surfing the World Wide Web via the Internet.
Countless content providers link multimedia data to web pages accessible by people using web browsers. Accordingly, via use of today's web pages, persons using media player applications (e.g., web browsers) can access a web server operated by a content provider to play back content, such as viewing video clips, listening to audio clips, or viewing images made available by the content provider.
To request media content such as digital video, audio, or some other sampled content from a server, a corresponding client typically provides the global address of the content in the form of a Uniform Resource Locator (URL). After receiving the request, the respective server accesses the content and sends or “streams” it to the client as a continuous data stream that the client plays back as the stream is received. Alternatively, the client can play the content for viewing by the user (e.g., via a media player application) only after it has received all of the content data from the server.
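The difference between the two playback modes described above can be illustrated with a minimal Python sketch. It is not taken from any particular media player; the function names, the `play` callback, and the deliberately tiny chunk size are all hypothetical, and an in-memory byte stream stands in for the server's response.

```python
import io
from typing import BinaryIO, Callable, Iterator

CHUNK_SIZE = 4  # bytes per read; hypothetical and tiny, for illustration only


def stream_chunks(source: BinaryIO, chunk_size: int = CHUNK_SIZE) -> Iterator[bytes]:
    """Yield the content piece by piece, as a client would receive a stream."""
    while True:
        chunk = source.read(chunk_size)
        if not chunk:  # an empty read means the stream has ended
            break
        yield chunk


def play_streaming(source: BinaryIO, play: Callable[[bytes], None]) -> int:
    """Hand each chunk to the player as soon as it arrives; return the chunk count."""
    count = 0
    for chunk in stream_chunks(source):
        play(chunk)
        count += 1
    return count


def play_after_download(source: BinaryIO, play: Callable[[bytes], None]) -> int:
    """Receive all of the content first, then hand it to the player in one call."""
    data = b"".join(stream_chunks(source))
    play(data)
    return 1


# Example: the same ten bytes of content, consumed in both modes.
streamed = []
play_streaming(io.BytesIO(b"0123456789"), streamed.append)

downloaded = []
play_after_download(io.BytesIO(b"0123456789"), downloaded.append)
```

In a real client, `source` could plausibly be the file-like response object returned by `urllib.request.urlopen(url)` for the content's URL; that substitution is an assumption and is not exercised here.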
In addition to large amounts of audio, video, and image data, content providers distribute large amounts of other data over internal and/or external networks. Often the data is text providing information about transactions, people, companies, etc. In order to derive high quality information from the text, conventional text mining techniques are applied to the text. Producing “high quality” information in conventional text mining usually involves some combination of determining the relevance, novelty, and interestingness of parts of the text. When processing large amounts of text, “high quality” information is typically derived using various models of statistical pattern learning. More specifically, conventional text mining on large amounts of text usually involves structuring the input text (such as by parsing), detecting patterns within the structured data, and evaluating and interpreting the output. Other typical text mining tasks include text categorization, text clustering, concept/entity extraction, sentiment analysis, document summarization, and entity relation modeling (i.e., learning relations between named entities).
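The three stages named above (structuring the input, detecting patterns, and evaluating the output) can be sketched in a few lines of Python. This is only an illustrative toy: simple document-frequency counting stands in for the statistical pattern learning models mentioned in the text, and the function names, the stopword list, and the `min_docs` threshold are all hypothetical.

```python
import re
from collections import Counter
from typing import Iterable, List

# Hypothetical stopword list; a real system would use a far larger one.
STOPWORDS = {"the", "a", "of", "to", "and", "in"}


def structure(text: str) -> List[str]:
    """Structure raw text into lowercase word tokens (a crude stand-in for parsing)."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def detect_patterns(documents: Iterable[str]) -> Counter:
    """Count, for each term, the number of documents in which it occurs."""
    counts: Counter = Counter()
    for doc in documents:
        counts.update(set(structure(doc)))  # set(): count each term once per document
    return counts


def evaluate(counts: Counter, min_docs: int = 2) -> List[str]:
    """Keep terms that recur in at least min_docs documents, in alphabetical order."""
    return sorted(term for term, n in counts.items() if n >= min_docs)


# Example input: three short, made-up business snippets.
docs = [
    "Acme acquired Widget Corp in March.",
    "Widget Corp reported a loss.",
    "Acme shares rose.",
]
recurring_terms = evaluate(detect_patterns(docs))
```

Here the recurring terms act as a rough proxy for relevance; measuring novelty or interestingness would require additional signals that this sketch does not attempt to model.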