1. Technical Field
The invention relates to real-time text analysis and access thereto. More particularly, the invention relates to a method and apparatus for real-time extraction of text from a collection of sources and for performing analysis on the extracted text to provide more intuitive access by an end user.
2. Description of the Prior Art
For ages, people have been interested in learning about and discovering information. In the 1950s, a popular trend was for parents of young children to buy a set of encyclopedias for their children to have at hand to research topics, such as history and science, for school projects and the like. In the digital age, the quest for information has only become more relevant. Ever larger quantities of digital content have become available to an ever-increasing global audience as the economics of the supporting technology drive down the cost.
Using search engines to retrieve and discover digital content on the Internet or within enterprises is well known. For example, suppose a researcher needs information about President Theodore Roosevelt. Typically, the researcher enters a query into a search engine from the researcher's browser. The query is typically one or more keywords, such as “president theodore roosevelt”. The search engine processes the query or list of keywords in accordance with its particular algorithm and presents to the researcher a list of web pages that contain one or more of the researcher's keywords. Depending on the particular search engine, other related data or metadata is also presented on the researcher's browser, such as the title of a PDF file that is available for download, as indicated by the title being a hyperlink, together with some sample text from the PDF file. Should the researcher desire to learn more from this particular PDF file, the researcher clicks on the link, which causes the PDF file to be downloaded onto the researcher's computer. From that point on, the researcher is left with an entire PDF document in which to search, retrieve, and discover pertinent information. If the PDF document is particularly large, perusing the document to identify desired information, or even just to discover what information therein may be relevant, can be a laborious, time-consuming, and daunting task.