1. Field of Invention
This invention relates to the interactive topic-based summarization of text information.
2. Description of Related Art
In some conventional text summarization systems, statistically significant but disjoint portions of a text are selected for display. Many conventional text summarizers operating on English language texts select initial-introductory text portions or final-conclusory text portions from a text. The text portions are then used as representatives of the complete text and displayed to users seeking an indication of the information content of the complete text. The conventional text summaries generated by these text summarization systems do not provide an indication of the topics discussed within a single text.
A few conventional systems attempt to create informative summaries that represent or possibly replace the original text. These summaries tend to be longer than an indicative summary and contain more factual detail. For example, Strzalkowski et al., in “A Robust Practical Text Summarizer” in Advances in Automatic Text Summarization, MIT Press, Cambridge, Mass., 1999, incorporated herein by reference in its entirety, discuss creating “longer informative digest that can serve as surrogates for the full text” by extracting a paragraph representative of a “Discourse Macro Structure”. This conventional, known two-part structure of “Background+What-Is-The-News” summarizes the two main parts of news articles.
Barzilay et al. developed a system that summarizes parts of documents using lexical chains to identify topics. Summaries are created by selecting a representative sentence from the strongest lexical chain. See, for example, Barzilay et al., “Using Lexical Chains for Text Summarization”, in Advances in Automatic Text Summarization, MIT Press, Cambridge, Mass., 1999, incorporated herein by reference in its entirety. However, the summaries generated by these conventional systems are disjointed and difficult to read.
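By way of illustration only, the lexical-chain approach described above may be sketched as follows. This is a minimal sketch under stated assumptions and is not the implementation of Barzilay et al.: the RELATED table is a hypothetical, hand-built stand-in for a thesaurus, chain strength is assumed to be simply the number of chain members, and the representative sentence is assumed to be the sentence containing the first member of the strongest chain. All function names are illustrative.

```python
import re
from collections import defaultdict

# ASSUMPTION: a hypothetical, hand-built table mapping words to a chain label.
# A real system would derive relatedness from a thesaurus or similar resource.
RELATED = {
    "car": "vehicle", "truck": "vehicle", "engine": "vehicle",
    "bank": "finance", "loan": "finance",
}

def split_sentences(text):
    """Split text into sentences at whitespace following '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def build_chains(sentences):
    """Group related words into chains of (sentence index, word) pairs."""
    chains = defaultdict(list)
    for index, sentence in enumerate(sentences):
        for word in re.findall(r"[a-z]+", sentence.lower()):
            label = RELATED.get(word)
            if label is not None:
                chains[label].append((index, word))
    return chains

def summarize(text):
    """Return one representative sentence from the strongest chain."""
    sentences = split_sentences(text)
    chains = build_chains(sentences)
    if not chains:
        return ""
    # ASSUMPTION: strength is the member count, a deliberate simplification.
    strongest = max(chains.values(), key=len)
    # ASSUMPTION: the representative sentence is the one in which the
    # strongest chain first appears.
    first_index = strongest[0][0]
    return sentences[first_index]

text = ("The bank approved a loan. The car has a new engine. "
        "The truck followed the car.")
summary = summarize(text)
```

In this sketch, the chain of vehicle-related words has more members than the chain of finance-related words, so the single sentence returned comes from the vehicle chain. Because only one sentence per chain is extracted, sentences drawn from several chains need not connect to one another, which is consistent with the disjointed output noted above.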