Documents in digital form are pervasive, especially on the World Wide Web, where ubiquitous access has made it possible to retrieve vast numbers of documents with only a few keystrokes. However, this capability is of limited use without the ability to automatically generate accurate and concise text summaries to aid in the selection and categorization of documents. Ubiquitous access also implies that documents are viewed on many kinds of devices (e.g., mobile computers, personal digital assistants, hand-held computers, wrist computers, and cellular telephones) having a variety of display formats from large to very small. Documents that are easily viewed on large displays become unmanageable on small ones. Here, text summarization is needed to reduce the size of documents to accommodate different display formats.
One approach to summarizing text is to extract sentences from a source text using a structural analysis technique. Structural analysis techniques employ the semantic structure of text to rank sentences for inclusion in a final summary. The structural summarization method is also well suited for producing summaries of varying sizes to accommodate a wide variety of display formats. However, the resulting summaries may include information that could reasonably be omitted or generalized without a significant loss of information content. Post-processing to suppress some information is one effective method of reducing summary text size, although some loss of readability may result. Therefore, it is desirable to reduce the size of text summaries generated by structural analysis techniques while preserving readability and fidelity to the source text.
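
As a rough illustration of how ranked sentence extraction can yield summaries of varying sizes, the following minimal Python sketch selects the highest-ranked sentences that fit within a character budget and emits them in source order. It is an assumption-laden sketch, not the method of any particular structural analysis technique: the function name `extract_summary`, the `max_chars` budget, and the numeric scores are hypothetical, and the scores are taken as given rather than computed from the semantic structure of the text.

```python
from typing import List, Tuple


def extract_summary(ranked: List[Tuple[str, float]], max_chars: int) -> str:
    """Pick top-ranked sentences that fit in max_chars.

    `ranked` holds (sentence, score) pairs in source order. The scores are
    assumed to come from some prior ranking step (hypothetical here);
    a higher score means the sentence is more important.
    """
    # Visit sentence indices from most to least important.
    by_rank = sorted(range(len(ranked)), key=lambda i: ranked[i][1], reverse=True)

    chosen: List[int] = []
    used = 0
    for i in by_rank:
        sentence = ranked[i][0]
        # Each sentence after the first costs one extra character (joining space).
        cost = len(sentence) + (1 if chosen else 0)
        if used + cost <= max_chars:
            chosen.append(i)
            used += cost

    # Restore source order so the summary reads like the original text.
    return " ".join(ranked[i][0] for i in sorted(chosen))


if __name__ == "__main__":
    sample = [
        ("A cat sat.", 0.9),
        ("It was grey.", 0.2),
        ("The mat was red.", 0.7),
    ]
    # A larger budget (large display) keeps more sentences than a small one.
    print(extract_summary(sample, 100))
    print(extract_summary(sample, 30))
    print(extract_summary(sample, 12))
```

Calling the function with a smaller `max_chars` corresponds to targeting a smaller display. Note that this kind of selection only drops whole sentences; it does not omit or generalize information within a retained sentence, which is the gap described above.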