1. Field of the Invention
This invention relates to an automatic text processing method, and in particular, to a method for generating summaries for text documents.
2. Background Description
In information query, for the user's convenience, summaries normally need to be generated for users by means of the automatic text processing functions of computers. The current practical methods for automatically generating summaries for text documents fall into the following four kinds:
    List the first paragraph or the first several paragraphs of the document as the summary (Infoseek, Yahoo, etc.): This method is simple, but does not suit common document styles;
    List the sentences that match the query command (Lotus website, Beijing Daily Online, etc.): The listed sentences relate directly to the query, but cannot represent the overall style of the document;
    Use an implicit template: This method matches certain patterns in the document and then fills the matched contents into a pre-formed template. It can generate very fluent summaries, but is suitable only for a fixed document style and a specific domain, and is very difficult to apply generally;
    Count the occurrence frequency of words or characters: This is a statistics-based method, which generally can be divided into four steps: (1) analyze the document discourse, and segment the document into paragraphs and sentences; (2) segment the sentences into words; (3) evaluate importance scores for the words and the sentences; (4) output the sentences with the higher scores as the document's summary.
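The four steps of the statistics-based method can be illustrated by the following minimal sketch. It is not taken from any of the systems named above; the function names (split_sentences, split_words, summarize), the regular expressions used for segmentation, and the scoring rule (the average document-wide frequency of a sentence's words) are all assumptions chosen for illustration, and the sketch assumes English-like text with whitespace-separated words.

```python
import re
from collections import Counter


def split_sentences(text):
    """Step (1): segment the document into paragraphs, then sentences."""
    sentences = []
    for paragraph in text.split("\n\n"):
        # Naive rule (an assumption): a sentence ends at '.', '!' or '?'.
        for sentence in re.split(r"(?<=[.!?])\s+", paragraph.strip()):
            if sentence:
                sentences.append(sentence)
    return sentences


def split_words(sentence):
    """Step (2): segment a sentence into lower-cased words."""
    return re.findall(r"[a-z0-9']+", sentence.lower())


def summarize(text, max_sentences=2, stop_words=frozenset()):
    """Steps (3) and (4): score sentences and return the top ones."""
    sentences = split_sentences(text)
    tokenized = [
        [w for w in split_words(s) if w not in stop_words] for s in sentences
    ]

    # Word importance: occurrence frequency across the whole document.
    freq = Counter(w for words in tokenized for w in words)

    # Sentence importance: average frequency of its words, so that long
    # sentences are not favoured merely for being long.
    scores = [
        sum(freq[w] for w in words) / len(words) if words else 0.0
        for words in tokenized
    ]

    # Pick the highest-scoring sentences (ties go to the earlier one),
    # then restore their original order in the document.
    ranked = sorted(range(len(sentences)), key=lambda i: (-scores[i], i))
    chosen = sorted(ranked[:max_sentences])
    return [sentences[i] for i in chosen]


if __name__ == "__main__":
    sample = "Cats chase mice. Cats sleep. Dogs bark loudly at night."
    for line in summarize(sample, max_sentences=2):
        print(line)
```

Averaging rather than summing the word frequencies is only one possible choice for step (3). Note that nothing in this sketch depends on what the user is looking for: the same sentences are selected for every user.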
Although the above statistics-based method considers the occurrence frequency of words and characters in documents, and thereby evaluates the importance of the words and the sentences, its summaries cannot correspond well to the user's requirements because there is no interaction with the user. Therefore, the invention proposes a method for automatically generating summaries for text documents which, on receiving the user's text documents, queries the fields, topics, and terms that the user is interested in. The method extracts the important sentences and then outputs them, in a reasonable order, as the document's summary. The method can not only generate summaries for the respective documents, but can also generate a comprehensive prompt for the important ideas of the documents.