There is continuing research in the area of topic identification. Previous methods in this area are based on the use of keywords. A disadvantage of such methods is that any variation in the spelling of a keyword, even one that does not significantly change its meaning, might cause performance to degrade. One proposed solution to this problem is to use a dictionary, thesaurus, or semantic index to generate variations of the keyword. This approach improves performance when there is a spelling variation without a change in meaning, but causes further performance degradation when words with similar spelling differ in meaning.
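The spelling sensitivity of keyword-based methods can be illustrated with a minimal sketch. The function name `keyword_hits` and the sample sentences are hypothetical and are not drawn from any cited reference; the sketch assumes exact, case-insensitive matching on whitespace-delimited tokens.

```python
def keyword_hits(text, keywords):
    """Count exact, case-insensitive matches of each keyword among the tokens of text.

    Hypothetical illustration only: tokens are split on whitespace and
    stripped of common surrounding punctuation.
    """
    tokens = [t.strip('.,;:!?"()').lower() for t in text.split()]
    return {k: tokens.count(k.lower()) for k in keywords}


# The same keyword is found under one spelling and missed under another,
# although the meaning of the two sentences is the same.
american = keyword_hits("The color of the sky.", ["color"])
british = keyword_hits("The colour of the sky.", ["color"])
```

In this sketch the second sentence yields no match for the keyword, which is the kind of degradation described above.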
U.S. Pat. No. 5,418,951, entitled “METHOD OF RETRIEVING DOCUMENTS THAT CONCERN THE SAME TOPIC,” discloses a method of identifying the topic of a document using segments of text called n-grams, where n indicates the number of characters in the textual segment. The present invention does not use n-grams to identify the topic of text as does U.S. Pat. No. 5,418,951. U.S. Pat. No. 5,418,951 is hereby incorporated by reference into the specification of the present invention.
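For illustration, a character n-gram is a segment of n consecutive characters taken from a text. The following minimal sketch extracts overlapping n-grams from a string. The function name `char_ngrams` is hypothetical, and the sketch shows only the general notion of an n-gram; it is not the method of U.S. Pat. No. 5,418,951.

```python
def char_ngrams(text, n):
    """Return every overlapping segment of n consecutive characters in text.

    Hypothetical illustration of the general n-gram notion only.
    """
    if n <= 0:
        raise ValueError("n must be a positive integer")
    # A text shorter than n characters yields an empty list.
    return [text[i:i + n] for i in range(len(text) - n + 1)]


trigrams = char_ngrams("topic", 3)  # ['top', 'opi', 'pic']
```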
U.S. Pat. No. 5,937,422, entitled “AUTOMATICALLY GENERATING A TOPIC DESCRIPTION FOR TEXT AND SEARCHING AND SORTING TEXT BY TOPIC USING THE SAME,” discloses a method of identifying a topic of text by using the definition of each word in the text. The present invention does not require the use of the definition of each word in a text as does U.S. Pat. No. 5,937,422. U.S. Pat. No. 5,937,422 is hereby incorporated by reference into the specification of the present invention.
U.S. Pat. No. 6,638,317, entitled “APPARATUS AND METHOD FOR GENERATING DIGEST ACCORDING TO HIERARCHICAL STRUCTURE OF TOPIC,” discloses a method of calculating a lexical cohesion degree at each position in a document, extracting key sentences, and generating a digest based on the relationship between a target passage and a passage containing the target passage. The present invention neither extracts sentences nor compares a target passage to another passage containing the target passage as does U.S. Pat. No. 6,638,317. U.S. Pat. No. 6,638,317 is hereby incorporated by reference into the specification of the present invention.
U.S. Pat. Appl. No. 20030167252, entitled “TOPIC IDENTIFICATION AND USE THEREOF IN INFORMATION RETRIEVAL SYSTEMS,” discloses a method of identifying a topic of text by identifying the most frequently occurring combinations of words in the text. The present invention does not identify the topic of text by identifying the most frequently occurring combination of words in a text as does U.S. Pat. Appl. No. 20030167252. U.S. Pat. Appl. No. 20030167252 is hereby incorporated by reference into the specification of the present invention.
U.S. Pat. Appl. No. 20030182631, entitled “SYSTEMS AND METHODS FOR DETERMINING THE TOPIC STRUCTURE OF A PORTION OF TEXT,” discloses a method of identifying a topic of text using a Probabilistic Latent Semantic Analysis. The present invention does not identify the topic of text by using a Probabilistic Latent Semantic Analysis as does U.S. Pat. Appl. No. 20030182631. U.S. Pat. Appl. No. 20030182631 is hereby incorporated by reference into the specification of the present invention.
U.S. Pat. Appl. No. 20040122657, entitled “SYSTEMS AND METHODS FOR INTERACTIVE TOPIC-BASED TEXT SUMMARIZATION,” discloses a method of identifying a topic of text using key phrases, n-grams, and sentences. The present invention does not identify the topic of text by using key phrases, n-grams, and sentences as does U.S. Pat. Appl. No. 20040122657. U.S. Pat. Appl. No. 20040122657 is hereby incorporated by reference into the specification of the present invention.
U.S. Pat. Appl. No. 20040205457, entitled “AUTOMATIC SUMMARISING TOPICS IN A COLLECTION OF ELECTRONIC DOCUMENTS,” discloses a method of identifying a topic of text using vectors of terms and sentences to create a correlation matrix. The present invention does not identify the topic of text by using vectors of terms and sentences to create a correlation matrix as does U.S. Pat. Appl. No. 20040205457. U.S. Pat. Appl. No. 20040205457 is hereby incorporated by reference into the specification of the present invention.