1. Technical Field
The present invention relates to text document representation and computer analysis.
2. Discussion of the Related Art
A variety of text mining and management algorithms, such as clustering, classification, indexing and similarity search, exist to derive high-quality information from text documents. Most of these algorithms use the vector-space model for text representation and analysis. The vector-space model is an algebraic model that represents each text document as a vector of identifiers, such as, for example, index terms. While the vector-space model is an effective and efficient representation for mining purposes, it does not preserve information about the ordering of the words in a document.
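By way of illustration only, the loss of word-order information in the vector-space model may be sketched as follows. The sketch assumes a simple term-frequency weighting and whitespace tokenization; the function name `term_vector` and the two sample documents are hypothetical and are not taken from any particular system.

```python
from collections import Counter


def term_vector(text, vocabulary):
    """Return term-frequency counts for `text` over a fixed, ordered vocabulary.

    Hypothetical helper: whitespace tokenization and raw counts are
    assumptions made for illustration.
    """
    counts = Counter(text.lower().split())
    return [counts[term] for term in vocabulary]


# Two sample documents with the same words in a different order.
doc_a = "the dog bit the man"
doc_b = "the man bit the dog"

# Shared vocabulary of index terms, sorted so vector positions are stable.
vocabulary = sorted(set(doc_a.split()) | set(doc_b.split()))

vec_a = term_vector(doc_a, vocabulary)
vec_b = term_vector(doc_b, vocabulary)

print(vocabulary)      # index terms, one per vector position
print(vec_a, vec_b)    # the two term-frequency vectors
print(vec_a == vec_b)  # whether the representation distinguishes the documents
```

Under these assumptions, the two documents map to the same vector even though they describe opposite events, because only term counts, and not term positions, are recorded.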