Document similarity comparison measures the similarity of documents based on the content of each document. Such measures are useful in any application that depends on the similarity or dissimilarity of two or more documents. For example, clustering techniques group documents according to their similarity.
However, some similarity measures return a vector instead of a single-valued metric. A vector similarity measure is more difficult to use than a single-valued metric when determining whether documents are more or less similar. For example, consider a vector similarity measure with two components and three documents A, B, and C. In this example, the vector similarity measure between documents A and B is (0, 1), and the vector similarity measure between documents B and C is (1, 0). This measure does not provide sufficient information to determine whether document B is more similar to A or more similar to C.
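The ambiguity in the three-document example can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that a larger value in a component means greater similarity and that two vectors are compared component by component; neither assumption comes from the text, and the names `dominates` and `compare` are hypothetical helpers.

```python
from typing import Tuple

Vector = Tuple[float, ...]


def dominates(u: Vector, v: Vector) -> bool:
    """Return True if u is >= v in every component and > v in at least one.

    Assumption for this sketch: larger component values mean "more similar".
    """
    if len(u) != len(v):
        raise ValueError("vectors must have the same number of components")
    at_least_as_large = all(a >= b for a, b in zip(u, v))
    strictly_larger_somewhere = any(a > b for a, b in zip(u, v))
    return at_least_as_large and strictly_larger_somewhere


def compare(u: Vector, v: Vector) -> str:
    """Classify two similarity vectors under component-wise comparison."""
    if dominates(u, v):
        return "first"
    if dominates(v, u):
        return "second"
    if u == v:
        return "equal"
    # Neither vector is at least as large as the other in every component.
    return "incomparable"


# Similarity vectors from the example in the text.
sim_ab = (0, 1)  # similarity between documents A and B
sim_bc = (1, 0)  # similarity between documents B and C

print(compare(sim_ab, sim_bc))  # incomparable
```

Under these assumptions, each vector is larger in one component and smaller in the other, so the comparison cannot rank them. A single-valued metric avoids this because any two real numbers can always be ordered.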