Automatic document searching and classification are processes of algorithmically comparing documents to a search query and assigning documents to one or more classes, respectively. For example, documents may be searched or classified according to their content or other attributes, such as author, date, and subject. Some existing solutions search and classify documents based on document or text clustering. Text clustering uses descriptors, or sets of words, to group similar documents together. Other existing solutions render the documents in memory and attempt to isolate artifacts from the rendering. Artifacts from two documents can then be compared to identify any similarities. Given the increasing complexity and volume of electronically stored information, there is a need for improved search and classification techniques that overcome limitations of existing schemes.
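By way of illustration only, the descriptor-based text clustering mentioned above can be sketched as follows. This is a minimal sketch under stated assumptions, not the method of any particular existing solution: the descriptor is assumed to be the set of lowercase word tokens in a document, similarity is assumed to be the Jaccard index of two such sets, and the function names (`descriptor`, `jaccard`, `cluster`) and the `threshold` value are hypothetical choices made for this example.

```python
import re


def descriptor(text):
    """Return the set of lowercase word tokens in text (an assumed, simple descriptor)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    union = a | b
    # Two empty descriptors share nothing measurable, so report 0.0.
    return len(a & b) / len(union) if union else 0.0


def cluster(docs, threshold=0.3):
    """Greedy single-pass grouping (hypothetical scheme for illustration).

    Each document joins the first cluster whose seed descriptor is at least
    `threshold` similar to its own; otherwise it starts a new cluster.
    Returns a list of clusters, each a list of document indices.
    """
    clusters = []  # list of (seed_descriptor, [document indices])
    for i, text in enumerate(docs):
        d = descriptor(text)
        for seed, members in clusters:
            if jaccard(seed, d) >= threshold:
                members.append(i)
                break
        else:
            # No existing cluster was similar enough: start a new one.
            clusters.append((d, [i]))
    return [members for _, members in clusters]


docs = [
    "quarterly sales report for the north region",
    "sales report for the south region quarterly",
    "recipe for chocolate cake",
]
groups = cluster(docs)
```

In this sketch the first two documents share most of their words and are grouped together, while the third shares only one word with them and forms its own group. A word-set comparison of this kind ignores word order, formatting, and layout, which is one way to see the sort of limitation the paragraph above alludes to.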