The present exemplary embodiments relate generally to computer vision and, more specifically, to document recognition, which is an application of computer vision. They find particular application in conjunction with computer vision tasks such as clustering, classification, retrieval, and repeated structure finding, and will be described with particular reference thereto. However, it is to be appreciated that the present exemplary embodiments are also amenable to other like applications.
Recent years have seen a surge in bag-of-words approaches to image categorization. Under such approaches, objects and scenes are modeled as large vectors of relatively simple feature measurements. A central issue is what information each feature captures. Features have traditionally been purely appearance-based, measuring local shape and texture properties. However, a recent trend has aimed at capturing the information in spatial relationships among feature measurements sampled at keypoints or interest points. A notable example of encoding geometry in localized features occurs in document image indexing, where “fingerprints” describe the spatial configurations of word blobs.
One way of encoding spatial configuration is through graphs. Therein, objects and scenes are modeled as parts (nodes) and relations (links). An observed image generates a graph of observed parts and their relations to other parts in the local neighborhood, and recognition is performed by subgraph matching.
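By way of illustration only, the graph encoding described above can be sketched as follows. The function name `build_neighborhood_graph`, the `(x, y, label)` keypoint format, and the choice of a k-nearest-neighbor rule to define the local neighborhood are assumptions made for this sketch and are not taken from the disclosure.

```python
import math


def build_neighborhood_graph(keypoints, k=2):
    """Link each keypoint to its k nearest neighbors (Euclidean distance).

    keypoints: list of (x, y, label) tuples (hypothetical format).
    Returns (nodes, edges): nodes maps index -> label, and edges is a set
    of undirected (i, j) index pairs with i < j.
    """
    nodes = {i: label for i, (_, _, label) in enumerate(keypoints)}
    edges = set()
    for i, (xi, yi, _) in enumerate(keypoints):
        # Distances from keypoint i to every other keypoint, nearest first.
        others = sorted(
            (math.hypot(xi - xj, yi - yj), j)
            for j, (xj, yj, _) in enumerate(keypoints)
            if j != i
        )
        for _, j in others[:k]:
            edges.add((min(i, j), max(i, j)))
    return nodes, edges


# Three labeled parts on a line; each is linked to its single nearest neighbor.
nodes, edges = build_neighborhood_graph(
    [(0, 0, "a"), (1, 0, "b"), (10, 0, "c")], k=1
)
```

Here the nodes play the role of parts and the edges play the role of relations; a practical system could define the neighborhood differently, for example by a distance threshold.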
Subgraph matching poses certain difficulties. First, it is known to be exponentially expensive. This problem is to some extent alleviated by use of attributed graphs (i.e., graphs whose nodes contain properties that constrain possible matches). Nonetheless, subgraph matching has been limited to relatively small subgraphs because of a second difficulty: noise and variability cause observed graphs to deviate from ideal models. This demands the use of inexact graph matching techniques, which drastically increase matching cost and largely remove the advantages of attributed graph matching because possible matches of differently-labeled nodes must now be explored.
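The effect of node attributes on matching cost can be illustrated with a minimal backtracking sketch. The function `find_subgraph_matches`, its `use_labels` flag, and the `tried` counter are hypothetical names introduced for this example; the search is a simple exact, non-induced subgraph match and is not the technique of the present disclosure. Disabling the label check stands in, loosely, for the situation in inexact matching where differently-labeled nodes must also be considered.

```python
def find_subgraph_matches(model_nodes, model_edges, obs_nodes, obs_edges,
                          use_labels=True):
    """Backtracking search for one-to-one mappings of model nodes onto
    observed nodes such that every model edge maps onto an observed edge.

    Returns (matches, tried), where tried counts the candidate node
    assignments that were examined.
    """
    model_adj = {tuple(sorted(e)) for e in model_edges}
    obs_adj = {tuple(sorted(e)) for e in obs_edges}
    model_ids = sorted(model_nodes)
    matches = []
    tried = 0

    def extend(mapping):
        nonlocal tried
        if len(mapping) == len(model_ids):
            matches.append(dict(mapping))
            return
        m = model_ids[len(mapping)]
        for o in obs_nodes:
            if o in mapping.values():
                continue  # keep the mapping one-to-one
            if use_labels and model_nodes[m] != obs_nodes[o]:
                continue  # attribute constraint prunes this candidate
            tried += 1
            # Each model edge from m to an already-mapped node must exist
            # between the corresponding observed nodes.
            consistent = all(
                tuple(sorted((o, mapping[n]))) in obs_adj
                for n in mapping
                if tuple(sorted((m, n))) in model_adj
            )
            if consistent:
                mapping[m] = o
                extend(mapping)
                del mapping[m]

    extend({})
    return matches, tried


model_nodes, model_edges = {0: "a", 1: "b"}, {(0, 1)}
obs_nodes, obs_edges = {0: "a", 1: "b", 2: "c"}, {(0, 1), (1, 2)}

labeled = find_subgraph_matches(model_nodes, model_edges, obs_nodes, obs_edges)
unlabeled = find_subgraph_matches(model_nodes, model_edges, obs_nodes,
                                  obs_edges, use_labels=False)
```

Even in this three-node example, the search examines fewer candidate assignments with the label constraint than without it, and the gap widens rapidly as the graphs grow.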
As with image categorization, the difficulties noted with subgraph matching also pose problems for image retrieval and detection of repeated structure. Namely, image noise and variability make it difficult to perform, quickly and efficiently, the matching necessary for carrying out said tasks.
In view of the foregoing, it would be advantageous to have methods and/or systems that address the foregoing problems. The disclosure hereafter contemplates such methods and/or systems.