Accurate identification and extraction of data from business documents is an important aspect of computerized processing of such documents. Many business documents contain a regular set of information in the form of a label (or key) with an associated value. Such documents are usually formatted to be easily discernible to a human. While the documents have a discernible structure, they tend to have numerous variations that make computerized processing problematic and error prone. For example, the documents are typically received in image form, so the content must be extracted before it can be processed, and this extraction can introduce numerous errors. Two versions of the same document may differ visually because they were scanned differently, say at different resolutions, or because of visual artifacts in the documents. Moreover, the same type of business document, such as an invoice, will often differ in formatting, in terminology, and in the granularity and amount of information. There is accordingly a need for improved computerized processing of business documents.