Thumbnails are commonly used as visual aids in document browsing and retrieval applications. The thumbnails are typically generated by scaling the document image. The scaling may be a purely geometric operation, such as traditional downsampling. There are a number of other ways to scale document images. One such way is to perform scaling that allows for layout distortion. For example, SmartNail technology focuses on showing selected readable text in a display window of fixed size. With SmartNail technology, preservation of layout is surrendered in favor of readable text; see U.S. patent application Ser. No. 11/023,142, entitled “Semantic Document Smartnails”, filed Dec. 22, 2004. Other techniques include combinations of geometric and layout scaling. For example, a technology referred to herein as Dynamic Document Icons focuses on capturing distinct layout characteristics while neglecting readability of text regions. In contrast to SmartNail technology, in Dynamic Document Icons the size of the icon is not fixed, but depends on the content shown in iconic form. For more information on Dynamic Document Icons, see K. Berkner, U.S. patent application Ser. No. 11/019,802, entitled “Dynamic Document Icons”, filed Dec. 21, 2004.
Graph models are popular in the document analysis field to capture information about document layout. Graph models may be derived in a number of ways. One way to derive a graph model is described in Aiello, M., Monz, C., Todoran, L., Worring, M., “Document Understanding for a Broad Class of Documents,” International Journal on Document Analysis and Recognition (IJDAR), vol. 5(1), pp. 1-16, 2002. In this reference, centers of text zones are modeled as vertices, and edges between vertices signal neighborhood relationships between the associated zones. This information is required for further logical analysis, including extraction of reading order and classification of text zones.
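The vertex-and-edge construction described above can be illustrated with a minimal Python sketch. It is not the method of the cited reference: the bounding-box format `(x0, y0, x1, y1)`, the function names `zone_center` and `build_layout_graph`, and the rule that two zones are neighbors when their centers lie within `max_distance` of each other are all assumptions made for illustration.

```python
from itertools import combinations
from math import hypot


def zone_center(zone):
    """Return the center (x, y) of a zone given as (x0, y0, x1, y1)."""
    x0, y0, x1, y1 = zone
    return ((x0 + x1) / 2.0, (y0 + y1) / 2.0)


def build_layout_graph(zones, max_distance):
    """Model zone centers as vertices and nearby zone pairs as edges.

    Assumption: two zones are treated as neighbors when the Euclidean
    distance between their centers is at most max_distance. A real
    system would likely use a richer neighborhood definition.
    """
    vertices = [zone_center(z) for z in zones]
    edges = set()
    for i, j in combinations(range(len(vertices)), 2):
        (xi, yi), (xj, yj) = vertices[i], vertices[j]
        if hypot(xi - xj, yi - yj) <= max_distance:
            edges.add((i, j))  # i < j, so each undirected edge appears once
    return vertices, edges


# Hypothetical page with a title, two text columns, and a footer.
zones = [
    (0, 0, 100, 20),     # 0: title
    (0, 30, 45, 100),    # 1: left column
    (55, 30, 100, 100),  # 2: right column
    (0, 110, 100, 120),  # 3: footer
]
vertices, edges = build_layout_graph(zones, max_distance=65)
```

In this example the title and the footer are each connected to both columns and the two columns are connected to each other, while the title and the footer are too far apart to be neighbors. A graph of this kind is the sort of structure on which reading-order extraction or zone classification could then operate.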
Graph models are also frequently used in document analysis to analyze web pages or table structures. Operations on graphs include graph matching techniques that may be used to compare different graphs. An overview of this field is given in Lopresti, D., Wilfong, G., “A Fast Technique for Comparing Graph Representations with Applications to Performance Evaluation,” IJDAR, vol. 6, pp. 219-229, 2004.
White space in documents is often used to identify the space between items, such as columns of text in a document. There are several methods of computing white space in document images. One way is presented in Breuel, T., “An Algorithm for Finding Maximal Whitespace Rectangles at Arbitrary Orientations for Document Layout Analysis,” Proceedings of ICDAR, Aug. 3-6, 2003, Edinburgh, Scotland, pp. 66-70. Proprietary OCR systems may have their own way to detect white space in order to support extraction of text components.
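As a simple illustration of computing white space between columns, the following Python sketch scans a binarized page for runs of pixel columns that contain no ink. This is a basic vertical projection approach, not the maximal whitespace rectangle algorithm of the cited reference. The page representation (a list of rows of 0 for white and 1 for ink), the function name `find_column_gaps`, and the `min_width` parameter are assumptions made for illustration.

```python
def find_column_gaps(page, min_width=2):
    """Return (start, end) x-ranges of all-white pixel columns.

    page is a list of rows; each row is a list of 0 (white) or 1 (ink).
    end is exclusive. Runs narrower than min_width are ignored.
    """
    if not page:
        return []
    width = len(page[0])
    # A pixel column counts as white only if no row has ink in it.
    is_white = [all(row[x] == 0 for row in page) for x in range(width)]

    gaps = []
    start = None
    for x, white in enumerate(is_white):
        if white and start is None:
            start = x  # a white run begins
        elif not white and start is not None:
            if x - start >= min_width:
                gaps.append((start, x))
            start = None  # the white run ends
    # Close a run that extends to the right edge of the page.
    if start is not None and width - start >= min_width:
        gaps.append((start, width))
    return gaps


# Hypothetical 3-row page with two ink regions separated by a gap.
page = [
    [1, 1, 1, 0, 0, 0, 1, 1, 1, 0],
    [1, 0, 1, 0, 0, 0, 1, 1, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 1, 1, 0],
]
gaps = find_column_gaps(page)
```

Here the three-pixel-wide gap between the two ink regions is reported, while the single white column at the right edge is discarded because it is narrower than `min_width`. A projection of this kind only finds gaps that span the full page height, which is one reason rectangle-based methods such as the one cited above are used for more complex layouts.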
A technology for white space expansion is discussed in U.S. Pat. No. 5,592,574, entitled “Method and Apparatus for Expansion of White Space in Document Images on a Digital Scanning Device,” to Chilton, J. K., Cullen, J., Ejiri, K., issued Jan. 7, 1997. As discussed in U.S. Pat. No. 5,592,574, white space between document objects is increased in order to obtain better visibility.