1. Field
This disclosure relates generally to electronic documents and, more specifically, to techniques for identifying a matching search term in an image of an electronic document.
2. Related Art
Metalogging refers to the process of adding descriptive information (metadata) about an electronic document, e.g., a file including a digital image, and storing the information in such a way that the information can be used to retrieve the document from, for example, a database. Metadata is typically embedded (stored inside a header) in a document in various fields. Various applications have been configured to search metadata to locate documents that correspond to a particular search term. Electronic images have been created in a number of different manners (e.g., by an application, a digital camera, or via scanning) and take various forms (e.g., digital photographs, flowcharts, and diagrams). Today, the information technology (IT) industry is increasingly challenged to electronically comprehend documents that include images. For example, it has been estimated that approximately fifty percent of electronic documents (e.g., files having the file extensions .html, .doc, .pdf, .lwp, .rtf, etc.) contain images.
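By way of illustration only, the metadata-based retrieval described above may be sketched as follows. The sketch assumes that each document exposes its metadata as a simple mapping of field names to string values; the names Document and search_metadata, and the sample field values, are hypothetical and are not drawn from any particular application.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical stand-in for a stored document and its metadata fields."""
    name: str
    metadata: dict[str, str] = field(default_factory=dict)


def search_metadata(documents, term):
    """Return the names of documents whose metadata values contain the term.

    The comparison is a case-insensitive substring match (an assumption;
    real applications may index or tokenize metadata differently).
    """
    needle = term.lower()
    return [
        doc.name
        for doc in documents
        if any(needle in value.lower() for value in doc.metadata.values())
    ]


# Sample data invented purely for illustration.
documents = [
    Document("report.doc", {"author": "A. Smith",
                            "description": "Quarterly computer sales"}),
    Document("photo.jpg", {"author": "B. Jones",
                           "description": "Beach at sunset"}),
]

matches = search_metadata(documents, "computer")
```

In this sketch a document is located only when the search term appears in one of its metadata fields; the pixel content of any image is never examined.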
In general, commercially available document editor/viewer applications, such as Microsoft Word, Open Office, Lotus SmartSuite™, Internet Explorer (IE), Mozilla Application Suite, etc., provide a user with a search functionality (popularly known as “Find” or “Find/Replace”) that allows the user to search text of an opened document for a search term (e.g., a word/phrase) which, when found, has been highlighted (by the application) within the document text with a selected color. However, known document editor/viewer applications do not highlight a search term that is included in a document image, even when the document image includes content that exactly matches the search term. That is, known document search algorithms ignore document images and only search document text for a user-entered search term. Unfortunately, documents frequently include images that have content that matches a user-entered search term and, as such, may be of interest to a user. While some search algorithms have searched metadata associated with images, image metadata has generally only included information (such as image resolution, image description, author, etc.) that has been used to assist a user in locating images of potential interest that are stored on, for example, personal computers (PCs) or servers coupled to the World Wide Web.
With reference to FIG. 1, an example screen dump 100 is provided to illustrate operation of a conventional document editor/viewer application. As is illustrated in the screen dump 100, a user has opened a sample document 102 that includes both text 104 and images 106 and 108. In the sample document 102, the document text 104 and the document image 108 both contain the search term “computer” and the document image 106 includes an image of a computer. In conventional document editor/viewer applications, the user may utilize an editor search facility associated with window 110 to find and highlight the search term “computer” in the document text 104. However, conventional document editor/viewer applications have not highlighted matching search terms within the document image 108, even when the document image 108 has contained information that corresponds to the search term. Moreover, conventional document editor/viewer applications have not highlighted images (in this case, the computer system image 106) within the document 102 that correspond to the search term. As conventional document editor/viewer applications have not identified a search term included as content of a document image, images with content of interest may not be located by a user of a conventional document editor/viewer application.
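The conventional behavior described with reference to FIG. 1 may be modeled, purely as a sketch, by the following code. The document is assumed to be an ordered list of text blocks and image blocks; the names TextBlock, ImageBlock, and conventional_find are hypothetical, as is the sample content, and the sketch is not the implementation of any named application.

```python
from dataclasses import dataclass


@dataclass
class TextBlock:
    """Hypothetical run of document text (compare text 104)."""
    text: str


@dataclass
class ImageBlock:
    """Hypothetical embedded image (compare images 106 and 108)."""
    label: str          # descriptive label used only in this sketch
    pixels: bytes = b""  # image content, never inspected by the search


def conventional_find(blocks, term):
    """Return (block_index, character_offset) for each match in text blocks.

    Image blocks are skipped entirely, mirroring the limitation of
    conventional search: image content is never searched.
    """
    needle = term.lower()
    hits = []
    for index, block in enumerate(blocks):
        if not isinstance(block, TextBlock):
            continue  # images are ignored by the conventional search
        haystack = block.text.lower()
        start = haystack.find(needle)
        while start != -1:
            hits.append((index, start))
            start = haystack.find(needle, start + 1)
    return hits


# Sample document invented for illustration, loosely following FIG. 1.
sample_document = [
    TextBlock("A computer stores data."),
    ImageBlock("image 106: picture of a computer"),
    ImageBlock("image 108: diagram containing the word computer"),
]

hits = conventional_find(sample_document, "computer")
```

In this sketch only the occurrence in the text block is reported; neither image block produces a match, even though one depicts a computer and the other contains the word itself.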