1. Technical Field
Embodiments of the present invention relate to computer-aided analysis of media material.
2. Related Art
Computers are increasingly being used to perform or aid the analysis of documents and printed material. Layout analysis techniques and systems have been used to analyze the location, relative arrangement and relationship of text and images in a document. Such document layout analysis can be important in many document imaging applications. For example, document layout analysis can be used to identify the location and relative arrangement of captions and their associated images in media material. Caption identification based on existing techniques generally works best on simple documents, but may be difficult or even unworkable when caption placements are complex or variable. For instance, traditional automated or semi-automated document layout analysis often fails on complex layouts, so that manual analysis must be used instead.
Providing reliable caption identification across a broad variety of layouts creates special challenges. Attempts to detect image captions using simple optical character recognition (OCR) are often inadequate and frequently fail to identify the correct caption. For example, to identify captions for a reader, media material creators may use a variety of features, such as type fonts, capitalization, specific identifying tag words and abbreviations (e.g., fig., figure, caption), type sizes, type styles (boldface, italics, underline, etc.) and caption position relative to the image. Each piece of media material may use a different combination of these features, and may also use several different caption identification features simultaneously. Traditional methods of caption identification do not analyze the multitude of different textual, proximity and content features available to assist in an accurate identification of captions.
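By way of illustration only, the kinds of features listed above can be collected for a single block of text as in the following minimal sketch. It is not a method described in this document. The names TextBlock, ImageRegion and caption_cues, the coordinate convention and the list of tag words are all assumptions introduced for the example, and the input is assumed to come from an OCR or layout step that already reports text, font attributes and vertical position.

```python
import re
from dataclasses import dataclass

# Tag words taken from the examples in the text (fig., figure, caption).
# Any real list would be an assumption about the material being analyzed.
TAG_WORD = re.compile(r"^\s*(fig\.|figure\b|caption\b)", re.IGNORECASE)


@dataclass
class TextBlock:
    """Hypothetical text block as an OCR/layout step might report it."""
    text: str
    font_size: float
    bold: bool
    italic: bool
    top: float      # y coordinate of the top edge, increasing downward
    bottom: float   # y coordinate of the bottom edge


@dataclass
class ImageRegion:
    """Hypothetical image region on the same page."""
    top: float
    bottom: float


def caption_cues(block: TextBlock, image: ImageRegion, body_font_size: float) -> dict:
    """Collect textual and proximity cues for one text block.

    The cues are only gathered here, not weighted or combined; how they
    should be combined is outside the scope of this sketch.
    """
    letters = [c for c in block.text if c.isalpha()]
    # Vertical distance between the block and the nearer edge of the image,
    # whether the block sits below the image or above it.
    gap = min(abs(block.top - image.bottom), abs(image.top - block.bottom))
    return {
        "starts_with_tag_word": bool(TAG_WORD.match(block.text)),
        "all_caps": bool(letters) and all(c.isupper() for c in letters),
        "styled": block.bold or block.italic,
        "size_differs_from_body": block.font_size != body_font_size,
        "vertical_gap": gap,
    }
```

Because each piece of media material may rely on a different combination of these cues, the sketch returns them as separate values rather than as a single yes-or-no decision.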
Automated methods of identifying captions have generally relied solely on text position, document layout and other proximity features, and hence make many mistakes. There is no consistent, adaptive method of identifying captions that works across a wide variety of media material. Such limited automated methods have further difficulty analyzing captions and images that continue across two or more pages of media material.
What is needed are improved systems and methods for identifying captions in media material.