As the use of computers and computer-based networks continues to expand, content providers are preparing and distributing more and more content in electronic form. This content includes traditional media that exist in print, such as books, magazines, newspapers, newsletters, manuals, guides, references, articles, reports, and documents, as well as electronic media in which the aforesaid content exists in digital form. The Internet, in particular, has facilitated the wider publication of digital content through downloading and display of images of content. As data transmission speeds increase, more and more images of pages of content are becoming available online. Page images allow the reader to see the page as it would appear in print. Furthermore, graphics, such as charts, drawings, and pictures, and the layout of such graphics in a page, are not lost when the page of content is provided as a digital image.
Despite the great appeal of providing digital images of content, the cost of storing images of content remains a concern for many content providers. To minimize storage costs, content providers desire to minimize the size of files used to store the images. Digital images may be represented at a variety of resolutions, typically denoted by the number of pixels in the image in both the horizontal and vertical directions. Typically, though not always, higher resolution images have a larger file size and require a greater amount of memory for storage. The cost of storing images of content can multiply greatly when one considers the number of images it takes to capture and store large volumes of media, such as books and magazines.
While reducing the size and resolution of images often reduces the requirements for storing the images, resolution can be reduced only so far before an image becomes difficult for readers to perceive when displayed. This problem is exacerbated when the images represent pages of content containing text that readers desire to read. If the text in an image of content is not legible, the value of the image significantly decreases. Content providers wishing to provide page images with text that can be read must ensure that the images have sufficient resolution to provide legible text when displayed.
The legibility of text in a digital image is largely a matter of human perception. Content providers that have a significant number of digital images of content face the difficulty of determining whether a given image of content has sufficient resolution to be perceived as legible by most readers. One solution is to employ human readers to visually inspect images of content to determine whether the images are legible. For large repositories of content, however, a process of human review can become inordinately time-consuming and expensive. What is needed is a method and system that can be implemented in a computer to process images of content and determine whether text in an image is likely to be legible to readers. The present invention addresses this need and other shortcomings in the prior art.