1. Technical Field
The present invention relates generally to the field of multimedia content analysis and, more particularly, to a system and method for distinguishing between foreground content and background content in an image presentation.
2. Description of Related Art
Figure-ground separation relates to the capability of distinguishing between foreground material and background material in an image, and is a fundamental problem in image processing applications. For example, consider an image that contains a boy playing with a ball on a beach. If the objective is to identify the boy, the boy with the ball is the foreground and the beach is the background. If, however, the objective is to identify the beach or waves breaking on the beach, the beach becomes the foreground and everything else in the image becomes background.
The human visual system is able to effortlessly separate foreground content from background content in a viewed image by combining various visual cues to identify the foreground according to the viewer's current interest. An image processing system also faces the task of figure-ground separation because further image recognition procedures can proceed effectively only if the foreground content of an image is first well separated from background content. Most image processing systems use externally determined policies to drive a figure-ground separation module, such that the system knows ahead of time what the foreground of an image is expected to be. If the foreground is not known in advance, however, problems may occur in correctly separating foreground content from background content in different types of image presentations.
When communicating to an audience, for example, when giving a speech or teaching a course to a group of students, it is common practice to write or otherwise provide information on a transparency, such as a slide or a foil, and to project the information onto a screen using a projector so that the information may be easily viewed by the audience. Recently, computer-generated presentations have become a popular and professional way to provide visual materials to an audience. With a computer-generated presentation, a computer is directly connected to a digital projector, thus avoiding the need for physical media such as slides or foils. As used in the present application, a presentation is any document that may contain one or more types of media data such as text material, images and graphics. Some examples of computer-generated presentation types include digital slide presentations, Web page presentations, Microsoft Word® document presentations, and the like.
A digital slide presentation, for example, is created using computer software such as Microsoft PowerPoint® or Lotus Freelance Graphics®, rather than being hand-drawn or hand-written as with conventional slides. Digital slides commonly include text-based information, and may also include figures, tables or animation materials. Inasmuch as PowerPoint and Lotus Freelance Graphics design templates provide a rich set of choices, the pages in these presentations (i.e., the individual slides) often include a relatively complex background, for example, a background that varies in color or texture, or a background that includes one or more images, rather than a blank background or a background of a single uniform color. A user often selects relatively complex backgrounds for an image presentation to improve the clarity and visual appeal of the presentation and to satisfy aesthetic preferences.
While the diverse backgrounds available using Microsoft PowerPoint or Lotus Freelance Graphics can be effective in enhancing audience attention, the backgrounds also pose a severe challenge for automatic presentation content analysis. For example, slide text recognition techniques have been effectively used to extract text from a slide so that the text can be used to index and annotate slide content for archival and search purposes. A complex slide background, however, can reduce text recognition accuracy because separating foreground text embedded in a complex background is very difficult with automated techniques. This is because most existing text separation techniques make initial assumptions about the kinds of background that can be present in the pages, in order to limit the pixel variations that must be handled.
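As a rough illustration of how such a background assumption can break down, the following minimal Python sketch applies a fixed global intensity threshold, used here as a simple stand-in for a separation technique that assumes dark text on a light, uniform background. It is not taken from any particular existing system; the page size, pixel positions, intensity values, and the function names make_page and threshold_foreground are all hypothetical.

```python
# Illustrative sketch only: a fixed global threshold stands in for a text
# separation technique that assumes a light, uniform background.
# All names and values below are hypothetical.

def make_page(width, height, background, text_pixels, text_value):
    """Build a grayscale page (list of rows); background(x, y) gives the backdrop intensity."""
    page = [[background(x, y) for x in range(width)] for y in range(height)]
    for x, y in text_pixels:
        page[y][x] = text_value  # overwrite backdrop with dark "text"
    return page

def threshold_foreground(page, threshold):
    """Label every pixel darker than the threshold as foreground."""
    return {
        (x, y)
        for y, row in enumerate(page)
        for x, value in enumerate(row)
        if value < threshold
    }

WIDTH, HEIGHT = 16, 4
TEXT = {(2, 1), (3, 1), (12, 2), (13, 2)}  # assumed text pixel positions
TEXT_VALUE = 40                            # assumed dark text intensity
THRESHOLD = 128                            # assumed fixed global threshold

# Uniform light background: the assumption behind the threshold holds.
uniform = make_page(WIDTH, HEIGHT, lambda x, y: 220, TEXT, TEXT_VALUE)

# Background that varies from dark to light across the page: the assumption fails.
gradient = make_page(WIDTH, HEIGHT, lambda x, y: 20 + 15 * x, TEXT, TEXT_VALUE)

uniform_result = threshold_foreground(uniform, THRESHOLD)
gradient_result = threshold_foreground(gradient, THRESHOLD)

print("uniform: extra pixels labeled as text =", len(uniform_result - TEXT))
print("gradient: extra pixels labeled as text =", len(gradient_result - TEXT))
```

On the uniform page the threshold recovers exactly the text pixels, whereas on the varying page the darker portion of the backdrop is also labeled as foreground, even though every text pixel is still found.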
It would, accordingly, be advantageous to provide a system and method for distinguishing between foreground content and background content in an image presentation, such as a computer-generated image presentation, that is effective with diverse types of backgrounds including relatively complex backgrounds such as backgrounds that vary in color or texture or that include one or more images.