The present invention relates generally to methods and apparatus for identifying transitions in video data, and more specifically relates to new methods and apparatus for evaluating video data comprising presentations of still images to determine the transitions from one still image to another.
Many techniques are known for identifying transitions, such as changes of scene, in video presentations. One use for identifying such transitions is the identification of chapter markers within the video presentation that allow a viewer to selectively move to a desired location in the presentation. In order to facilitate that type of navigation, the frames associated with the chapter markers are often presented to the user as an index, allowing identification of the subject matter at each location, as well as navigation to a selected “chapter” of the video presentation.
Because of the inherent nature of video in presenting persons or objects in motion, previous attempts to bring some degree of automation to identifying transitions have focused on ways of evaluating frames of data based on changes of “scenes” potentially reflecting a sufficient change in visual content to warrant identification with a chapter marker. Thus, efforts to identify such changes between scenes in conventional motion-conveying video presentations have focused on various parameters in the motion-conveying video that one might ordinarily associate with changes of content in the video presentation, such as changes in contrast and/or color (potentially indicating the depiction of a new environment or “scene”); or detection of parameters indicating the depiction of motion, which may take numerous forms, including motion arising from a change in the observation perspective (resulting from a change of camera position such as by panning, tilting, zooming or rotating the camera), or motion of a person or object in the video presentation.
While these methods offer varying qualities of results in evaluating conventional motion-centric video presentations, the methods are not believed to be well-suited to detecting changes resulting from the depiction of one still image followed by another still image presented in a video. One example of this type of video presentation can be envisioned as a static or slowly panning depiction of still images, such as drawings or paintings, accompanied by a narration. If two time-offset still images are close to one another in color and contrast, then the change from one image to the next may be hard for conventional systems to identify, although identification of an index, such as a chapter marker, might be very desirable. These problems may be exacerbated by gradual transitions between the still images. A particularly problematic video type would be one depicting a series of largely text-based and/or static image-based “slides” in a video of a “slide” presentation such as those used in business and education, and prepared and presented through use of a conventional presentation authoring program such as Keynote® from Apple Inc. or PowerPoint® from Microsoft Corp.
In examples such as these slide presentations, particularly where they are primarily text-based, the background will often remain constant or generally constant, and the overall differences between consecutive slides may be relatively limited. Additionally, such slide presentations often include relatively slow-changing animations to transition between slides, such as slow “fades” from one image to another or similar effects, which usually do not produce images detectable as movement between the video frames. Thus, conventional transition identification systems are believed to be less than optimally suited to identifying the change from one still image to another still image.
Accordingly, the present invention provides new methods and apparatus to evaluate the video data underlying such video presentations, and to identify changes from one still image to another in those video presentations.