The present invention relates to a method for converting a video sequence into a browsable electronic image book by analyzing the video sequence, detecting specific events contained therein and automatically laying out representative images extracted from the video sequence based on the events, and the invention also pertains to a medium having recorded thereon a program for implementing the method.
FIG. 1 depicts an example of a video monitor screen on a computer according to the prior art, FIG. 2 an example of a video editing user interface on a computer according to the prior art, and FIG. 3 a prior art example of a representative image array of a video displayed in a scrolling window. In the figures, reference numeral 20 denotes a monitor screen, 21 an on-time-axis playback controller, 22 images sampled at regular time intervals, 23 a time axis, 24 sampled images, and 25 a scroll bar.
In an interface for interactive browsing of a video sequence, playback is conventionally controlled on the time axis, as in a video player. That is, it is common to control playback pointers on the time axis to start and stop playback, fast forward and so forth. In the case of a digital video on a computer, playback can be started at an arbitrary point directly specified on the time axis. For example, in such a video monitor as shown in FIG. 1, when a user specifies one point on the time axis of the on-time-axis playback controller 21 with a pointing device, images corresponding to the specified and subsequent points in time are displayed on the monitor screen 20.
To make a digital video handled on a computer browsable, some systems use an interface in which reduced images, obtained by sampling the video at equal time intervals, are aligned along a horizontal time axis so that the section to be edited can be determined, or in which plural images are listed, as in a desktop video editing system called a nonlinear editor. FIG. 2 shows an example of such a video editing user interface. A user can easily edit the video by manipulating the reduced images 22 sampled at equal time intervals as depicted in FIG. 2.
There is also such an interface as depicted in FIG. 3 which detects cut points and similar events in a video sequence, samples images based on the detected events and displays them as a two-dimensional array of sampled images 24 in a window on the computer screen. The sampled images 24 can be scrolled by operating the scroll bar 25.
Japanese Pat. Laid-Open No. 198331/97, entitled "Electronic Filing System with a Data Management-Output Method," discloses an electronic filing system whose layout editing facilities enable system-managed image data or the like to be displayed in an easily readable album or book form and permit the display method to be changed arbitrarily.
With the prior art described above, however, it is difficult to meet a wide variety of user requirements such as displaying the entire video sequence or a particular image sequence of the video. More specifically, in the example of FIG. 1, playback is controlled only through the control bar and hence is relatively easy, but only the image at one point in time can be displayed; therefore, it is difficult for the user to grasp the entire contents of the video or to access a particular portion of the video.
The example of FIG. 2 is advantageous over the FIG. 1 example in that a plurality of sampled images can be displayed over a certain time width. Since this example is intended for one-dimensional editing on the time axis, however, it is necessary to lengthen the sampling interval on the time axis for observing the image sequence over a certain period of time, or to shorten the sampling interval for observing the image sequence in detail. Accordingly, this prior art example is inevitably complex in operation, and it is difficult for the user to acquire a sense of where a particular image lies on the time axis.
The example of FIG. 3 samples images for each occurrence of some event in the video, and hence it is very suitable for grasping the general contents of the video. For a somewhat longer video, however, the number of sampled images 24 becomes so large that the scroll bar 25 must be moved up and down to observe the entire video sequence; furthermore, it is also difficult for the user to acquire a sense of where a particular image lies in the entire video sequence.
In the case of managing video data in the electronic filing system which manages image data or the like in the form of an album, the first image of the video can be displayed on the album, but for observing the video contents in detail, there is no method other than displaying the video on the video monitor. Of course, it is possible to manually extract representative images from the video and paste them on the album, but this is extremely time-consuming. Thus, there has not been available any method which offers an electronic image book equipped with a book-type interface.
Taniguchi et al., "PanoramaExcerpts: Extracting and Packing Panoramas for Video Browsing," Proceedings of the fifth ACM International Multimedia Conference, Addison-Wesley, pp. 427-436, 1997, discloses a user interface which displays representative images as an array on one display screen and, when a desired one of the representative images is specified, plays back the corresponding video sequence. There is also disclosed a method for aligning the representative images in a space-efficient manner within a width-defined screen while keeping the temporal order of the representative images unchanged. However, this user interface screen has its width defined but its vertical length undefined. That is, the concept of a page having its lateral and vertical sizes defined is not introduced, and accordingly, the above user interface does not constitute a book-type interface.
An object of the present invention is to provide a method for creating and utilizing an electronic image book that has a book type interface which solves the above-mentioned defects of the prior art and which automatically or semi-automatically converts a video sequence into a book-type electronic image book, and hence enables a user to grasp the context of video information in its entirety as well as in detail, and a recording medium that has recorded thereon a program for implementing the method.
According to the present invention, which is intended to solve the above-mentioned problems, an electronic image book bound just like an actual book is automatically created by analyzing a video sequence, detecting various events contained therein, generating a sequence of representative images for each event by a method specified by a user, and laying out the representative images in their temporal order.
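The analysis step can be illustrated with a minimal sketch. It assumes that the only event of interest is a scene change, that a scene change is detected by comparing grayscale histograms of successive frames, and that the first frame of each shot serves as its representative image. The text above does not prescribe any of these choices; the function names, the bin count and the threshold are hypothetical.

```python
from typing import List, Optional, Sequence


def histogram(frame: Sequence[int], bins: int = 8) -> List[float]:
    """Normalized grayscale histogram of a frame given as flat 0-255 pixel values."""
    counts = [0] * bins
    for pixel in frame:
        counts[min(pixel * bins // 256, bins - 1)] += 1
    total = len(frame) or 1
    return [c / total for c in counts]


def detect_cuts(frames: Sequence[Sequence[int]], threshold: float = 0.5) -> List[int]:
    """Return the indices of frames that start a new shot.

    Assumption: a cut is declared when half the L1 distance between the
    histograms of consecutive frames exceeds `threshold` (range 0.0 to 1.0).
    """
    cuts: List[int] = []
    previous: Optional[List[float]] = None
    for index, frame in enumerate(frames):
        current = histogram(frame)
        if previous is not None:
            distance = 0.5 * sum(abs(a - b) for a, b in zip(current, previous))
            if distance > threshold:
                cuts.append(index)
        previous = current
    return cuts


def representative_frames(frame_count: int, cuts: Sequence[int]) -> List[int]:
    """Pick the first frame of every shot as its representative image (an assumption)."""
    if frame_count == 0:
        return []
    return [0] + list(cuts)


# Five tiny 16-pixel "frames": two dark, two bright, one mid-gray.
frames = [[10] * 16, [12] * 16, [200] * 16, [205] * 16, [90] * 16]
cuts = detect_cuts(frames)
reps = representative_frames(len(frames), cuts)
```

In this sketch the cut indices play the role of the detected events, and the representative frame indices, kept in increasing order, already follow the temporal order required for layout.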
More specifically, the present invention automatically creates a book-like electronic image book by two procedures. The first procedure analyzes a video sequence to detect various events such as a scene change, manages the detected events and the feature information computed in the process of analysis as video index information, and generates a sequence of representative images by referring to the managed video index information according to the user's instructions. The second procedure lays out the sequence of representative images in their temporal order in a page by a predetermined rule and, if a predetermined condition for a page break is satisfied, lays out the subsequent representative images in a new page.
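The layout procedure can be sketched as follows. The sketch assumes one possible predetermined rule (images are placed left to right in rows on a page of fixed width and height) and one possible page-break condition (the page is full, or the next image begins a new scene). Both are illustrative assumptions, and all class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Sequence, Tuple


@dataclass
class RepImage:
    time_sec: float              # elapsed time from the beginning of the video
    width: int
    height: int
    starts_scene: bool = False   # assumed flag taken from the video index information


@dataclass
class Page:
    # Each placement is (image, x, y) with the origin at the top-left of the page.
    placements: List[Tuple[RepImage, int, int]] = field(default_factory=list)


def lay_out(images: Sequence[RepImage], page_w: int, page_h: int) -> List[Page]:
    """Place images in temporal order, row by row, starting new pages as needed."""
    pages = [Page()]
    x = y = row_h = 0
    for img in sorted(images, key=lambda im: im.time_sec):
        if img.width > page_w or img.height > page_h:
            raise ValueError("image is larger than the page")
        if x + img.width > page_w:
            # The current row is full: move to the start of the next row.
            x, y, row_h = 0, y + row_h, 0
        page_full = y + img.height > page_h
        new_scene = img.starts_scene and bool(pages[-1].placements)
        if page_full or new_scene:
            # Page-break condition satisfied: continue on a fresh page.
            pages.append(Page())
            x = y = row_h = 0
        pages[-1].placements.append((img, x, y))
        x += img.width
        row_h = max(row_h, img.height)
    return pages


# Five 100x50 images on a 200x100 page: four fit on the first page.
plain = [RepImage(float(t), 100, 50) for t in range(5)]
plain_pages = lay_out(plain, 200, 100)

# Same images, but the third one begins a new scene and forces a page break.
scened = [RepImage(float(t), 100, 50, starts_scene=(t == 2)) for t in range(5)]
scened_pages = lay_out(scened, 200, 100)
```

Because both the width and the height of the page are fixed, the result is a sequence of bounded pages rather than a single scrolling array of unbounded length.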
Furthermore, by prestoring link information for partial images, such as elapsed times of images after the beginning of the video, in a page management table, it is possible to play back the partial images associated with a particular representative image specified by a user with a mouse or the like during browsing.
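One way such a page management table could be organized is sketched below. The field names, the rectangle-based hit test and the use of seconds for elapsed time are assumptions made for illustration only; the text above specifies just that link information such as elapsed time is stored per image.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class TableEntry:
    page: int          # page number on which the representative image appears
    x: int             # top-left corner of the image on the page
    y: int
    width: int
    height: int
    start_sec: float   # link information: elapsed time from the beginning of the video


def find_playback_start(table: List[TableEntry], page: int,
                        click_x: int, click_y: int) -> Optional[float]:
    """Return the playback start time for the image under the pointer, if any."""
    for entry in table:
        inside_x = entry.x <= click_x < entry.x + entry.width
        inside_y = entry.y <= click_y < entry.y + entry.height
        if entry.page == page and inside_x and inside_y:
            return entry.start_sec
    return None


table = [
    TableEntry(page=1, x=0, y=0, width=100, height=50, start_sec=0.0),
    TableEntry(page=1, x=100, y=0, width=100, height=50, start_sec=12.5),
    TableEntry(page=2, x=0, y=0, width=100, height=50, start_sec=47.0),
]

# A click at (130, 20) on page 1 falls inside the second image.
start = find_playback_start(table, page=1, click_x=130, click_y=20)
```

A player component, not shown here, would then seek the original video to the returned time and start playback.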
Besides, a functional interface for utilizing the electronic image book provides a tag function for page control and image display, and a function for playing back the portion of the original video sequence associated with a representative image that the user specifies on a page. This interface enables the user to grasp video information in detail through the image array displayed for each event, and also to recognize the general flow of the video or its contents easily, for example, by flipping through the pages. The present invention provides the advantages of such a book-type interface and offers a wide variety of methods for accessing particular portions of a video sequence.
Thus, the present invention: (1) permits automatic creation of an electronic image book with a book-type interface from a video sequence; (2) implements video browsing of the electronic image book by laying out the video sequence as a two-dimensional array of representative images; (3) enables a user to browse the electronic image book page by page, and permits quick browsing and efficient and effective storage of information because the representative image arrays are divided into pages; (4) permits quick access to a desired portion of the video sequence through browsing; (5) offers a book-type interface that permits access for video browsing and is easy to use even for persons unfamiliar with computers; and (6) makes it possible to introduce into videos as well the information access schemes that are effective for documents (for example, tagging, a table of contents, indexes, and so forth).