Video equipment for recording media on which motion video images are recorded (for instance, laser disk, VTR and 8 mm video) has become widespread, and the amount of video images accumulated in museums and homes, as well as in specialized fields such as radio stations and advertising agencies, has been increasing remarkably. The accumulated video images are not only reproduced, but are also often reused to create new video images by editing. As the amount of stored video images has become tremendous, there is an increasing need for a method of managing motion video images that can efficiently locate video scenes of interest on a recording medium for reproduction and editing. A similar situation exists in the field of movies, which deals with video films.
In the conventional system for managing motion video images, frame numbers are stored in a recording medium of a device such as a personal computer, and retrieval is performed by a user specifying the stored frame numbers. The user specifies the start frame for reproduction in one of two ways: either the user directly enters a frame number or frame time from an alphanumeric input device, or the personal computer displays on a display the still video images of the frames having stored frame numbers and the user selects among them. The personal computer then performs reproduction on a TV monitor from the specified start frame while controlling the functions provided in the video player, such as frame feed, fast forward and rewind. Thus, the conventional system adopts a method in which the video images to be retrieved are directly accessed by means of frame numbers and the like.
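The direct-access method described above can be illustrated by a minimal sketch. The names `VideoPlayer`, `seek`, `frame_feed` and `start_reproduction`, as well as the frame numbers used, are hypothetical and are introduced only for illustration; they do not correspond to any particular device or to the system of any cited reference.

```python
from dataclasses import dataclass


@dataclass
class VideoPlayer:
    """Hypothetical stand-in for a video player controlled by a personal computer."""
    total_frames: int
    position: int = 0

    def seek(self, frame_number: int) -> None:
        # Direct access: jump straight to the requested frame.
        if not 0 <= frame_number < self.total_frames:
            raise ValueError(f"frame {frame_number} is out of range")
        self.position = frame_number

    def frame_feed(self) -> int:
        # Advance by one frame, stopping at the last frame.
        self.position = min(self.position + 1, self.total_frames - 1)
        return self.position


def start_reproduction(player: VideoPlayer, stored_frames: set, requested: int) -> int:
    """Seek to `requested` only if it is one of the stored frame numbers."""
    if requested not in stored_frames:
        raise KeyError(f"frame {requested} is not a stored frame number")
    player.seek(requested)
    return player.position


# Assumed example data: three stored frame numbers on a 2000-frame recording.
stored_frames = {0, 450, 1320}
player = VideoPlayer(total_frames=2000)
start = start_reproduction(player, stored_frames, 450)
next_frame = player.frame_feed()
```

In such a sketch the user must already know, or recognize from a still image, which frame number corresponds to the scene of interest; the stored numbers themselves carry no information about the structure of the video.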
Based on such a system for managing motion video images, Takafumi Miyatake, "Interactive Natural Motion Video Image Editing Technique", FRIEND21 3rd Results Announcement Convention, Jul. 18, 19, 1991, shows an approach in which motion video images are divided into scenes, and the video images of the leading frames of the respective scenes are displayed on a display to give a rough overview of the scenes to be edited. However, when motion video images that take a long time are to be edited, if, for instance, a scene change occurs at a frequency on the order of once every two seconds on average, the number of scenes increases and it becomes difficult to skim through all the scenes efficiently. In addition, since the divided scenes constitute the direct units of retrieval, and the semantic construction of the scenes becomes complicated, it is not possible to retrieve them while grasping the whole structure of the motion video images.
On the other hand, the information retrieval system shown in Published Unexamined Patent Application No. 29939/1986 discloses an approach in which motion video images are hierarchically classified, and still video images representative of each classification are hierarchically displayed as a menu for selection by a user. However, the classification hierarchy must be traversed sequentially from the highest level to the lowest level before motion video images are displayed, and thus retrieval becomes less efficient as the menu hierarchy becomes deeper. Furthermore, the menu hierarchy is fixed, and no data management technique is shown for modifying it.
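The cost of such sequential traversal can be illustrated by a minimal sketch, assuming a menu modeled as a simple tree. The names `MenuNode` and `select_path` and the sample titles are hypothetical and are not taken from the cited application; the sketch only shows that the number of selections a user must make equals the depth of the target in the hierarchy.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class MenuNode:
    """Hypothetical menu entry; only the lowest-level entries refer to a clip."""
    title: str
    children: List["MenuNode"] = field(default_factory=list)
    clip: Optional[str] = None


def select_path(root: MenuNode, choices: List[int]) -> Tuple[str, int]:
    """Descend from the highest level, one user selection per level.

    Returns the clip that is reached and the number of selections made.
    """
    node = root
    selections = 0
    for index in choices:
        node = node.children[index]
        selections += 1
    if node.clip is None:
        raise ValueError("the lowest level has not been reached yet")
    return node.clip, selections


# Assumed example hierarchy with two levels below the top-level menu.
menu = MenuNode("top", [
    MenuNode("nature", [
        MenuNode("mountains", clip="clip_a"),
        MenuNode("rivers", clip="clip_b"),
    ]),
    MenuNode("cities", [
        MenuNode("night views", clip="clip_c"),
    ]),
])

clip, selections = select_path(menu, [0, 1])
```

In this sketch reaching any clip requires one selection per level, so each additional level of classification adds one more selection before any motion video image can be displayed.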