A mobile camera such as a ubiquitous camera is always carried by a photographer or mounted on a vehicle. In this case, video images are not intentionally taken by the photographer; rather, sequential video images of the photographer's ordinary actions are taken unintentionally. Therefore, the video images taken by the mobile camera often contain scenes which are not considered important to a user. There is demand for a system for effectively browsing the video images taken by the mobile camera.
Japanese Laid Open Patent Application JP-P2000-92386A discloses a browsing system for video images of a mobile camera. FIG. 1 is a block diagram showing an example of a configuration of the browsing system for the video images of the mobile camera according to the conventional technique. As shown in FIG. 1, this browsing system includes a video image replay section 902, a current position control section 903, an area importance level control section 904, a video image omission section 905 and a video image display section 906. The video image replay section 902 replays the video images of the mobile camera. The current position control section 903 controls where the current position of the replayed video image is located in a moving space. The moving space is divided into multiple areas, and an importance level is provided for each area. The area importance level control section 904 controls how important the respective areas are and calculates the importance level of the area to which the current position belongs. The video image omission section 905 omits the video image in accordance with the importance level of a unit area when the video image is replayed. The video image display section 906 displays the video images which are replayed normally or replayed with the omission.
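The area-based omission described above can be illustrated with a minimal sketch. It is not the implementation of the cited application: the square grid, the names `Frame`, `area_of` and `omit_frames`, the importance scale of 0 to 1, and the credit-based thinning rule are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    index: int   # position of the frame in the recorded sequence
    x: float     # photographing position in the moving space (assumed 2-D)
    y: float


def area_of(x, y, cell_size):
    """Map a position to the (column, row) of the grid cell containing it."""
    return (int(x // cell_size), int(y // cell_size))


def omit_frames(frames, importance, cell_size=100.0, default_level=0.0):
    """Return the indices of the frames kept for replay.

    `importance` maps a grid cell to a level in [0, 1] (assumed scale).
    Each frame adds its area's level to a running credit, and a frame is
    kept whenever the credit reaches 1. Level 1.0 therefore keeps every
    frame, level 0.5 keeps every second frame, and level 0.0 keeps none.
    """
    kept = []
    credit = 0.0
    for frame in frames:
        cell = area_of(frame.x, frame.y, cell_size)
        credit += importance.get(cell, default_level)
        if credit >= 1.0:
            kept.append(frame.index)
            credit -= 1.0
    return kept


# Four frames in an important area, then four in a less important one.
frames = [Frame(i, 10.0 if i < 4 else 150.0, 0.0) for i in range(8)]
importance = {(0, 0): 1.0, (1, 0): 0.5}
kept = omit_frames(frames, importance)
```

In this sketch the omission rate follows the importance level of the area to which each frame's position belongs, so replay passes more quickly through the less important area.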
Another browsing system is also disclosed in T. Ueda et al., “Digest Creation of Video Data recorded by Wearable Camera using Location and Geographical Information”, Technical Report of the Institute of Electronics, Information and Communication Engineers, Vol. 101, No. 193 (DE2001-101), pp. 175-182, July 2001. According to this video image processing system, the importance level of a scene is calculated from the distance and the direction between a photographing position and a landmark, so that only scenes whose importance level is higher than a predetermined threshold value are extracted and provided.
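A minimal sketch of this kind of threshold-based extraction follows. The cited report's actual scoring formula is not given here, so the formula below is purely an assumption: a linear falloff with distance multiplied by the cosine of the angle between the camera heading and the bearing to the landmark. The names `scene_importance` and `extract_scenes`, the `max_range` parameter, and the heading convention (degrees, counterclockwise from the +x axis) are likewise hypothetical.

```python
import math


def scene_importance(cam_xy, heading_deg, landmark_xy, max_range=500.0):
    """Score a scene in [0, 1] from distance and direction to a landmark.

    Assumed formula: (1 - distance / max_range) * cos(angle offset),
    clipped at 0 when the landmark is out of range or behind the camera.
    """
    dx = landmark_xy[0] - cam_xy[0]
    dy = landmark_xy[1] - cam_xy[1]
    distance = math.hypot(dx, dy)
    if distance >= max_range:
        return 0.0
    if distance == 0.0:
        return 1.0  # camera is at the landmark; direction is undefined
    bearing = math.atan2(dy, dx)
    offset = bearing - math.radians(heading_deg)
    direction_score = max(0.0, math.cos(offset))
    distance_score = 1.0 - distance / max_range
    return distance_score * direction_score


def extract_scenes(scenes, landmark_xy, threshold=0.5):
    """Keep only the scenes whose importance exceeds the threshold."""
    return [
        scene_id
        for scene_id, cam_xy, heading_deg in scenes
        if scene_importance(cam_xy, heading_deg, landmark_xy) > threshold
    ]


landmark = (100.0, 0.0)
scenes = [
    ("a", (0.0, 0.0), 0.0),     # near the landmark and facing it
    ("b", (0.0, 0.0), 180.0),   # near the landmark but facing away
    ("c", (-900.0, 0.0), 0.0),  # facing the landmark but out of range
]
selected = extract_scenes(scenes, landmark)
```

With these hypothetical inputs only scene "a" exceeds the threshold, which illustrates how a score that depends solely on position and direction relative to a landmark decides what is extracted.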
According to the above-mentioned conventional browsing systems, the importance level of the video image is determined on the basis of the photographing position. Consequently, only scenes with a high importance level are extracted, and the replay speed is changed in accordance with the importance level.
In the case of using the ubiquitous camera and the like, the video image is not intentionally taken by the photographer; rather, sequential video images of the photographer's ordinary actions are taken unintentionally. Therefore, when the unedited video image is replayed, a person browsing the video image may have an unpleasant impression. For example, in a video image taken when the photographer significantly changed the line of sight leftward and rightward, or when a car was turning right or left, the video image sways widely between left and right. Such a video image scene may give an unpleasant impression to the browsing person when it is replayed at high speed. There is also a case in which a scene having no variation, such as a traveling scene in traffic congestion, makes the browsing person feel bored when it is replayed at low speed.
An important scene in the video image taken by the mobile camera is not determined only by a position relative to an object. Since the photographic device itself is moving, the movement information itself may be closely related to the important scene. Because actions such as turning and stopping on the street reflect an intention of a walker or a car driver, the importance level of the video image increases when such actions occur.
However, in the above-described conventional browsing system, the importance level is determined simply on the basis of the distance and direction between the photographing position and the landmark. Therefore, the conventional browsing system cannot appropriately determine the detailed type of a video image scene, nor can it appropriately replay the video image in accordance with the type of the video image scene.
Moreover, the criteria for the importance level of the moving video image change depending on the browsing person's purpose in browsing the video image. Since the criteria for determining the importance level are fixed in the conventional browsing system, it is difficult to adapt to a change in the browsing person's purpose. For example, when the purpose is to confirm a route to a destination, the video image of the whole route is important, and scenes such as turning a corner are the most important. Meanwhile, when the purpose is to review the video image of a past drive, landmarks such as mountains and buildings along the road are important.
The following techniques are known as conventional techniques related to video image processing.
A video image processing device disclosed in Japanese Laid Open Patent Application JP-P2002-142189A includes an input means, an evaluation means and a processing means. The input means inputs a data stream having moving video image data. The evaluation means divides the moving video image data into a plurality of scenes and evaluates the moving video image data in the respective scenes according to a plurality of mutually different conditions. The processing means selects the scenes based on the evaluation result by the evaluation means.
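The evaluate-then-select flow of such a device can be sketched as follows. This is only an illustration under assumptions: the per-scene features `motion` and `loudness`, the functions `evaluate_scenes` and `select_scenes`, and the rule of selecting a scene when any one condition scores at or above a threshold are hypothetical and are not taken from the cited application.

```python
def evaluate_scenes(scenes, conditions):
    """Evaluate every scene under every condition.

    `scenes` maps a scene id to its feature dictionary, and `conditions`
    maps a condition name to a function returning a score for a scene.
    """
    return {
        scene_id: {name: cond(features) for name, cond in conditions.items()}
        for scene_id, features in scenes.items()
    }


def select_scenes(evaluations, threshold=0.5):
    """Select scenes whose best score over all conditions meets the threshold."""
    return [
        scene_id
        for scene_id, scores in evaluations.items()
        if scores and max(scores.values()) >= threshold
    ]


# Hypothetical scenes, already divided, with made-up feature values.
scenes = {
    "s1": {"motion": 0.9, "loudness": 0.1},
    "s2": {"motion": 0.2, "loudness": 0.3},
    "s3": {"motion": 0.1, "loudness": 0.7},
}
# Two mutually different evaluation conditions (assumed for illustration).
conditions = {
    "motion": lambda features: features["motion"],
    "loudness": lambda features: features["loudness"],
}
evaluations = evaluate_scenes(scenes, conditions)
selected = select_scenes(evaluations)
```

Keeping the evaluation results separate from the selection rule means the same scores could be reused with a different rule, for example a weighted sum instead of the maximum.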
Japanese Laid Open Patent Application JP-P2002-271733A discloses a device to generate a digested video image of a soccer game. This digest generating device divides the video image of a soccer game into a plurality of time zones and determines the state of each team (offensive state, defensive state, etc.) in each time zone to generate information indicating the progress of a match. The digest generating device then extracts the video image scene by using the generated information and the importance level.
Japanese Laid Open Patent Application JP-P2002-330393A discloses a video image digest generating method which determines the significance at each point in time in the video image of a program and dynamically calculates the importance level while receiving information added to the program. Japanese Laid Open Patent Application JP-P2003-87712A also discloses a method for generating a digested sports video image which is capable of capturing the progress of a match and determining the importance level of events.
Japanese Laid Open Patent Application JP-P, Heisei 08-95596A discloses a method capable of grasping the outline and the atmosphere of the video image and sound in a short period of time by using sound information. This method includes a step of inputting video image information composed of video image information and sound information, a step of extracting various kinds of feature variables from the sound information in the inputted video image information, a step of processing the video image information based on the extracted feature variables, and a step of outputting the video image information generated by the processing.