1. Field of the Invention
The present invention relates to generation of edited video data obtained by connecting a plurality of pieces of video data.
2. Description of the Related Art
Recently, moving image capturing functions have been provided in various devices such as digital video cameras, digital cameras, digital single lens reflex cameras, mobile phones, personal digital assistants, and portable music players. The capacity of data storage media such as hard disk drives and memory cards has increased, and the amount of private video data has become considerable. Further, with services for uploading and mutually viewing video data on the Internet, and with network-compatible TVs, opportunities for viewing captured and stored video images are increasing.
Viewing a captured video image requires time equal to the shooting time. A user therefore selects the video data to be viewed using a cue function referred to as a chapter menu, or using a thumbnail display organized by file. If the video data contains an unnecessary interval, the user fast-forwards through it. Since such operations are troublesome, many users edit the video image. In editing, the user connects the parts of the video data that are to be viewed (hereinafter referred to as cuts). As needed, graphic video images are combined with the cuts, a video effect called a transition is applied to the connection portion between cuts, or background music (BGM) is added, thereby creating the edited video data.
However, editing a video image requires specialized knowledge of editing or knowledge of how to use a video editing tool. Thus, editing a video image is difficult for a general user. Even for a user who has knowledge of editing or of the editing tool, it is troublesome to select the necessary video data from an enormous amount of video data and then to select cuts from the selected video data.
Accordingly, as a technique for easily editing a video image, a technique for automatically editing a video image based on a template file is discussed. In the template file, a material frame into which video data is inserted, an effect, BGM, and a transition are defined in advance according to a time code. A video editing apparatus analyzes the metadata of each material frame of the template file, further analyzes the metadata of the video data, and inserts video data having matching metadata into the material frame of the template file (see U.S. Patent Publication No. 2009/0052734).
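The template-based insertion described above can be sketched as follows. This is only a minimal illustration, not the method of the cited publication: the names MaterialFrame, Clip, and fill_template are hypothetical, the time code is simplified to seconds, and "matching metadata" is assumed here to mean that the video data carries every tag the material frame requires.

```python
from dataclasses import dataclass


@dataclass
class MaterialFrame:
    start_tc: float      # hypothetical time code, simplified to seconds
    duration: float      # length of the slot in seconds
    metadata: frozenset  # tags the template requires for this slot


@dataclass
class Clip:
    name: str
    metadata: frozenset  # tags obtained by analyzing the video data


def fill_template(frames, clips):
    """Fill each material frame, in time-code order, with the first unused
    clip whose tags include every tag the frame requires (assumed rule)."""
    used = set()
    timeline = []
    for frame in sorted(frames, key=lambda f: f.start_tc):
        match = next(
            (c for c in clips
             if c.name not in used and frame.metadata <= c.metadata),
            None,
        )
        if match is not None:
            used.add(match.name)
        # A slot with no matching clip is left empty (None).
        timeline.append((frame.start_tc, match.name if match else None))
    return timeline
```

Note that this sketch considers only the metadata of each slot in isolation; it does not look at how the end of one inserted clip relates to the start of the next.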
Further, a technique is discussed in which a scene capturing an object is automatically extracted and edited by analyzing motion vectors between video frames and determining that an area in which motion vectors of the same size and the same direction are concentrated is an object being tracked. The edited video data contains a plurality of scenes. With this technique, the scenes are not only arranged sequentially in the edited order, but may also be rearranged based on their degree of importance, or similar scenes may be connected or grouped together (see U.S. Patent Publication No. 2008/0123966).
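The idea of locating an area where motion vectors of the same size and direction are concentrated can be illustrated with the following sketch. It is not the algorithm of the cited publication. It assumes one motion vector per block of a frame, quantizes each vector into a magnitude bin and a direction bin, and picks the bin whose blocks are packed most densely inside their bounding box. The function name, the bin sizes, and the density criterion are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def find_concentrated_area(vectors, mag_step=2.0, dir_bins=8, min_blocks=4):
    """vectors maps a (row, col) block index to a (dx, dy) motion vector.

    Returns (bin_key, (row_min, col_min, row_max, col_max)) for the most
    densely packed group of similar vectors, or None if no group has at
    least min_blocks members. All thresholds are hypothetical.
    """
    groups = defaultdict(list)
    for (r, c), (dx, dy) in vectors.items():
        mag = math.hypot(dx, dy)
        ang = math.atan2(dy, dx) % (2 * math.pi)
        key = (int(mag // mag_step),
               int(ang / (2 * math.pi) * dir_bins) % dir_bins)
        groups[key].append((r, c))

    best = None
    for key, blocks in groups.items():
        if len(blocks) < min_blocks:
            continue
        rows = [r for r, _ in blocks]
        cols = [c for _, c in blocks]
        box = (min(rows), min(cols), max(rows), max(cols))
        area = (box[2] - box[0] + 1) * (box[3] - box[1] + 1)
        density = len(blocks) / area  # 1.0 means the box is fully covered
        if best is None or density > best[0]:
            best = (density, key, box)
    return None if best is None else (best[1], best[2])
```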
Japanese Patent Application Laid-Open No. 2006-139382 discusses a technique that analyzes video data obtained as a search result under an analysis rule corresponding to a search condition, and determines display contents as the search result based on a display rule selected in connection with a display pattern and the analysis result.
However, in the method discussed in U.S. Patent Publication No. 2009/0052734, video data having matching metadata is simply inserted into the material frame of the template file. Therefore, the connection between the inserted pieces of video data may not be preferable. For example, the position of an object may suddenly jump between the pieces of video data, or the movement direction of the object may suddenly reverse.
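The two connection problems named above, a position jump and a reversal of movement direction, can be made concrete with a small check. The sketch below is an assumption-laden illustration and not part of any cited method: it supposes that the object position (normalized to the frame size) and the object velocity are known at the last frame of one cut and the first frame of the next, and the names CutEdge and connection_problems and the threshold max_jump are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class CutEdge:
    x: float   # object position, normalized to [0, 1] of frame width
    y: float   # object position, normalized to [0, 1] of frame height
    vx: float  # object velocity, horizontal component
    vy: float  # object velocity, vertical component


def connection_problems(out_edge, in_edge, max_jump=0.3):
    """Compare the end of one cut with the start of the next and list
    the problems found. max_jump is an arbitrary illustrative threshold."""
    problems = []
    jump = math.hypot(in_edge.x - out_edge.x, in_edge.y - out_edge.y)
    if jump > max_jump:
        problems.append("position_jump")
    # A negative dot product means the two velocities point in
    # roughly opposite directions.
    if out_edge.vx * in_edge.vx + out_edge.vy * in_edge.vy < 0:
        problems.append("direction_reversal")
    return problems
```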
According to the method discussed in U.S. Patent Publication No. 2008/0123966, although similar scenes are connected based on the motion vector, the object position cannot be prevented from jumping between the pieces of video data. Even when similar scenes are connected, the connection does not necessarily make sense, so that an unnatural video image may be generated.
With the method discussed in Japanese Patent Application Laid-Open No. 2006-139382, although the display pattern is used so that the connection of the video data makes sense, the object position may still jump. In addition, since the analysis is performed in units of video data, individual cuts cannot be connected so that they make sense. An example of such a connection is a cut of the object at which a line of sight is directed, placed immediately after the cut in which the line of sight is directed toward that object. A video image having an unnatural connection confuses the viewer, and the viewer cannot understand its contents.