American football and soccer are competitive sports, especially popular in Europe and the United States.
In American football and soccer, it is common practice to analyze video captured during a game and either feed the results of the analysis back into practice or the next game, or create a highlight video.
However, an actual game contains many periods that are of little importance for game analysis, and retrieving the necessary parts from a long game video is very time-consuming.
An American football game alternates between periods in which the offensive and defensive actions called a “down” are performed (hereinafter referred to as “play”) and periods in which no such actions are performed. In other words, the periods of high importance for the analysis of an American football game are the sections of play. Accordingly, it is desirable to be able to extract, efficiently and accurately, at least one of a start point and an end point of a section of play from video data obtained by capturing an American football game.
In recent years, research on the analysis of video obtained by capturing sports games (hereinafter referred to as “sports video”) has been actively conducted.
Technologies related to the analysis of sports video include a video summarizing method that extracts important sections from a long game video and automatically creates a highlight video, and a tactic analysis method that analyzes the tactics and attack patterns of each team by recognizing formations. Further, to realize such methods, video analysis methods have also been actively researched that accurately extract information about players or a ball from video data while taking into account occlusion of players by other players and changes in illumination conditions.
For example, one conventionally proposed video summarizing method extracts the start point of a play in an American football game based on features such as the color (hue, saturation, brightness, or the like) of the video and the camera work (for example, see PTL 1). There is also a method of creating a highlight video by calculating a degree of importance in a sports video from the content of posts on Twitter (registered trademark) or the number of posts within a fixed time, and determining a key frame (see NPL 1).
Further, conventionally proposed tactic analysis methods include a play analysis method that records the behavior of a player during a game (for example, see NPL 2), and a tactic analysis method that records the behaviors of all players of a team (for example, see NPL 3). Other examples include replaying a highlight scene and creating a video of a certain player at a start point. In addition, there is a formation recognition method that classifies a formation type by automatically detecting a scrimmage line, which is an initial formation, from video obtained by capturing an American football game (for example, see NPL 6).
It is therefore conceivable to extract important parts of a game from a video of an American football game by using these related arts.
However, in the method described in PTL 1, accuracy may decrease depending on the color environment of the video and the camera work. The method described in NPL 1 requires media information other than the sports video itself, namely posts on Twitter (registered trademark), and is therefore applicable only to large-scale broadcast video such as terrestrial broadcasts. The methods described in NPL 2 and NPL 3 require video from a plurality of cameras, or manual detection or tracking of the players and the ball. Further, the method described in NPL 4 extracts only information about the initial formation, which is relatively easy to detect, and therefore does not provide enough information for tactical analysis.
That is, even with the related arts, it is difficult to extract a play section from a video obtained by capturing a sports game efficiently and with high precision.
An object of the present technology is to provide a video processing method and a video processing device capable of extracting a play section from a video obtained by capturing a sports game efficiently and with high precision.