Systems and methods have been developed for defining an object in video and for tracking that object through the frames of the video. In various applications, a person may be the “object” to be tracked. For example, in sports video there is interest in following the actions of persons such as the players and/or the referees.
Players and referees are displayed in sports videos. They can be localized and labeled in IPTV systems so that a regular TV broadcast (MPEG-2/-4) is augmented with additional information (MPEG-7 encoded) that defines those objects in the video, along with additional content to be displayed when they are selected. Specification of objects with additional content (metadata) is usually implemented by an authoring tool that includes such functions as extraction of shots and key frames, specification of the interactive regions, and tracking of the specified regions to get the region locations in all frames.
Interactive services based on team classification, in which a viewer clicks a player in hypervideo or iTV, have been discussed. Team information search and retrieval and team data (statistical results, articles and other media) can be linked, assuming the player can be localized by the interactive service system. Methods for locating the players/referees can be split into two groups. The first group makes use of fixed cameras (usually calibrated in advance) in a controlled environment, while the second group uses only regular broadcast video. While the former can provide better performance, the latter is more flexible. In the second group, some approaches try to overcome difficulties by finding the playfield first, using color segmentation and post-processing with morphological operations and connected component analysis, in order to limit the search area.
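By way of illustration only, the playfield-first idea can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the method of any particular prior system: the green-dominance thresholds, the function names (`is_field_color`, `segment`, `connected_components`, `playfield_search_area`) and the toy frame are all hypothetical, and morphological filtering is omitted for brevity. The sketch segments field-colored pixels, labels connected regions, and takes the bounding box of the largest region as the limited search area.

```python
from collections import deque


def is_field_color(pixel):
    """Assumed rule: a pixel is field-like when green clearly dominates."""
    r, g, b = pixel
    return g > 100 and g > r + 30 and g > b + 30


def segment(frame):
    """Color segmentation: binary mask of field-colored pixels."""
    return [[is_field_color(p) for p in row] for row in frame]


def connected_components(mask):
    """Label 4-connected regions of True pixels by breadth-first search."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for y in range(h):
        for x in range(w):
            if not mask[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue, comp = deque([(y, x)]), []
            while queue:
                cy, cx = queue.popleft()
                comp.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(comp)
    return components


def playfield_search_area(frame):
    """Bounding box (top, left, bottom, right) of the largest field region, or None."""
    comps = connected_components(segment(frame))
    if not comps:
        return None
    largest = max(comps, key=len)
    ys = [y for y, _ in largest]
    xs = [x for _, x in largest]
    return (min(ys), min(xs), max(ys), max(xs))


# Toy 5x6 RGB frame (hypothetical data): G = grass, S = stands, P = player.
G, S, P = (30, 160, 40), (120, 120, 120), (200, 30, 30)
frame = [
    [S, S, S, S, S, G],
    [S, G, G, G, G, S],
    [S, G, P, G, G, S],
    [S, G, G, G, G, S],
    [S, S, S, S, S, S],
]
print(playfield_search_area(frame))  # (1, 1, 3, 4)
```

In this toy frame the isolated green pixel in the top-right corner forms its own small component and is discarded, while the player pixel, although not field-colored, falls inside the returned bounding box, which is where a subsequent player search would be confined.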