Field of the Disclosure
The present disclosure relates to an information processing technique for tracking an object in an image.
Description of the Related Art
Methods and apparatuses are known for tracking an object in continuous images or a moving image (hereinafter, such images will be referred to simply as an “image”) by using the image. The conventional methods include a tracking method that uses image features. According to this tracking method, the position of an object to be tracked is identified by performing movement prediction of the object and matching processing that compares features of the object with visual features of an input image.
Such a tracking method cannot continue the tracking processing and ends tracking if the object to be tracked goes out of the image frame. The object can also become impossible to image while still within the image frame, for example if it hides behind a wall or goes out through a door, in which case the object disappears from the captured image. In such a case of disappearance within the image frame, the reliability of matching of image features typically drops. If the reliability falls to or below a certain value, the tracking is ended.
Concerning such a tracking end determination, Japanese Patent Application Laid-Open No. 2013-25490 discusses a tracking method for associating tracking information about an object and a detection result of the object between image frames, the tracking method including determining whether to continue, suspend, or end tracking. According to such a method, a degree of correspondence is calculated when the detection result of the object is associated with the tracking information. The value of the degree of correspondence makes it possible to determine whether to continue tracking. More specifically, various determinations and processes are performed depending on the value of the degree of correspondence. Possible determination processes include ones for (1) continuing to track the tracking object, (2) temporarily suspending tracking because the tracking object is temporarily hidden, and (3) ending the tracking processing because the tracking information and the detected object do not sufficiently correspond to each other.
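The three-way determination described above can be sketched roughly as follows. This is only an illustrative sketch: the function name `decide_tracking`, the two threshold parameters and their default values, and the assumption that the degree of correspondence lies in the range 0 to 1 are all hypothetical and are not taken from the cited publication.

```python
from enum import Enum


class TrackingDecision(Enum):
    CONTINUE = "continue"  # (1) keep tracking the object
    SUSPEND = "suspend"    # (2) object assumed to be temporarily hidden
    END = "end"            # (3) correspondence is insufficient


def decide_tracking(correspondence: float,
                    continue_threshold: float = 0.6,
                    suspend_threshold: float = 0.3) -> TrackingDecision:
    """Map a degree of correspondence to a tracking decision.

    Both thresholds are hypothetical values chosen for illustration.
    """
    if correspondence >= continue_threshold:
        return TrackingDecision.CONTINUE
    if correspondence >= suspend_threshold:
        return TrackingDecision.SUSPEND
    return TrackingDecision.END
```

With these assumed thresholds, a correspondence of 0.9 would continue tracking, 0.4 would suspend it, and 0.1 would end it.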
With the typical technique for determining an end of tracking by using the reliability of matching, the reliability may not fall quickly when the target object disappears. Because the reliability does not fall, the tracking continues, causing a phenomenon called “drift” in which the tracking object continues to be searched for in a position where there is no object.
FIGS. 10A to 10D illustrate such a situation, showing transitions as time elapses in the order of FIGS. 10A, 10B, 10C, and 10D. FIG. 10A illustrates a tracking object 1001, such as a person, and a tracking region 1003 around the tracking object 1001. The tracking region 1003 is a rectangular region including the person serving as the tracking object 1001. As time elapses from FIG. 10A to FIG. 10B and then to FIG. 10C, the tracking object 1001 partly hides behind an obstacle 1002. In FIG. 10D, the entire tracking object 1001 hides behind the obstacle 1002. Suppose that the obstacle 1002 and the tracking object 1001 do not differ greatly in terms of image features, for example when the tracking object 1001 is a person dressed in white and the obstacle 1002 is a white wall. In such a case, as described above, the reliability of matching does not fall quickly and the tracking continues, so that the tracking region 1003 still remains in FIG. 10D. Such a phenomenon is called “drift.” Because of the disappearance of the tracking object 1001 from the image frame, the tracking is supposed to be suspended or ended. However, the occurrence of the foregoing drift delays the determination to suspend or end the tracking processing. In FIG. 10D, since the tracking object 1001 is not captured in the image, the tracking region 1003 does not indicate the position of the tracking object 1001; instead, the tracking region 1003 is located in the position where the image features are closest to those of the tracking object 1001.
The method discussed in the foregoing Japanese Patent Application Laid-Open No. 2013-25490 associates the tracking region of an object and the detection result of the object between frames. The tracking accuracy therefore depends on the detection result of the object. Possible methods for object detection include a method for detecting a moving object from a difference between a captured image of the background and an image of the frame of interest, and a method for learning an image of the object to be detected to generate an identifier and performing detection by using the identifier. According to the method using a difference from the background image, the detected object, in some frames, may be segmented due to discontinuous pixels. The detected object may be connected with another moving object to increase in size. In such cases, the degree of correspondence between the tracking information so far and object detection information can drop significantly.
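The segmentation problem of the background difference method can be illustrated with a small synthetic example. The following is a minimal sketch in plain Python, not an implementation from the cited publication: the function names, the threshold of 30, the 4-connectivity, and the pixel values are all assumptions chosen for illustration.

```python
def difference_mask(background, frame, threshold=30):
    """Mark pixels whose absolute difference from the background exceeds threshold."""
    return [[abs(f - b) > threshold for f, b in zip(frame_row, bg_row)]
            for frame_row, bg_row in zip(frame, background)]


def count_regions(mask):
    """Count 4-connected foreground regions using an iterative flood fill."""
    height, width = len(mask), len(mask[0])
    visited = [[False] * width for _ in range(height)]
    regions = 0
    for y in range(height):
        for x in range(width):
            if not mask[y][x] or visited[y][x]:
                continue
            regions += 1
            visited[y][x] = True
            stack = [(y, x)]
            while stack:
                cy, cx = stack.pop()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                               (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] and not visited[ny][nx]):
                        visited[ny][nx] = True
                        stack.append((ny, nx))
    return regions


# Synthetic 10x8 grayscale scene with a uniform background of intensity 100.
background = [[100] * 8 for _ in range(10)]
frame = [row[:] for row in background]
for y in range(2, 8):
    for x in range(3, 5):
        # Rows 4 and 5 (the "torso") are close to the background intensity,
        # so they fall below the threshold and split the detection.
        frame[y][x] = 110 if y in (4, 5) else 200

mask = difference_mask(background, frame)
print(count_regions(mask))  # prints 2: one object is detected as two regions
```

In this example a single object is reported as two separate regions because its middle part resembles the background, which is the kind of segmentation that lowers the degree of correspondence with the tracking information.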
FIGS. 11A to 11D illustrate such a situation, showing transitions as time elapses in the order of FIGS. 11A, 11B, 11C, and 11D. FIG. 11A illustrates a tracking object 1101, such as a person, and a tracking region 1103 (in FIG. 11A, a rectangular region) around the tracking object 1101. In FIG. 11B, hatched rectangles represent pieces of difference information 1104 relative to the background. According to the technique using a difference from the background, the difference information 1104 serving as portions to be detected does not necessarily coincide exactly with the tracking object 1101 but may overlap it only in part. As illustrated in FIG. 11B, the pieces of difference information 1104 may be detected as a plurality of separate objects. Such a phenomenon can occur, for example, if the person to be tracked blends into the background. This results in insufficient correspondence between the detected plurality of objects and the tracking object 1101. As a result, the tracking processing can be suspended or ended by mistake even though the tracking object 1101 is still present in the image frame of FIG. 11B. In another example, in FIG. 11C, the tracking object 1101 is almost entirely hidden behind an obstacle 1102. The difference information 1104 is still detected because part of the tracking object 1101 is included in the image frame. In FIG. 11D, the entire tracking object 1101 hides behind the obstacle 1102, and no difference information 1104 is detected. In such an example, the tracking processing is supposed to be suspended or ended at the timing of FIG. 11D, where the difference information 1104 is not detected, whereas the tracking processing can actually be stopped at the timing of FIG. 11B. In other words, the tracking processing can be suspended or ended too early.
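The drop in correspondence caused by a fragmented detection can be made concrete with a simple overlap measure. The sketch below uses intersection over union (IoU) of axis-aligned rectangles as a stand-in for the degree of correspondence. That choice is an assumption made for illustration, as are the function name `iou`, the `(x1, y1, x2, y2)` box format, and the coordinates; the cited publication is not stated here to use this measure.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    # Width and height of the overlapping rectangle (zero if disjoint).
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


# Hypothetical tracking region covering a whole person.
tracking_region = (0, 0, 10, 20)

# Detection that covers the whole person versus one fragment (upper part only).
whole_detection = (0, 0, 10, 20)
fragment = (0, 0, 10, 5)

print(iou(tracking_region, whole_detection))  # 1.0
print(iou(tracking_region, fragment))         # 0.25
```

Under this assumed measure, each fragment taken on its own overlaps the tracking region only weakly, so a threshold-based determination could suspend or end tracking even though the object is still in the frame.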
For the foregoing reasons, according to the conventional techniques, a determination to suspend or end the tracking processing can be delayed by the occurrence of a drift if the tracking object in the image disappears. Moreover, an erroneous determination to suspend or end the tracking processing can be made because of poor correspondence between the tracking object and the past tracking information. This has resulted in low accuracy of the determination to continue the tracking processing.