Techniques for tracking a person using video taken by a surveillance camera have been disclosed in recent years. As one example of a person tracking method, Patent Literature 1 discloses a method of tracking a person based on a color feature of the person.
FIG. 9 shows an exemplary embodiment of the person tracking system disclosed in Patent Literature 1. The person tracking system includes a person region extraction means 1, a voxel generation means 2, a person color feature extraction means 3, and a person tracking means 4.
The person region extraction means 1 extracts a person region from a surveillance video and outputs a person region extraction result to the voxel generation means 2. The voxel generation means 2 generates voxel information from the person region extraction result output from the person region extraction means 1 and outputs the generated voxel information to the person color feature extraction means 3. The person color feature extraction means 3 extracts a person color feature from the surveillance video and the voxel information output from the voxel generation means 2, and outputs the extracted person color feature to the person tracking means 4. The person tracking means 4 tracks a person using the person color feature output from the person color feature extraction means 3 and outputs a person tracking result.
The operation of the person tracking system shown in FIG. 9 is described in detail below.
The person region extraction means 1 extracts a person region from a surveillance video input from a camera using a background subtraction method. Then, the person region extraction means 1 outputs the extracted person region extraction result to the voxel generation means 2.
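The background subtraction step above can be illustrated with a minimal sketch. It assumes a static background image, RGB pixels stored as tuples, and a fixed threshold; the function name extract_person_region and the threshold value are hypothetical and are not taken from Patent Literature 1.

```python
def extract_person_region(frame, background, threshold=30):
    """Return a binary mask: 1 where the frame differs from the background."""
    mask = []
    for frame_row, bg_row in zip(frame, background):
        mask_row = []
        for pixel, bg_pixel in zip(frame_row, bg_row):
            # Sum of absolute per-channel differences between frame and background.
            diff = sum(abs(c - b) for c, b in zip(pixel, bg_pixel))
            # Assumed rule: a pixel is foreground when the difference exceeds the threshold.
            mask_row.append(1 if diff > threshold else 0)
        mask.append(mask_row)
    return mask


# A 3x4 uniform background and a frame in which one pixel has changed.
background = [[(10, 10, 10)] * 4 for _ in range(3)]
frame = [row[:] for row in background]
frame[1][2] = (200, 50, 50)

mask = extract_person_region(frame, background)
```

In this sketch only the changed pixel is marked as belonging to the person region. A practical system would typically also update the background model over time and suppress noise.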
The voxel generation means 2 generates voxels based on the input person region extraction results, which are acquired from a plurality of cameras. The voxel generation means 2 projects the input person region extraction results onto the three-dimensional space using a volume intersection method and thereby generates voxels that represent the position of a person in the space. The voxel generation means 2 outputs the generated voxels to the person color feature extraction means 3.
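The volume intersection idea can be sketched as follows: a voxel is kept only if its projection falls inside the person region in every camera view. The sketch assumes simplified orthographic cameras, each given as a hypothetical projection function paired with a binary mask; real cameras would use calibrated perspective projection, and the function names are illustrative only.

```python
def _is_foreground(mask, pixel):
    """Check that a projected pixel lies inside the mask and is marked as person."""
    row, col = pixel
    in_bounds = 0 <= row < len(mask) and 0 <= col < len(mask[0])
    return in_bounds and mask[row][col] == 1


def carve_voxels(grid_size, cameras):
    """Keep the voxels whose projections are foreground in all camera views."""
    nx, ny, nz = grid_size
    voxels = set()
    for x in range(nx):
        for y in range(ny):
            for z in range(nz):
                voxel = (x, y, z)
                if all(_is_foreground(mask, project(voxel)) for project, mask in cameras):
                    voxels.add(voxel)
    return voxels


# Assumed orthographic views: the front camera sees (z, x), the side camera sees (z, y).
front_mask = [[0, 1, 0], [0, 1, 0]]  # person at x = 1
side_mask = [[0, 0, 1], [0, 0, 1]]   # person at y = 2
cameras = [
    (lambda v: (v[2], v[0]), front_mask),
    (lambda v: (v[2], v[1]), side_mask),
]

voxels = carve_voxels((3, 3, 2), cameras)
```

Here the two silhouettes intersect in a single column of voxels at x = 1, y = 2, which represents the position of the person in the space.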
The person color feature extraction means 3 acquires the distribution of colors of a person from head to toe in the vertical direction as a person color feature based on the generated voxels and the surveillance camera video. Specifically, the person color feature extraction means 3 calculates the average of colors for each height of the voxels, normalizes the result by height, and thereby calculates the person color feature. Although the color feature is basically determined by the color of the clothes the person is wearing, the feature uses the average of the colors over all directions at the same height. The person color feature extraction means 3 thereby extracts a color feature that is robust against variation in how the clothes look depending on the viewing direction.
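The per-height color averaging can be sketched as below. The sketch assumes that each voxel has already been assigned an RGB color from the surveillance video, and it interprets "normalizes the result by height" as resampling the person's height range into a fixed number of bins; that interpretation, the function name person_color_feature, and the parameter num_bins are assumptions rather than details confirmed by Patent Literature 1.

```python
def person_color_feature(voxel_colors, num_bins=4):
    """Average color per normalized height bin, from lowest voxel to highest."""
    heights = [z for (_, _, z) in voxel_colors]
    z_min, z_max = min(heights), max(heights)
    span = (z_max - z_min) + 1  # number of voxel layers occupied by the person

    sums = [[0.0, 0.0, 0.0] for _ in range(num_bins)]
    counts = [0] * num_bins
    for (_, _, z), color in voxel_colors.items():
        # Map the voxel height to a bin relative to the person's own height.
        bin_index = (z - z_min) * num_bins // span
        for channel in range(3):
            sums[bin_index][channel] += color[channel]
        counts[bin_index] += 1

    # Average over all voxels (all directions) at the same normalized height.
    return [
        tuple(value / count for value in total) if count else None
        for total, count in zip(sums, counts)
    ]


# Hypothetical person: bluish voxels in the lower layer, reddish voxels in the upper layer.
voxel_colors = {
    (0, 0, 0): (0, 0, 100),
    (1, 0, 0): (0, 0, 200),
    (0, 0, 1): (255, 0, 0),
    (1, 0, 1): (245, 0, 0),
}

feature = person_color_feature(voxel_colors, num_bins=2)
```

Because the colors of all voxels at the same height are averaged regardless of which side of the person they lie on, the resulting feature does not depend on the direction from which the person is viewed. Bins that contain no voxels are returned as None in this sketch.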
The person tracking means 4 compares the obtained person color feature with person color features obtained in the past and thereby determines their similarity. In accordance with the determination result, the person tracking means 4 calculates the correspondence between the voxels calculated in the past and the voxels calculated most recently. Consequently, the person tracking means 4 calculates a person tracking result that associates the past person extraction result with the current extraction result.
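The comparison and association step can be sketched as follows. Patent Literature 1 as summarized here does not specify the similarity measure or the matching strategy, so the Euclidean distance, the greedy nearest-first matching, and the max_distance threshold used below are all assumptions, as are the function names.

```python
import math


def feature_distance(feature_a, feature_b):
    """Euclidean distance between two person color features (lower is more similar)."""
    total = 0.0
    for color_a, color_b in zip(feature_a, feature_b):
        total += sum((a - b) ** 2 for a, b in zip(color_a, color_b))
    return math.sqrt(total)


def associate(past_features, current_features, max_distance=100.0):
    """Greedily match each current detection to the most similar past person."""
    pairs = sorted(
        (feature_distance(past, cur), past_id, cur_id)
        for past_id, past in past_features.items()
        for cur_id, cur in current_features.items()
    )
    matches, used_past = {}, set()
    for distance, past_id, cur_id in pairs:
        if distance > max_distance:
            break  # remaining pairs are too dissimilar to be the same person
        if past_id in used_past or cur_id in matches:
            continue
        matches[cur_id] = past_id
        used_past.add(past_id)
    return matches


# Hypothetical features: two height bins per person, one RGB average per bin.
past_features = {
    "A": [(0, 0, 150), (250, 0, 0)],
    "B": [(20, 20, 20), (0, 200, 0)],
}
current_features = {
    0: [(5, 5, 25), (10, 190, 5)],
    1: [(0, 0, 140), (240, 10, 0)],
}

matches = associate(past_features, current_features)
```

In this example current detection 1 is associated with past person "A" and detection 0 with past person "B". The sketch assumes every height bin of every feature holds a color; features with empty bins would need separate handling.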