Public venues such as shopping centres, parking lots and train stations are increasingly subject to surveillance by large-scale networks of video cameras. Application domains of large-scale video surveillance include security, safety, traffic management and business analytics. In one example application, a pan, tilt and zoom camera, Camera A, tracks a target object on site. When the target object is about to move out of the physical viewing limit of Camera A, another camera in the same network, Camera B, is assigned responsibility for tracking the object. The change in responsibility from Camera A to Camera B is referred to as a “handoff” process. The handoff process is used with cameras that have overlapping fields of view. In handoff, a key task is to perform rapid and robust object matching given images from two overlapping camera views. If the fields of view of Cameras A and B do not overlap, either spatially or temporally, a similar process called “object re-identification” may be performed.
Object matching across different camera viewpoints is difficult. Different cameras operate under different lighting conditions. Different objects may have a similar visual appearance, and the same object (e.g., a person or a subject) can have different poses and postures across viewpoints.
One image processing method performs appearance-based object matching. Appearance-based object matching involves first determining features of a target object from a first view, then determining the same type of features of a candidate object from a second view, and then computing the difference between the two sets of features. If the difference is smaller than a threshold, the target object and the candidate object are said to match. Otherwise, the target object and the candidate object do not match.
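The threshold test described above can be illustrated with a minimal sketch. The choice of feature (a coarse joint RGB histogram), the L1 distance, the threshold value of 0.5 and all function names are assumptions made for illustration only; the text does not specify any of them.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each in 0..255


def colour_histogram(pixels: Sequence[Pixel], bins_per_channel: int = 4) -> List[float]:
    """Normalised joint RGB histogram (an assumed feature type)."""
    hist = [0.0] * (bins_per_channel ** 3)
    step = 256 // bins_per_channel
    for r, g, b in pixels:
        idx = ((r // step) * bins_per_channel ** 2
               + (g // step) * bins_per_channel
               + (b // step))
        hist[idx] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


def feature_difference(a: Sequence[float], b: Sequence[float]) -> float:
    """L1 distance between two feature vectors (an assumed difference measure)."""
    return sum(abs(x - y) for x, y in zip(a, b))


def objects_match(target_pixels: Sequence[Pixel],
                  candidate_pixels: Sequence[Pixel],
                  threshold: float = 0.5) -> bool:
    """Declare a match when the feature difference is below the threshold."""
    target_features = colour_histogram(target_pixels)        # first view
    candidate_features = colour_histogram(candidate_pixels)  # second view
    return feature_difference(target_features, candidate_features) < threshold
```

In this sketch, two mostly red objects seen from different views fall into the same histogram bins and are declared a match, while a red object and a blue object are not. In practice the threshold would have to be tuned to the cameras and lighting conditions involved.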
One type of feature used in appearance-based matching is the vertical symmetry of a foreground object such as an upright person. Methods that build on such symmetry features exploit the fact that foreground objects naturally possess vertical symmetry. In one such method, a foreground object mask is first determined by segmenting the foreground object from the background. Next, based on the foreground object mask, the foreground object is dissected vertically into head, torso and legs sections. Then, a symmetry axis for each section is found, resulting in a number of disjoint vertical symmetry axes. Finally, local features such as colour histograms, blob regions and texture features around the symmetry axes from different viewpoints are compared, and a matching decision is made based on the difference. One problem with the appearance-based matching method described above is that the method requires a foreground mask of the object within a bounding box. Hence the accuracy of finding a medial axis depends on the accuracy of foreground segmentation.
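The mask-based steps described above can be sketched as follows. This is an illustrative sketch only: the fixed head and torso proportions, the mirror-mismatch cost used to pick each axis, and the function names are all assumptions, since the text does not say how the sections are divided or how each axis is found.

```python
from typing import List, Tuple

Mask = List[List[int]]  # binary foreground mask: 1 = foreground, 0 = background


def split_sections(mask: Mask, head_frac: float = 0.2,
                   torso_frac: float = 0.4) -> Tuple[Mask, Mask, Mask]:
    """Cut the mask into head, torso and legs by fixed height fractions
    (assumed proportions). The mask needs at least three rows."""
    height = len(mask)
    head_end = max(1, round(height * head_frac))
    torso_end = max(head_end + 1, round(height * (head_frac + torso_frac)))
    return mask[:head_end], mask[head_end:torso_end], mask[torso_end:]


def symmetry_axis(section: Mask) -> int:
    """Return the column about which the section is most mirror-symmetric.

    The cost of a column is the number of mirrored pixel pairs that disagree;
    pixels outside the mask count as background. Ties go to the leftmost column,
    so a section with no foreground returns column 0.
    """
    width = len(section[0])
    best_x, best_cost = 0, float("inf")
    for x in range(width):
        cost = 0
        for row in section:
            for d in range(1, width):
                left = row[x - d] if x - d >= 0 else 0
                right = row[x + d] if x + d < width else 0
                cost += left != right
        if cost < best_cost:
            best_x, best_cost = x, cost
    return best_x


def section_axes(mask: Mask) -> List[int]:
    """One symmetry axis per section: [head, torso, legs]."""
    return [symmetry_axis(section) for section in split_sections(mask)]
```

Because each section is processed independently, a head that is offset from the torso yields a different axis column, which is how the disjoint axes mentioned above arise. The sketch also makes the dependence on segmentation concrete: every axis is computed from mask pixels alone, so errors in the mask propagate directly into the axes.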
In addition, determining a foreground mask may be expensive in both computation and memory, which may prevent the appearance-based matching method from being used in a resource-limited hardware environment. Another problem with the appearance-based matching method described above is that the vertical symmetry axes are broken into disjoint paths. The vertical symmetry axes can only be detected reliably when the foreground object has a reasonably upright posture. The vertical symmetry axes cannot be detected correctly when the foreground object is tilted or foreshortened with respect to a camera sensor.
Another method exploits the symmetrical nature of foreground objects by determining the overall symmetrical structure of the objects (e.g., vehicles). In such a method, edge points are first determined by performing a horizontal scan of an entire image. Next, a histogram of midpoint locations of all edge-point pairs is determined. Then, the object midpoint is determined from a global peak of the histogram. Finally, a bounding box containing all edge-point pairs sharing the common object midpoint is determined. The determined bounding box and corresponding edge points can subsequently be used for matching purposes. One problem is that the overall symmetrical structure method may not work for non-rigid objects such as human beings. Yet another problem with known image processing methods is that they do not provide a confidence measure for the estimated symmetry axes for later matching purposes.
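The four steps of the overall symmetrical structure method can be sketched as below. The edge detector (a simple intensity jump between horizontal neighbours), the `min_step` value, the pairing of edge points within the same row, and the function names are assumptions made for illustration; the text does not specify them.

```python
from collections import Counter
from typing import List, Optional, Tuple

Image = List[List[int]]           # greyscale image as rows of intensities
Box = Tuple[int, int, int, int]   # (left, top, right, bottom)


def edge_points_per_row(image: Image, min_step: int = 50) -> List[List[int]]:
    """Horizontal scan: column x is an edge point when the intensity changes
    by at least min_step between columns x - 1 and x (assumed detector)."""
    return [[x for x in range(1, len(row)) if abs(row[x] - row[x - 1]) >= min_step]
            for row in image]


def symmetric_bounding_box(image: Image,
                           min_step: int = 50) -> Optional[Tuple[float, Box]]:
    """Return (object midpoint column, bounding box), or None if no pairs exist."""
    votes = Counter()   # histogram of doubled midpoints, kept as integers
    pairs = []          # (row, left edge, right edge, doubled midpoint)
    for y, xs in enumerate(edge_points_per_row(image, min_step)):
        for i in range(len(xs)):
            for j in range(i + 1, len(xs)):
                doubled_mid = xs[i] + xs[j]
                votes[doubled_mid] += 1
                pairs.append((y, xs[i], xs[j], doubled_mid))
    if not votes:
        return None
    peak = votes.most_common(1)[0][0]              # global peak of the histogram
    support = [p for p in pairs if p[3] == peak]   # pairs sharing that midpoint
    box = (min(p[1] for p in support), min(p[0] for p in support),
           max(p[2] for p in support), max(p[0] for p in support))
    return peak / 2, box
```

For a bright rectangle occupying columns 3 to 6 of rows 1 and 2 on a dark background, the edge points fall at columns 3 and 7 in both rows, the histogram peaks at midpoint 5.0, and the resulting box is (3, 1, 7, 2). The sketch relies on every row of the object voting for the same midpoint, which is why such a method suits rigid, symmetric objects and may not work for an articulated object such as a walking person.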