The proliferation of traffic and surveillance cameras and the increasing need for automated video analytics technologies have brought the topic of object tracking to the forefront of computer vision research. Real-world scenarios present a wide variety of challenges to existing object tracking algorithms, including occlusions, changes in scene illumination conditions and object appearance (color, shape, silhouette, salient features, etc.), as well as camera shake. While significant research efforts have been devoted to solving the general problem of robustly tracking groups of objects under a wide range of conditions, the environments encountered in traffic and surveillance situations are typically limited in scope with respect to the directions and speeds at which objects move. Examples of implementations that rely on robust object tracking include video-based parking management, video-based vehicle speed estimation, measurement of total experience time in retail spaces, and the like.
One example of such a challenging scenario is the use of a fisheye camera to determine the “total experience time” of a vehicle in a drive-thru setting. A fisheye camera uses an ultra-wide-angle lens that produces a hemispheric view of a scene; the lens has a shape and index of refraction that capture all light forward of the camera and focus it on the CCD chip. Two key issues that affect the performance of appearance-based object tracking in video streams are (i) change in the apparent size of an object due to perspective and/or distortion, and (ii) change in the appearance of an object due to its orientation relative to the camera. For example, due to the projective nature of a camera, objects farther away from the camera appear smaller than objects closer by; this applies to both rectilinear and fisheye lens cameras. In addition, fisheye lenses usually introduce extreme barrel distortion in order to achieve wide angles of view. Barrel distortion results in spatially varying image magnification, wherein the degree of magnification decreases as an object's distance from the camera's optical axis increases. As another example, objects that are longer along one dimension than along others and that change orientation as they traverse the field of view of the camera are perceived to go through changes in aspect ratio, even in the absence of lens distortion.
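The two effects described above can be illustrated with a minimal sketch. It assumes an ideal pinhole (rectilinear) camera with no lens distortion and an overhead view of a rectangular vehicle footprint; the function names, the focal length in pixels, and the vehicle dimensions are hypothetical values chosen only for illustration.

```python
import math

def apparent_size_px(focal_px, size_m, depth_m):
    """Projected size in pixels under an assumed ideal pinhole model."""
    return focal_px * size_m / depth_m

def bbox_aspect_ratio(length, width, heading_rad):
    """Width/height ratio of the axis-aligned box around a rotated rectangle."""
    c = abs(math.cos(heading_rad))
    s = abs(math.sin(heading_rad))
    box_w = length * c + width * s   # horizontal extent of the rotated footprint
    box_h = length * s + width * c   # vertical extent of the rotated footprint
    return box_w / box_h

# Hypothetical vehicle: 4.5 m long, 1.8 m wide; hypothetical focal length of 800 px.
near = apparent_size_px(800.0, 4.5, 10.0)   # vehicle 10 m from the camera
far = apparent_size_px(800.0, 4.5, 20.0)    # same vehicle 20 m from the camera
ratios = [bbox_aspect_ratio(4.5, 1.8, math.radians(d)) for d in (0, 45, 90)]
```

Under these assumptions, doubling the distance halves the projected size, and the bounding-box aspect ratio moves from 2.5 through roughly 1 to roughly 0.4 as the vehicle turns by 90 degrees, so a tracker with a fixed-size, fixed-shape template would gradually lose alignment with the object.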
While fisheye distortion is an extreme case of barrel distortion, usually associated with wide-angle imaging systems, other types of distortion also occur in imaging systems. For instance, telephoto lenses often exhibit pincushion distortion, where magnification increases with distance from the optical axis. A zoom lens, such as those used in common PTZ (Pan-Tilt-Zoom) surveillance systems, can operate along a continuum from wide angle to normal (rectilinear) to telephoto, and exhibits the distortion corresponding to each setting. Anamorphic optical systems may be used to form a panoramic view of a scene, in which case the distortion differs between perpendicular directions.
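The difference between barrel and pincushion distortion can be sketched with a single-coefficient polynomial radial model, r_d = r_u (1 + k1 r_u^2). This model is an assumption made here for illustration: it is a common approximation for mild lens distortion and only a rough stand-in for a true fisheye lens. The function names and the coefficient values are hypothetical.

```python
def distort_radius(r_u, k1):
    """Map an undistorted radius to a distorted one: r_d = r_u * (1 + k1 * r_u**2)."""
    return r_u * (1.0 + k1 * r_u ** 2)

def radial_magnification(r_u, k1):
    """Local radial magnification d(r_d)/d(r_u) = 1 + 3 * k1 * r_u**2."""
    return 1.0 + 3.0 * k1 * r_u ** 2

# Normalized distances from the optical axis (hypothetical sample points).
radii = [0.0, 0.25, 0.5]

# k1 < 0 models barrel distortion; k1 > 0 models pincushion distortion.
barrel = [radial_magnification(r, -0.2) for r in radii]
pincushion = [radial_magnification(r, 0.2) for r in radii]
```

In this sketch the barrel values fall below 1 as the radius grows, while the pincushion values rise above 1, so the same object would appear to shrink or grow as it moves away from the image center depending on the lens.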
Current attempts to estimate object size and orientation in addition to object location can be error-prone and may incur increased computational complexity because of the higher-dimensional optimization space that arises in the presence of projective and optically induced distortion.
Thus, it would be advantageous to provide an efficient system and method for video-based tracking of an object of interest that exploits the regularized conditions present in transportation scenarios to achieve robust and computationally efficient tracking that has object orientation and size awareness.