Object counting refers to counting the number of objects entering or leaving an area, or crossing a counting line, by means of specific devices. Its applications cover a wide range of environments, such as buildings, roads, shopping malls, and public transportation systems. Through object counting, the number of people or vehicles in a specific area may be tracked in a timely manner, for example to control the number of people entering and leaving a building, to manage road traffic, or to measure the utilization of public facilities. Common object counting methods include gate counters, infrared sensors, and video analysis.
In gate counter technology, an object passing through a gate pushes rotating railings that drive a counter. This technology may accurately count the objects passing through the gate; however, each object must slow down when passing through. Infrared sensor technology installs an infrared sensor at the lateral side of an entrance or exit, and estimates the number of objects from interruptions of the infrared beam as objects pass through. When objects enter or exit side by side, occlusion among the objects may induce a counting error. Video analysis technology uses video cameras to capture a counting area, and applies object detection and object tracking methods to label the coordinates of each object and determine whether its trajectory enters or leaves the area or crosses a counting line. During object detection, the counting may easily be affected by the light source; and when tracking multiple objects, situations such as object occlusion, object merging, or object separation may also easily lead to misjudgment.
Video-analysis-based techniques usually mount photographic devices above a scene, looking down to capture images, and then apply a variety of image recognition and processing technologies to achieve object counting. The area estimation method detects changed pixels in a video frame and labels the area where an object is located, then combines this with object tracking to determine when the object triggers a cross-line event, and estimates the number of objects by statistical analysis of the area the objects occupy. For example, one related technique tracks an object over a video frame and, when the object enters a counting area, counts the passing objects by combining the area of the object's motion pixels projected onto the image in both the X and Y directions.
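The area-based estimate described above can be sketched as follows. This is a minimal illustration only: the binary motion mask as input, the per-object width constant, and the function name are hypothetical assumptions, not details of the cited technique.

```python
import numpy as np

# Assumed average number of motion pixels one object occupies on the
# counting line (hypothetical calibration constant).
AVG_OBJECT_WIDTH = 12

def count_on_line(foreground_mask: np.ndarray, line_row: int) -> int:
    """Estimate how many objects straddle the counting line.

    foreground_mask: 2-D boolean array of motion (changed) pixels.
    line_row: row index of the counting line in the frame.
    """
    # Total motion pixels lying on the counting line.
    occupied = int(foreground_mask[line_row].sum())
    # Divide by the assumed per-object width and round to whole objects.
    return round(occupied / AVG_OBJECT_WIDTH)
```

In practice such a method calibrates the per-object area statistically from the scene rather than fixing it as a constant.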
Another related technique, as shown in FIG. 1, utilizes image preprocessing and feature extraction to divide a to-be-tested image 110 into an image with a plurality of grids 120, supplemented with a variety of machine learning methods to analyze the number of objects and the relationships among the grids. When an object crosses a base line, this technique determines whether there is any object in the image 120 according to the grid variation information, as shown by label 130, and estimates the object count. A technique of another related literature uses an algorithm to cut an object into multiple regions of approximately equal area to estimate the number of objects when the object crosses a base line.
Another technique, using a template matching method, defines an object template and uses a template matching scheme to superpose the template on the area where the object is located. It also tracks the moving trajectory of the object to determine its direction and whether a cross-line event has occurred, thereby achieving object counting. Another technique uses image edge information to establish a local pedestrian template; when a pedestrian enters the scene, it uses a similarity matching scheme to verify whether a pedestrian is present and counts the pedestrians. Another related technique performs object detection and tracking by using head shapes, such as round or oval, together with color information. Some techniques approximate a foreground area in an object frame with a polygonal template; others approximate an object block in an object frame with a convex polygon template.
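A bare-bones version of the template superposition step might look like the following sketch. The exhaustive sum-of-squared-differences search and the function name are assumptions made for illustration; they do not reproduce the matching scheme of any cited technique.

```python
import numpy as np

def match_template(image: np.ndarray, template: np.ndarray) -> tuple:
    """Slide the template over the image and return the (row, col)
    offset with the smallest sum of squared differences (best match)."""
    H, W = image.shape
    h, w = template.shape
    best_score, best_pos = None, (0, 0)
    for r in range(H - h + 1):
        for c in range(W - w + 1):
            patch = image[r:r + h, c:c + w]
            score = float(((patch - template) ** 2).sum())
            if best_score is None or score < best_score:
                best_score, best_pos = score, (r, c)
    return best_pos
```

Production systems typically use a normalized correlation measure instead of raw squared differences, so that matching is robust to lighting changes.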
In order to prevent the accuracy of object counting from being affected by changes in the apparent area of an object, some related techniques use a pre-trained object detection classifier to detect the portion of an image containing a specific object, such as a skin region, a head region, or a facial region. These techniques also combine object tracking and similarity matching to determine whether a cross-line event has been triggered, and count the objects.
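A classifier-driven detector of this kind can be sketched, under assumptions, as a sliding window whose patches are handed to a pre-trained predicate. The window size, stride, and all names here are hypothetical illustration choices.

```python
import numpy as np

def detect_objects(frame: np.ndarray, classifier, win: int = 8, step: int = 4):
    """Scan the frame with a sliding window; `classifier` is any
    pre-trained predicate mapping a win x win patch to True/False.
    Returns the top-left corners of windows classified as objects."""
    hits = []
    H, W = frame.shape
    for r in range(0, H - win + 1, step):
        for c in range(0, W - win + 1, step):
            if classifier(frame[r:r + win, c:c + win]):
                hits.append((r, c))
    return hits
```

The detections would then be associated frame to frame by similarity matching to decide whether a tracked object has crossed the counting line.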
A technique using an optical flow method performs a large number of operations to calculate the motion vectors between two images that characterize the objects' movement, and counts the objects using the speed and direction information contained in the motion vectors. For example, a related technique determines the number of pedestrians according to peaks of the optical flow. At the positions indicated by arrows 210, 220, and 230 in FIG. 2, there are three optical-flow peaks, so it is determined that three pedestrians passed through.
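The peak-counting idea illustrated by FIG. 2 can be made concrete with the following sketch. The 1-D flow-magnitude profile as input, the threshold, and the function name are hypothetical assumptions introduced only for illustration.

```python
def count_flow_peaks(flow_magnitude, threshold):
    """Count local maxima above `threshold` in a 1-D optical-flow
    magnitude profile taken along the counting line; each peak is
    treated as one pedestrian."""
    peaks = 0
    for i in range(1, len(flow_magnitude) - 1):
        v = flow_magnitude[i]
        if v > threshold and v >= flow_magnitude[i - 1] and v > flow_magnitude[i + 1]:
            peaks += 1
    return peaks
```

A profile with three peaks above the threshold, as in FIG. 2, would thus yield a count of three pedestrians.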
In order to effectively avoid the occlusion issue among pedestrians, some related techniques use multiple cameras at different angles to capture images. These techniques calculate the correspondence among pedestrians by using the geometric relationships of the cameras, and then estimate the direction and count the number of pedestrians crossing a base line. Some related techniques use dual cameras to obtain image depth information to determine the number of persons crossing a base line. Some commercial products use thermal images as the image source to improve the accuracy of object detection and tracking.
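One way the depth information could separate side-by-side persons is sketched below. Reading a depth profile along the counting line and segmenting it by a head-depth threshold are assumptions for illustration, not the method of any cited technique or product.

```python
def count_by_depth(depth_line, head_depth_max):
    """Count contiguous runs along the counting line whose depth is
    closer to the overhead camera than `head_depth_max`; each run is
    treated as one person's head crossing the line."""
    count, inside = 0, False
    for d in depth_line:
        if d < head_depth_max:
            if not inside:
                count += 1
                inside = True
        else:
            inside = False
    return count
```

Because heads sit well above the floor, the depth gap between adjacent heads lets this segmentation distinguish persons that a 2-D silhouette would merge.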
In the application of object counting technologies, it is well worth studying and developing methods that can perform object counting normally, and with high accuracy, in scenarios where many objects stand side by side or cross a base line in opposite directions, without using a specific object template, without requiring confirmed detection of independent objects in the image frame, and without complicated procedures for object labeling and tracking.