Computer vision is a field that includes methods for acquiring, processing, analyzing, and understanding images and, in general, high-dimensional data from the real world in order to produce numerical or symbolic information, e.g., in the form of decisions. One of the objectives of computer vision is to duplicate the abilities of human vision by electronically perceiving and understanding an image. This can be seen as the disentangling of symbolic information from image data using models constructed with the aid of geometry, physics, statistics, and machine learning theory. Computer vision has also been described as the venture of automating and integrating a wide range of processes and representations for visual perception.
Sub-domains of computer vision include scene reconstruction, event detection, video tracking, object recognition, machine learning, indexing, motion estimation, and image restoration.
Computer vision can be employed in road traffic surveillance, for example to estimate the dimensions of vehicles. Such estimates can be used to collect fees automatically, to identify oversize vehicles that exceed the allowable dimensions defined by law, or to identify vehicles that cannot enter certain areas, such as tunnels or passages under bridges.
U.S. Pat. No. 8,675,953 discloses an electronic device that determines a geometric scale of an object using two or more images of the object. During operation, the electronic device calculates the size of the object along a direction using multiple images of the object that were taken from different perspectives (such as different locations and/or orientations in an environment), along with associated imaging-device characteristics. For example, the size of the object may be calculated using the images, the associated focal lengths of a digital camera that acquired the images, and the law of cosines. Using the scale of the object, an image of the object may be appropriately scaled so that it can be combined with another image.
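The role of the focal length and the law of cosines in such a calculation can be illustrated with a minimal sketch. It is not the procedure of the cited patent: it assumes a pinhole camera, pixel coordinates measured relative to the principal point, a focal length expressed in pixels, and known distances from the camera to the two end points of the object. The function names are hypothetical.

```python
import math


def angle_between_pixels(p1, p2, focal_length):
    """Angle (radians) between the viewing rays through two image points.

    Assumes a pinhole camera; p1 and p2 are (x, y) pixel coordinates
    relative to the principal point, focal_length is in pixels.
    """
    r1 = (p1[0], p1[1], focal_length)
    r2 = (p2[0], p2[1], focal_length)
    dot = sum(a * b for a, b in zip(r1, r2))
    norm1 = math.sqrt(sum(a * a for a in r1))
    norm2 = math.sqrt(sum(b * b for b in r2))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / (norm1 * norm2)))
    return math.acos(cosine)


def size_by_law_of_cosines(dist1, dist2, angle):
    """Length of the side opposite `angle` in a triangle whose other
    two sides (camera to each end point) are dist1 and dist2."""
    return math.sqrt(dist1 ** 2 + dist2 ** 2
                     - 2.0 * dist1 * dist2 * math.cos(angle))


# Illustrative values only: end points seen at the image centre and
# 800 px to the right, with an assumed focal length of 800 px.
angle = angle_between_pixels((0.0, 0.0), (800.0, 0.0), 800.0)
size = size_by_law_of_cosines(3.0, 4.0, angle)
```

In this sketch the distances to the end points must come from elsewhere, for example from a second view; the sketch only shows how an angle derived from image coordinates and a focal length is combined with two distances to give a size.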
A publication “Vehicle Size and Orientation Estimation Using Geometric Fitting” (Christina Carlsson, Department of Electrical Engineering, Linköpings universitet, Linköping, Sweden (ISBN 91-7219-790-0)) discloses a vehicle size and orientation estimation process based on scanning laser radar data.
The Active Appearance Model (AAM) is a technique that fits a deformable model to an image of an object. It was originally developed for face detection, but the technique has been shown to be useful for various kinds of objects. The AAM consists of two parts: shape and appearance (texture). The shape is defined by a set of points which are grouped into multiple closed polygons, while the appearance (texture) consists of all pixels that lie inside the defined shape polygons.
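The two parts of the model can be pictured with a small data-structure sketch. It covers only the representation described above (points grouped into closed polygons, and the pixels that lie inside them), not model training or fitting. The names `Shape`, `point_in_polygon`, and `appearance_pixels` are hypothetical, and a pixel is assumed to belong to the appearance when its centre lies inside a polygon.

```python
from dataclasses import dataclass


@dataclass
class Shape:
    points: list    # [(x, y), ...] landmark coordinates
    polygons: list  # [[index, ...], ...] closed polygons over `points`


def point_in_polygon(x, y, vertices):
    """Ray-casting test: True if (x, y) lies inside the closed polygon."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Consider only edges that straddle the horizontal line through y.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def appearance_pixels(shape, width, height):
    """Pixels (column, row) whose centres fall inside any shape polygon."""
    outlines = [[shape.points[i] for i in poly] for poly in shape.polygons]
    pixels = []
    for row in range(height):
        for col in range(width):
            cx, cy = col + 0.5, row + 0.5
            if any(point_in_polygon(cx, cy, o) for o in outlines):
                pixels.append((col, row))
    return pixels


# A single square polygon covering the top-left 2x2 pixels of a 4x4 image.
square = Shape(points=[(0, 0), (2, 0), (2, 2), (0, 2)], polygons=[[0, 1, 2, 3]])
texture = appearance_pixels(square, width=4, height=4)
```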
A publication “Vehicle Dimensions Estimation Scheme Using AAM on Stereoscopic Video” (R. Ratajczak et al., 10th IEEE International Conference on Advanced Video and Signal-Based Surveillance, AVSS 2013, Workshop on Vehicle Retrieval in Surveillance (VRS) 2013, Kraków, Poland, 27-30 Aug. 2013, pp. 4321-4325) presents a method for object dimension estimation based on an active appearance model (AAM) and stereoscopic video analysis. The method includes a search for the same vehicle in both images of a stereo pair independently, using a view-independent AAM. It is assumed that, after the AAM search step, particular points of the fitted models indicate the 2D locations of the same 3D point in both images, which allows the disparities between those corresponding model points fitted in both images to be calculated simply.
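The disparity calculation between corresponding model points can be sketched as follows. This is an illustration, not the implementation of the cited publication: it assumes a rectified stereo pair, so that disparity is the difference of horizontal coordinates, and it uses the standard parallel-camera relation depth = focal length × baseline / disparity. The function names and the numeric values are hypothetical.

```python
def disparities(left_points, right_points):
    """Horizontal disparity for each pair of corresponding model points.

    Each argument is a list of (x, y) pixel coordinates; the i-th point
    in both lists is assumed to be the same 3D point (rectified pair).
    """
    if len(left_points) != len(right_points):
        raise ValueError("both fitted models must have the same number of points")
    return [xl - xr for (xl, _), (xr, _) in zip(left_points, right_points)]


def depth_from_disparity(disparity, focal_length_px, baseline_m):
    """Depth in metres of a point seen by two parallel cameras."""
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_length_px * baseline_m / disparity


# Illustrative model points fitted in the left and right images.
left = [(120.0, 50.0), (200.0, 52.0)]
right = [(100.0, 50.0), (190.0, 52.0)]
d = disparities(left, right)
depths = [depth_from_disparity(v, focal_length_px=1000.0, baseline_m=0.5) for v in d]
```

Note that the depth computation needs the focal length and the baseline, that is, a calibrated camera arrangement.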
The known methods for determining the metric dimensions of objects require either laser scanners or cameras calibrated to allow precise dimension measurement. Otherwise, the scale of the observed scene cannot be recovered from the images alone. For example, an image registered by two cameras aligned in parallel at a distance of 2 m from each other and observing an object having a size of 1 m at a distance of 1 m will be the same as an image registered by two cameras aligned in parallel at a distance of 20 m from each other and observing an object having a size of 10 m at a distance of 10 m. This is the problem of the scale of the camera system.
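The example can be checked numerically with a short sketch. It assumes ideal pinhole cameras with parallel optical axes and a focal length of 1 (arbitrary units), and it models the object as a horizontal segment centred in front of the camera pair. The function names are hypothetical.

```python
import math


def project(point, camera_x, focal_length=1.0):
    """Pinhole projection of a 3D point onto a camera located at
    (camera_x, 0, 0) and looking along the z axis."""
    x, y, z = point
    return (focal_length * (x - camera_x) / z, focal_length * y / z)


def stereo_views(baseline, distance, size, focal_length=1.0):
    """Image coordinates of both end points of the object in both cameras."""
    end_points = [(-size / 2.0, 0.0, distance), (size / 2.0, 0.0, distance)]
    cameras = (-baseline / 2.0, baseline / 2.0)
    return [[project(p, c, focal_length) for p in end_points] for c in cameras]


# Cameras 2 m apart, object of 1 m at 1 m ...
small = stereo_views(baseline=2.0, distance=1.0, size=1.0)
# ... and cameras 20 m apart, object of 10 m at 10 m.
large = stereo_views(baseline=20.0, distance=10.0, size=10.0)

same_images = all(
    math.isclose(a, b)
    for view_s, view_l in zip(small, large)
    for pt_s, pt_l in zip(view_s, view_l)
    for a, b in zip(pt_s, pt_l)
)
```

Both configurations yield the same image coordinates in both cameras, because multiplying the baseline, the distance, and the size by the same factor leaves every ratio in the projection unchanged.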
It would be advantageous to present a cost-efficient and resource-efficient system for object dimension estimation.