Camera-based surveillance has gained immense popularity because of rising concerns about security and safety. Surveillance cameras are typically video cameras, often referred to as closed-circuit television (CCTV) cameras, which are used for the purpose of observing an area or scene. The cameras are often connected to a recording device or an IP network, and the images generated by the cameras may be displayed for observation by a security guard or law enforcement officer.
As surveillance cameras are gaining in popularity, it has become important to reduce the need for human personnel to monitor camera footage. To this end, automatic surveillance systems have been developed to apply advanced computer vision techniques to analyze digital images generated by surveillance cameras for the purpose of identifying, tracking and categorizing objects in the field of view, either in real time or in retrospect. One challenge in this context is that surveillance cameras generate massive amounts of data which need to be processed automatically.
Automatic surveillance systems may analyze the digital images provided by surveillance cameras using facial recognition algorithms, for the purpose of identifying or verifying individuals in the digital images. A large number of facial recognition algorithms are known in the art, e.g. as disclosed in U.S. Pat. Nos. 5,835,616, 5,991,429 and 8,634,601. However, facial recognition algorithms are processing-intensive and require the digital images to be of good quality with respect to image resolution, lighting conditions, image noise, etc. Further, facial recognition may be rendered difficult if shadows are cast on the face of the individual to be monitored, or if the individual partly or wholly conceals the face, e.g. by turning it away from the camera, by wearing headgear or sunglasses, or by adding or removing facial hair.
It is also known to analyze the walking style, also known as “gait”, of individuals for the purpose of surveillance, e.g. using the algorithms presented in the article “Gait recognition using image self-similarity”, by BenAbdelkader et al., published in EURASIP Journal on Applied Signal Processing, pages 572-585, 2004. A surveillance system that processes images by a combination of facial recognition and gait analysis is known from AU2011101355.
Automatic surveillance systems may also apply so-called Video Content Analysis (VCA) to analyze video, i.e. a time sequence of images, to detect and determine temporal events that cannot be derived from a single image. A surveillance system using VCA may e.g. detect abnormal behavior of individuals. For example, the system can be set to detect anomalies in a crowd, such as a person moving against the permitted direction of flow in an airport, where passengers are only supposed to walk in one direction out of a plane, or in a subway, where people are not supposed to exit through the entrances.
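By way of illustration only, the counter-flow example above may be sketched as follows. The sketch is not taken from any of the cited systems. It assumes that an upstream tracker, not shown, already supplies each individual's centroid position per frame, and it flags a track when its net displacement points against a permitted direction. The names `Track` and `find_counter_flow`, and the parameters `min_distance` and `cos_threshold`, are hypothetical and chosen for this example.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Track:
    """Hypothetical tracker output: one (x, y) centroid per frame."""
    track_id: int
    positions: List[Tuple[float, float]]


def find_counter_flow(
    tracks: List[Track],
    allowed_direction: Tuple[float, float],
    min_distance: float = 1.0,
    cos_threshold: float = -0.5,
) -> List[int]:
    """Return the ids of tracks whose net movement opposes allowed_direction.

    min_distance and cos_threshold are illustrative assumptions: tracks that
    moved less than min_distance are ignored, and a track is flagged when the
    cosine of the angle between its net displacement and the permitted
    direction is at or below cos_threshold.
    """
    ax, ay = allowed_direction
    norm_a = math.hypot(ax, ay)
    if norm_a == 0:
        raise ValueError("allowed_direction must be a non-zero vector")

    flagged = []
    for track in tracks:
        if len(track.positions) < 2:
            continue  # a single observation carries no direction
        (x0, y0), (x1, y1) = track.positions[0], track.positions[-1]
        dx, dy = x1 - x0, y1 - y0
        dist = math.hypot(dx, dy)
        if dist < min_distance:
            continue  # nearly stationary: direction is unreliable
        cosine = (dx * ax + dy * ay) / (dist * norm_a)
        if cosine <= cos_threshold:
            flagged.append(track.track_id)
    return flagged
```

For instance, with a permitted direction of `(1.0, 0.0)`, a track moving from `(5, 0)` to `(0, 0.5)` would be flagged, whereas a track moving from `(0, 0)` to `(5, 0)` would not. Using only the net displacement is a deliberate simplification; an actual VCA system may evaluate the trajectory over a sliding time window.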
A specific application of surveillance systems is to monitor a venue for detection of unauthorized individuals, i.e. individuals that have not been approved to access the venue. The venue may be a building, and the surveillance cameras may be installed at entrance points, in enhanced-security areas, or even throughout such a building. This type of surveillance system may use any of the above-mentioned computer vision techniques to detect and track unauthorized individuals. However, the computer vision techniques of the prior art generally have a limited ability to discriminate between individuals and are likely to generate a high number of false positives, making them less suited or even ineffective for this type of surveillance.