Within the field of video surveillance and video analysis, action recognition is an area of active development. The purpose of action recognition is to enable automatic detection of actions performed by objects in a surveilled scene, generally human actions. For example, while a perimeter sensor could generate an alert when an individual or object enters a particular area, action recognition may generate an alert when the same individual assumes a hostile pose. The development is catalyzed by improved image processing techniques, deep learning, etc., enabling more advanced processing in shorter times.
There are several known techniques for action recognition, and they may be divided into spatial techniques and temporal techniques. Spatial techniques include, for example, various ways of classifying poses in a still image (or a single image frame), while temporal techniques typically involve the evaluation of a chain of events, for example, evaluation of poses in a series of image frames. Neural networks may be utilized for performing the action recognition, yet state-of-the-art techniques also include support vector machines and the like. The techniques used for action recognition are not the focus of the present application, yet some more examples will be given in the detailed description.
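By way of a non-limiting illustration, the distinction between spatial and temporal techniques may be sketched as follows. The sketch assumes that some upstream classifier has already assigned a pose label to each image frame. The pose labels, the set `HOSTILE_POSES`, the function names, and the `window` and `min_hits` parameters are hypothetical and chosen for illustration only; they do not describe any particular known technique.

```python
from collections import deque
from typing import Deque, Iterable, List

# Hypothetical pose labels, assumed to be produced by an upstream pose classifier.
HOSTILE_POSES = {"raised_fist", "aiming"}


def spatial_alert(pose: str) -> bool:
    """Spatial evaluation: decide from the pose in a single image frame."""
    return pose in HOSTILE_POSES


def temporal_alerts(poses: Iterable[str], window: int = 5, min_hits: int = 3) -> List[bool]:
    """Temporal evaluation: decide from a series of image frames.

    A frame raises an alert when at least `min_hits` of the most recent
    `window` frames (including the current one) show a hostile pose.
    """
    recent: Deque[bool] = deque(maxlen=window)  # sliding window of per-frame results
    alerts: List[bool] = []
    for pose in poses:
        recent.append(spatial_alert(pose))
        alerts.append(sum(recent) >= min_hits)
    return alerts


# Illustrative sequence of per-frame pose labels (made up for this sketch).
frames = ["walking", "raised_fist", "walking", "raised_fist", "raised_fist", "walking"]

per_frame = [spatial_alert(p) for p in frames]
over_time = temporal_alerts(frames, window=4, min_hits=3)
```

In this sketch the spatial evaluation reacts to every individual frame showing a hostile pose, whereas the temporal evaluation only reacts once the pose recurs often enough within the sliding window, which is one simple way of evaluating a chain of events.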
In a modern surveillance situation, it is often the case that a vast amount of information is collected at a control center, where one or more operators review the information live. The information typically includes a number of live video streams. More often than not, there is a recording function enabling review of the information afterwards, but there is a benefit in being able to react immediately to unwanted actions occurring in one of the surveilled scenes rather than only being able to analyze in retrospect. In theory, it would be possible to track and analyze everything being caught by the video camera. In reality, both the human factor and current limitations in processing performance make such an approach (i.e., live evaluation of all information caught by a number of video cameras) unrealistic.
The present teachings aim to provide improved operator support, in particular for complex surveillance situations.