CNNs may be trained to perform various tasks on various types of data. For example, CNNs may be trained to receive data related to documents, and may be trained to perform document classification. As another example, CNNs may be trained to perform computer implemented visual object classification, which is also called object recognition. Object recognition pertains to the classifying of visual representations of real-life objects found in still images or motion videos captured by a camera. By performing visual object classification, each visual object found in the still images or motion video is classified according to its type (such as, for example, human, vehicle, or animal).
Automated security and surveillance systems typically employ video cameras or other image capturing devices or sensors to collect image data such as video or video footage. In the simplest systems, images represented by the image data are displayed for contemporaneous screening by security personnel and/or recorded for later review after a security breach. In those systems, the task of detecting and classifying visual objects of interest is performed by a human observer. A significant advance occurs when the system itself is able to perform object detection and classification, either partly or completely.
In a typical surveillance system, one may be interested in detecting objects such as humans, vehicles, animals, etc. that move through the environment. However, if for example a child is lost in a large shopping mall, it could be very time consuming for security personnel to manually review video footage for the lost child. Computer-implemented detection of objects in the images represented by the image data captured by the cameras can significantly facilitate the task of reviewing relevant video segments by the security personnel in order to find the lost child in a timely manner. For increased accuracy, different CNNs that comprise part of the surveillance system may be trained to perform different tasks (for example, one CNN may be trained to recognize humans and another CNN may be trained to recognize vehicles).
That being said, computer-implemented analysis of video to detect and recognize objects and which objects are similar requires substantial computing resources especially as the desired accuracy increases.