1. Technical Field
The embodiments herein generally relate to image processing techniques, and, more particularly, to a detection system and method for detecting an object and an activity associated with the object.
2. Description of the Related Art
Detection of humans and their activities for indoor and outdoor surveillance has become a major domain of research. It has been observed that the detection of humans and their activities is very effective in applications such as video indexing and retrieval, intelligent human-machine interaction, video surveillance, health care, driver assistance, automatic activity detection, and prediction of a person's behavior. Some of these applications may be utilized in offices, retail stores, or shopping malls in order to monitor or detect the people present there and their activities. It has been further observed that the detection of humans and their corresponding activities through still images or video frames may also be possible in indoor surveillance.
In order to detect a human and the activities associated with the human in still images or video frames, traditional background-modeling-based methods have been implemented. However, such methods are not capable of detecting RGB-D/grayscale data, along with other information, pertaining to each pixel in the still images or the video frames. This is because the camera capturing the still images or the video frames is not static, or because there is constant variation in the lighting or environmental conditions around the camera. Further, since the video frames may contain a human leaning against a wall, or one person occluding another person, it may be a challenge to distinguish the person from the wall or from the other person, thereby leading to incorrect or inaccurate detection of the human and of an activity corresponding to the human.
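The sensitivity of background modeling to lighting variation can be illustrated with a minimal sketch. The sketch assumes a simple per-pixel running-average background model with a fixed difference threshold; the function names, the 4x4 grayscale scene, the threshold of 25, and the learning rate of 0.05 are hypothetical values chosen for illustration and do not describe any particular prior-art implementation.

```python
# Illustrative sketch only: a per-pixel running-average background model.
# All names, the threshold, and the learning rate are assumptions.

def update_background(background, frame, alpha=0.05):
    """Blend the new frame into the background estimate (running average)."""
    return [
        [(1 - alpha) * b + alpha * f for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]

def foreground_mask(background, frame, threshold=25):
    """Flag a pixel as foreground when it differs from the background by more than threshold."""
    return [
        [abs(f - b) > threshold for b, f in zip(b_row, f_row)]
        for b_row, f_row in zip(background, frame)
    ]

def foreground_fraction(mask):
    """Return the fraction of pixels flagged as foreground."""
    total = sum(len(row) for row in mask)
    return sum(sum(row) for row in mask) / total

# A static 4x4 grayscale scene with uniform intensity 100.
background = [[100.0] * 4 for _ in range(4)]

# Case 1: a small object (intensity 200) appears at a single pixel.
frame_object = [row[:] for row in background]
frame_object[1][2] = 200.0
object_fraction = foreground_fraction(foreground_mask(background, frame_object))

# Case 2: no object is present, but the lighting brightens every pixel by 40.
frame_bright = [[p + 40.0 for p in row] for row in background]
lighting_fraction = foreground_fraction(foreground_mask(background, frame_bright))

# Case 3: one background update is not enough to absorb the lighting change.
adapted = update_background(background, frame_bright)
still_fraction = foreground_fraction(foreground_mask(adapted, frame_bright))

print(object_fraction, lighting_fraction, still_fraction)
```

In this sketch, a genuine object flags only the pixel it covers, whereas a uniform change in lighting flags every pixel as foreground even though nothing has entered the scene, and the slowly adapting model does not recover after a single update.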
In addition, other techniques have been implemented for detecting the human in still images or video frames. Examples of such techniques include a face detection algorithm integrated with the cascade-of-rejectors concept along with histogram of oriented gradients (HoG), a window scanning technique, human detection based on body parts, a hierarchical classification architecture using a support vector machine (SVM), and a graphical-model-based approach for estimating poses of upper-body parts. However, these techniques focus on detecting one or more body parts (e.g., head, leg, arm, etc.) of the human. Additionally, these techniques require the human in the image or video frame to be localized in a predefined orientation or view, and hence are not capable of detecting the human or the corresponding activities in a view-invariant manner.