It is relatively easy for the human brain to recognize and/or detect certain actions, such as human activities, within live or recorded video. For example, in a surveillance application, a viewer can easily determine whether there are people in a given scene and reasonably judge whether there are any unusual activities. In home monitoring applications, video can be used to track a person's daily activities, e.g., for tele-monitoring of medical patients or the elderly.
It is often not practical to have a human view the large amounts of live and/or recorded video that are captured in many of the scenarios where video is used. Thus, automated processes are sometimes used to detect certain actions and distinguish them from others. However, automatically detecting such actions within video is difficult and overwhelming for contemporary computer systems, in part because of the vast amounts of data that need to be processed for even a small amount of video.
Recently developed feature point-based action recognition techniques have proven to be more effective than traditional tracking-based techniques, but they remain computationally expensive because of the large number of feature points that must be processed. As a result, applications requiring fast processing, such as real-time or near real-time surveillance or monitoring, have not been practical.