Human-action detection and classification in images and video can be challenging because of camera motion, occlusions, and significant intra-class variations in posture, motion, and illumination. Some action recognition approaches treat human actions as simple “moving objects” and rely on low-level image features (e.g., grayscale gradients, color and edges, foreground silhouettes, and optical flow). These representations can work reasonably well for capturing the global motion of the full body and for distinguishing relatively distinct, coarse actions such as walking, running, and jumping. However, such representations may not be sufficient for detecting and recognizing more complex and subtle actions, such as a person talking on the telephone, eating, or working with a laptop computer.