Automated activity recognition is a challenging problem with a wide range of applications. For example, video surveillance cameras may be used to monitor an environment in which an activity is taking place. Wearable technologies such as body cameras, smart watches and camera-equipped eyewear make it possible to capture human activities from an egocentric or first-person perspective. Progress in wearable devices has resulted in the development of on-body sensors capable of collecting a variety of data descriptive of the motion of the user. For instance, various smartwatches are equipped with an accelerometer, a gyroscope and/or a compass.
Joint processing of multimodal data acquired by simultaneous use of two or more different sensors can reduce uncertainty about the acquired data and about automated decision processes (e.g., object and activity classification and recognition, anomaly detection, etc.) based thereon, particularly when compared with scenarios where only one data modality is available. The synergistic combination of multiple types of data is termed multimodal data fusion, and a variety of approaches, including early (e.g., feature-level) and late (e.g., decision-level) fusion schemes, have been proposed. However, existing fusion schemes are often not as accurate or as useful as they could be in aiding decisions and in classifying human activity. This is particularly an issue in the healthcare field, where it is important that human actions be properly classified and that recommendations be accurate.
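The distinction between early and late fusion can be illustrated with a minimal sketch. The data, class centroids, and toy nearest-centroid classifiers below are hypothetical and are not the fusion method described in this document: early fusion concatenates per-modality feature vectors and classifies once; late fusion classifies each modality separately and combines the decision scores.

```python
def nearest_centroid_scores(x, centroids):
    """Return a per-class similarity score (negative squared distance)."""
    return {label: -sum((a - b) ** 2 for a, b in zip(x, c))
            for label, c in centroids.items()}

def early_fusion(accel, gyro, joint_centroids):
    """Feature-level fusion: concatenate modalities, then classify once."""
    fused = accel + gyro  # single joint feature vector
    scores = nearest_centroid_scores(fused, joint_centroids)
    return max(scores, key=scores.get)

def late_fusion(accel, gyro, accel_centroids, gyro_centroids):
    """Decision-level fusion: score each modality, then average the scores."""
    s1 = nearest_centroid_scores(accel, accel_centroids)
    s2 = nearest_centroid_scores(gyro, gyro_centroids)
    combined = {label: (s1[label] + s2[label]) / 2 for label in s1}
    return max(combined, key=combined.get)

# Hypothetical per-class centroids for two activities, "walk" and "run".
accel_c = {"walk": [0.1, 0.2], "run": [0.9, 0.8]}
gyro_c  = {"walk": [0.0, 0.1], "run": [0.7, 0.9]}
joint_c = {label: accel_c[label] + gyro_c[label] for label in accel_c}

accel_sample, gyro_sample = [0.15, 0.25], [0.05, 0.12]
print(early_fusion(accel_sample, gyro_sample, joint_c))            # "walk"
print(late_fusion(accel_sample, gyro_sample, accel_c, gyro_c))     # "walk"
```

In this toy case both schemes agree, but they can diverge when one modality is noisy: early fusion lets the classifier weigh raw features jointly, while late fusion allows per-modality classifiers to be tuned (or weighted) independently before their decisions are combined.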
This document describes devices and methods that are intended to address issues discussed above and/or other issues.