Interest in affective computing, which involves systems capable of recognizing and analyzing expressions of human emotion, has grown substantially in recent years. However, most affective computing systems are still research-grade endeavors that are typically not robust enough to handle the demands of real-world applications. Nonetheless, the continuing increase in computing power, coupled with the miniaturization and proliferation of sensors and mobile devices, is bringing widespread adoption of affective computing systems in everyday real-world situations closer than ever.
One useful capability for many affective computing applications is the ability to predict a user's response to stimuli. This ability can help applications offer a better user experience. Although some existing systems learn to predict a user's response to stimuli, they are typically inadequate for real-world applications. Existing systems are usually trained on data collected in a controlled environment. In these laboratory-like settings, a small number of short experiments are conducted (typically less than an hour long), in which users' responses to a set of pre-selected stimuli, such as pictures, video scenes, or music, are measured. One main drawback of laboratory-collected data is that it is acquired over a short period of time, with the user typically exposed to one stimulus at a time. In reality, however, a user's reaction to stimuli may vary dramatically depending on the situation the user is in, which limits the usefulness of laboratory-collected data. For example, a user's response while driving in busy traffic might be quite different from the same user's response when relaxing at home, even if the user is exposed to virtually the same stimuli in both situations. Furthermore, in short experiments, a user's reaction can be measured for only a small number of stimuli, which is often inadequate for creating an affective computing system for real-world applications that may have to model the effect of a wide range of stimuli from multiple sources. For example, a user may be exposed to multiple stimuli coming from digital media, such as video images and sound, while at the same time being exposed to physiological sensation stimuli originating from a massage chair the user is sitting in.
Another characteristic of data acquired in real-world situations is that it is often incomplete. For instance, while the system may have good information regarding the stimuli the user was exposed to, it might not be able to obtain an accurate assessment of the user's response. This is especially true if the user's response can be measured only under certain conditions, for instance, when the user is facing a camera.
Given the many challenges and complications that are part of the real-world domain, a system designed to predict a user's response to stimuli in real-world scenarios should take into account the added complexity intrinsic to this domain in order to achieve optimal results.