One development changing the way humans and computers interact is the growing capability of computer systems to detect how their users are feeling. A growing number of computer applications utilize their users' affective responses, as expressed via sensor measurements of physiological signals and/or behavioral cues, in order to determine the users' emotional responses. This capability enables such applications, often called affective computing applications, to determine how users feel about various aspects of their interactions with computer systems.
Users often spend a lot of time interacting with computer systems through various platforms (e.g., desktops, laptops, tablets, smart phones). Through numerous interactions with computers, users may be exposed to a wide array of content: communications with other users (e.g., video conversations and instant messages); communications with a computer (e.g., interactions with a user's virtual agent); and/or various forms of digital media (e.g., internet sites, television shows, movies, and/or interactive computer games). One environment in which users both spend large amounts of time and are exposed to large quantities of digital content is social networks of various types (e.g., Facebook™, Google+™, Twitter™, Instagram™, Reddit™). Such networks provide users with diverse content, often selected for the user and/or originating from people who know the user; hence, the attraction many users have to social network sites.
Throughout the many interactions a user may have, affective computing systems can measure the user's affective response to content the user consumes, and analyze that information. This can enable affective computing applications to improve the user experience, for example, by selecting and/or customizing content according to the user's liking.
A factor upon which the success of affective computing applications often hinges is the availability of models that can accurately translate affective response measurements into corresponding emotional responses. Creating such models often requires collecting ample training data, which includes samples taken from situations similar to those in which the models are to be used. For example, a training sample may include a pair of (i) affective response measurements obtained by measuring a user with a sensor, and (ii) a label describing the emotional response of the user while the measurements were taken. The samples may then be utilized, for instance by machine learning methods, in order to create a model for predicting emotional response from affective response measurements.
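The pairing of measurements and labels described above can be sketched in code. The following is a minimal illustration, not the method of the source: the feature vectors (standing in for sensor measurements), the emotion labels, and the nearest-centroid classifier are all illustrative assumptions chosen for brevity; a real system would likely use a richer learning method.

```python
def train_centroid_model(samples):
    """Build a per-label centroid from (measurement_vector, emotion_label) pairs.

    Each sample is (i) a tuple of sensor-derived features and (ii) a label
    describing the user's emotional response when the measurements were taken.
    """
    sums, counts = {}, {}
    for features, label in samples:
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    # Average the accumulated feature sums to obtain one centroid per label.
    return {label: [x / counts[label] for x in acc] for label, acc in sums.items()}

def predict_emotion(model, features):
    """Return the label whose centroid is nearest (squared Euclidean distance)."""
    def dist2(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: dist2(model[label]))

# Hypothetical measurements, e.g., (heart rate, skin conductance), with labels.
training_samples = [
    ((95.0, 8.1), "excited"),
    ((92.0, 7.8), "excited"),
    ((62.0, 2.0), "calm"),
    ((65.0, 2.3), "calm"),
]
model = train_centroid_model(training_samples)
print(predict_emotion(model, (94.0, 8.0)))  # → excited
```

As the text notes, the quality of such a predictor depends heavily on how much training data is available and how closely it matches the situations in which the model is deployed.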
A general principle often observed when creating models from data is that the more data available, the more accurate the models created from it become. In addition, the more similar the training data is to the instances on which the model is to be used, the more accurate the predictions using the model are likely to be.
For example, consider a model that predicts emotional response from facial expressions, trained on a set of samples that each include an image of a face coupled with a label describing the emotion expressed in the image. Generally, such a model would be better if trained on a large set of samples capturing a diverse set of faces and emotions, rather than on a significantly smaller set capturing a much more limited set of faces and emotions. In addition, if the model is used to predict emotions from the face of a specific user, it would be better if at least some of the training samples involved that specific user; this could help the model account for unique characteristics such as the shape of the user's face, specific mannerisms, and/or types of facial expressions. Furthermore, if used primarily in day-to-day situations, the model would probably perform better if trained on samples acquired spontaneously during such situations rather than on samples acquired in a controlled, non-spontaneous manner. The conditions under which spontaneous day-to-day samples are taken (e.g., environment, background, lighting, angle of view) might differ significantly from the settings of controlled sample acquisition (e.g., when the user sits down and is prompted to express emotions). In addition, the expressions of emotion in day-to-day scenarios are likely to be more genuine than those expressed during controlled sample acquisition. For instance, facial expressions (such as micro-expressions) are difficult to produce accurately on demand, and thus may not be the same as spontaneous expressions.
A problem with samples of affective response corresponding to spontaneous expressions of emotion in day-to-day scenarios is that such data is often difficult to collect. It is not always possible to know when spontaneous expressions of emotion are likely to occur, and when they do occur, it is not always possible to tell what the circumstances are and/or what type of emotion is expressed. Thus, collecting such training data may require a certain amount of manual curation in order to determine what type of emotion, if any, is expressed. This may become impractical on a large scale, such as when collecting a large body of training data for a single user and/or collecting data for a large group of users. In addition, it may inconvenience users, for instance, if they need to be actively involved in selecting the samples by providing appropriate labels. Collecting such samples may also violate user privacy, for example, if other entities get to examine the measurements and/or aspects of the users' day-to-day lives in order to generate the required training samples.
The aforementioned limitations emphasize a need for an automated method of collecting samples that include measurements of a user's affective responses coupled with labels describing the user's corresponding emotional responses.