Conventionally, there are many situations in which it is desirable to grasp how much interest an audience shows when looking at or listening to a lecturer at a lecture meeting, a teacher at a university, an instructor at a cram school, a comedian, or the like. For example, a lecturer at a lecture meeting may wish to know to what degree he/she attracted the attention of the audience during a lecture, or, when an instructor at a cram school is evaluated, it may be desired to use the number of students who listened to the instructor's talk with interest as evaluation material. In addition, not only when people are evaluated but also when content displayed on a television or digital signage is evaluated, there are cases where it is desired to grasp how many people in the audience listened to or looked at the content with interest.
As a technology concerning such evaluation, Patent Literature 1, for example, describes a system that calculates a degree of satisfaction on the basis of at least one of an evaluation value of the degree of attention, which is based on the line of sight of a user viewing content, and an evaluation value of the user's expression at that time (smile level), and that evaluates the content in order to play back only scenes with a high degree of satisfaction or to make content recommendations.
Moreover, Patent Literature 2 describes a system capable of shooting video of a stage or the like while also shooting the expressions of visitors, and of storing the number of smiling men and women as an index every 30 seconds, so as to extract only images from time periods that women rated highly and play them back as a digest.
Furthermore, Patent Literature 3 describes a system in which a favorite degree of a user is input at predetermined time intervals for a moving image being recorded, and in which portions with a high favorite degree, as well as portions before and after a time point at which a smile of a person detected from the moving image is included, are extracted from the moving image for playback.