In general, an image evokes different emotions in different viewers, depending not only on the content of the image but also on the individual experiences of the viewer. For example, an image of a certain kind of food, such as a hamburger, may elate some fast food lovers, while others may be irritated for health reasons.
In computer vision, efforts are being made toward categorizing images according to abstract concepts, as in affective image classification and aesthetic quality estimation. Recent literature on affective image classification studies a variety of image sources, for example Internet images, abstract paintings, and artistic pictures, and several efforts address both realistic and artistic images. These efforts typically describe the emotion elicited by a particular image using hard labels, that is, a single fixed label describing the emotional content of the image.
Conventional approaches treat different emotion categories independently in a 1-vs-all setting of multi-class classification, even though this is inconsistent with the notion that some emotion categories are closely related. For example, joy and sadness have a strong negative correlation. Many emotion-related image categorization schemes use image databases, such as emodb (M. Solli and R. Lenz, “Emotion related structures in large image databases,” in International Conference on Image and Video Retrieval. ACM, 2010, pp. 398-405), GAPED (E. S. Dan-Glauser and K. R. Scherer, “The Geneva affective picture database (GAPED): a new 730-picture database focusing on valence and normative significance,” Behavior Research Methods, vol. 43, no. 2, pp. 468-477, 2011), and IAPS (P. J. Lang, M. M. Bradley, and B. N. Cuthbert, “International affective picture system (IAPS): affective ratings of pictures and instruction manual. Technical Report A-8,” 2008), that suffer from several drawbacks. First, these image databases assign hard labels to images, ignoring the fact that viewers of an image do not necessarily agree on the emotion experienced. Moreover, even when viewers experience a similar kind of emotion, conventional databases do not capture the notion that the degree of emotion may vary (for example, joy vs. ecstasy). Second, the emotion categories of these databases are chosen in an ad hoc way, without a solid foundation in psychological theory. Third, the number of images in each emotion category is not equal in these databases, resulting in an unbalanced database that may bias image categorization results.
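The first drawback can be made concrete with a minimal sketch. The viewer votes below are hypothetical and chosen only for illustration; they are not drawn from any of the databases named above. The function names `hard_label` and `soft_label` are likewise assumptions introduced here. The sketch contrasts a hard label, which keeps only the most frequent response, with a distribution over emotions that preserves viewer disagreement.

```python
from collections import Counter

# Hypothetical emotion votes from ten viewers of a single image
# (illustrative data only, not taken from any real database).
votes = ["joy", "joy", "joy", "joy", "disgust",
         "disgust", "disgust", "sadness", "joy", "disgust"]


def hard_label(votes):
    """Collapse all votes into the single most frequent emotion."""
    return Counter(votes).most_common(1)[0][0]


def soft_label(votes):
    """Keep the full distribution of viewer responses as fractions."""
    counts = Counter(votes)
    total = len(votes)
    return {emotion: n / total for emotion, n in counts.items()}


print(hard_label(votes))  # the single label a conventional database would store
print(soft_label(votes))  # the per-emotion fractions that the hard label discards
```

With these votes, the hard label records only the majority emotion, while the distribution shows that a substantial share of viewers reported a different emotion for the same image.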