Affective computing is a research area involving systems capable of recognizing and analyzing the expression of human emotions. One promising application in this area is the personalized selection and generation of streams of stimuli, such as media content, in order to elicit a desired affective response from a user. For example, such a system may be used to predict which of several different versions of a commercial is most likely to elicit a positive reaction from a user. While some existing systems learn to predict a user's response to streams of stimuli, they typically take a coarse approach and evaluate only general properties pertaining to the content as a whole. For example, such systems may look at general properties such as the content's genre, or at low-level features derived from the content, such as the sound energy or the rate of change between shots. These systems do not look at details pertaining to defined objects in the content, and cannot help answer specific questions regarding how certain details can change the user's affective response, for example, questions like “Should a man or a woman hold the soda can in a commercial?” or “Should a scene in the backyard portray a child playing with a dog or playing with a ball?” Answering such questions accurately requires a system that adopts a comprehensive approach capable of modeling the user's collective response while considering the many details in a stream of stimuli.