With the advent of video distribution over the Internet, players in the field such as Yahoo™, Google™ or Microsoft™ consider advertising to be a key element of growth. Various tools have been developed for this purpose to increase the visual impact of the advertising inserted in the video while avoiding inconvenience to the spectators.
In particular, Microsoft™ has developed a tool called VideoSense, described in the document entitled “VideoSense: a contextual video advertising system”, Proceedings of the 15th international conference on Multimedia, pp 463-464, 2007. This tool was created to insert advertising clips into a video sequence, the objective being to select a clip that is relevant to the video sequence and to insert it at key moments in the video, not only at the start and end of the video sequence. To select the clip to insert, low-level parameters of the colour, movement or sound rhythm type are extracted from the clip and from the sequence, then compared with each other, the selected clip being the one whose low-level parameters are closest to those of the video sequence. Additional information, such as a title associated with the clip or with the sequence and supplied by the advertisers or the broadcaster of video content, or text information contained in the clip or the sequence, is also used to select the clip to insert into the sequence. Once selected, the clip is inserted at particular points of the sequence, and more specifically at points where the discontinuity is high and the attractiveness is low, for example at the end of a scene or of a shot not comprising any movement.
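As a rough illustration of the selection step described above, the following minimal sketch picks the clip whose low-level parameter vector is closest to that of the video sequence. The three-component vectors (colour, movement, sound rhythm), the Euclidean distance and the names `feature_distance` and `select_clip` are assumptions made for this example only; the text above does not specify how VideoSense represents or compares the parameters.

```python
import math


def feature_distance(a, b):
    """Euclidean distance between two equal-length parameter vectors (assumed metric)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def select_clip(sequence_features, clip_features):
    """Return the name of the clip whose parameters are closest to the sequence's.

    sequence_features: tuple of low-level parameters of the video sequence.
    clip_features: dict mapping a clip name to its tuple of low-level parameters.
    """
    return min(
        clip_features,
        key=lambda name: feature_distance(sequence_features, clip_features[name]),
    )


# Hypothetical values: (colour, movement, sound rhythm), each normalised to [0, 1].
sequence = (0.5, 0.2, 0.7)
clips = {
    "clip_a": (0.9, 0.9, 0.1),
    "clip_b": (0.5, 0.25, 0.6),
}
print(select_clip(sequence, clips))
```

With these hypothetical values, `clip_b` is selected because its parameters lie much closer to those of the sequence than those of `clip_a`.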
The selected clip is therefore generally placed after a shot change. Although the video content of the selected clip is related to the content of the sequence in which it is inserted, the impact of this shot change on the perception of the clip by the spectator is neglected. Indeed, a phenomenon observed in several studies, particularly in the document entitled “Predicting visual fixations on video based on low-level visual features” by O. Le Meur, P. Le Callet and D. Barba, Vision Research, Vol. 47/19 pp 2483-2498, September 2007, is not taken into account, namely the temporal extension of the fixated zone after a shot change. These studies show that the spectator continues to fixate, for approximately 200 to 300 ms after the shot change, the area that he was fixating before the shot change. Hence, the area looked at by the spectator depends not on the pictures displayed at the current time, but on the pictures displayed previously. This phenomenon is illustrated by FIG. 1. The line of pictures in the upper part of the figure represents a video sequence comprising 7 pictures separated from each other by a time interval of 100 ms. A shot change occurs between the third and fourth picture of the sequence. The line of pictures in the lower part of the figure shows, by white dots, the picture areas fixated by the spectator. It is noted that the spectator only shifts his fixation at the end of the sixth picture, namely 2 pictures after the shot change. This temporal extension is due to different factors, particularly temporal masking, the surprise effect and the time biologically necessary to reinitialise the action of perception. In the case of a 50 Hz video, this temporal extension lasts for about 15 pictures after the shot change (300 ms at 20 ms per picture).
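The relation between the duration of the temporal extension and the number of pictures affected can be checked with a short calculation. The sketch below is illustrative only; the function name `pictures_in_extension` is hypothetical, and the durations and picture rates are the ones quoted above (200 to 300 ms, a 50 Hz video, and the 100 ms picture interval of FIG. 1, that is 10 pictures per second).

```python
def pictures_in_extension(duration_ms, picture_rate_hz):
    """Number of pictures displayed during a temporal extension.

    duration_ms: duration of the temporal extension in milliseconds.
    picture_rate_hz: number of pictures displayed per second.
    """
    return round(duration_ms * picture_rate_hz / 1000)


# 50 Hz video: one picture every 20 ms.
print(pictures_in_extension(200, 50))  # 10
print(pictures_in_extension(300, 50))  # 15

# Sequence of FIG. 1: one picture every 100 ms, i.e. 10 pictures per second.
print(pictures_in_extension(200, 10))  # 2
print(pictures_in_extension(300, 10))  # 3
```

The upper bound of 300 ms gives the figure of about 15 pictures for a 50 Hz video, while the lower bound of 200 ms corresponds to 10 pictures.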
Consequently, if the regions of interest of the advertising are not positioned at the same points as those of the video sequence before the shot change, the content of the advertising is not immediately perceived by the spectator and the visual impact of the advertising on the spectator is reduced. The message carried by the advertising is not perceived directly.