Evaluating audio-visual presentations such as television commercials for effectiveness in communicating information to human viewers has been approached in a number of ways. Originally, such evaluations were performed only by individuals who were considered to have specialized knowledge or abilities which enabled them to predict the effectiveness of such presentations. These personalized evaluations have indeed continued and, in fact, many such individuals are employed as staff personnel or independent consultants by advertising agencies, marketing-oriented corporations, publishing houses and the like. The shortcoming of such personal evaluations is that they are necessarily subjective. Indeed, they are based on artistic interpretations which are highly subject to individual variation.
In attempts to obtain more objective analysis, many advertising agencies, publishing houses and corporations have adopted policies of using committees or panel discussion groups for the evaluation of television commercials and other visual or audio-visual presentations. After viewing a subject presentation separately or in groups, such group members discuss the effectiveness of various aspects or objects contained in the subject presentation. These group techniques are widely accepted although, here again, their subjectivity is well recognized and is largely thought to be due to the natural psychological tendency of individuals, whether alone or in groups, to respond in the manner believed to be sought by their interviewer, as opposed to responding in a truly objective manner. Additionally, such group session techniques are extremely expensive.
Most recently, efforts in this area have been directed to the development of truly objective methods of assessing the effectiveness of video presentations. These efforts have been greatly facilitated by the development in the mid-1960s of certain eye monitoring devices capable of tracking the movement of the eye and recording its point of gaze relative to a specific visual stimulus. The most popular and original eye movement monitoring device of this type was the Limbus eye monitor disclosed in U.S. Pat. No. 3,473,868, Young. Numerous improvements to this rudimentary device have been effected and are illustrated in U.S. Pat. Nos. 3,583,793, Newman; 3,594,072, Feather; 3,623,799, Millodot; 3,679,295, Newman; and 3,689,135, Young. Many of these eye monitoring instruments are described in the various publications which have been incorporated by reference herein.
The most advanced eye monitoring instrument presently available employs a technique whereby a beam of light is projected onto the corneal surface of an eye of a viewing individual. A video camera is aligned coaxially with the beam of light so as to monitor the viewing individual's eye pupil. The pupil appears as a bright disk since the camera actually detects the illuminator beam reflected from the retina back through the pupil aperture. The reflection of the light beam from the surface of the cornea appears as an even brighter spot to the video camera. Thus, the video camera continually monitors the pupil and corneal reflection. The length and direction of the vector from the center of the pupil to the corneal reflection are continuously computed by signal processing instrumentation. These values are in turn used to continuously compute eye line-of-gaze. This technique is used by the ASL Model No. 1998 Eye Movement Monitor System (Applied Science Laboratories, Waltham, Mass.) which is the most advanced eye movement monitor system presently available. For purposes of clarity, this method of eye monitoring will hereinafter be referred to as the "pupil center-corneal reflection" technique.
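By way of illustration only, the vector computation described above may be sketched as follows. The function names, the image-coordinate convention, and the linear gain and offset calibration are assumptions made for this example; they are not taken from the ASL instrument, whose signal processing is not detailed here.

```python
import math


def pupil_to_reflection_vector(pupil_center, corneal_reflection):
    """Return (dx, dy, length, direction) of the vector from the pupil
    center to the corneal reflection, both given as (x, y) image points."""
    dx = corneal_reflection[0] - pupil_center[0]
    dy = corneal_reflection[1] - pupil_center[1]
    length = math.hypot(dx, dy)      # magnitude of the vector
    direction = math.atan2(dy, dx)   # angle in radians
    return dx, dy, length, direction


def estimate_point_of_gaze(pupil_center, corneal_reflection,
                           gain=(1.0, 1.0), offset=(0.0, 0.0)):
    """Map the vector to stimulus X, Y coordinates.

    ASSUMPTION: a per-axis linear calibration (gain, offset) is used
    purely for illustration; a real instrument would derive its mapping
    from a calibration procedure with the viewing individual.
    """
    dx, dy, _, _ = pupil_to_reflection_vector(pupil_center, corneal_reflection)
    return gain[0] * dx + offset[0], gain[1] * dy + offset[1]
```

Under these assumptions, the computation would be repeated for every video field, so that the line-of-gaze estimate is updated continuously as the vector changes.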
Despite the commercial availability of sophisticated pupil center-corneal reflection monitoring equipment such as the ASL Model No. 1998, the commercial application of such equipment has largely been limited to the evaluation of static materials such as print advertising and still photographs. In fact, the use of such eye monitoring equipment to evaluate dynamic, video taped presentations such as television commercials has not been widely accepted because, under the prior art, the cost of processing and analyzing the large amounts of machine readable data generated by such evaluations would have been prohibitive. Indeed, it was not until the invention disclosed in U.S. patent application Ser. No. 848,154, hereinafter referred to as Ser. No. 848,154 and incorporated herein by reference, that the use of eye monitoring equipment in the evaluation of dynamic video tape presentations such as television commercials could be effected reliably and economically.
The invention disclosed in Ser. No. 848,154 relates to a method and apparatus for generating, recording, processing, formatting, displaying and evaluating eye monitor generated data obtained from individuals watching video taped presentations. Generally, the inventive concept disclosed in Ser. No. 848,154 involves the use of readily available equipment, in a novel combination, to generate the distribution of actual looking time of individuals viewing a subject presentation (e.g. a television commercial). By the method of Ser. No. 848,154, the distribution of actual looking time is arrived at by first displaying the presentation to an individual while such individual is stationed within a visually neutral room wherein extraneous distractions are minimized or eliminated. An eye monitoring device employing the pupil center-corneal reflection method is used to record the subject individual's point of gaze at intervals of one-thirtieth of a second or faster. Thus, the data so generated consist of a large series of discrete eye position points. These data points encompass periods of eye "fixation" as well as intervals of eye movement termed "saccades". It is, however, only during periods of eye fixation that the viewing individual will perceive and assimilate information. Accordingly, the method of Ser. No. 848,154 separates "fixation" data from less informative "saccadic" data. The method records fixation point parameters such as starting time, duration and X, Y coordinates of each such visual fixation. The X, Y coordinates so defined are, of course, in spatial relationship to the video presentation shown the subject individual. These fixation parameters are then recorded in a storage file. By recording only the fixation parameters and discarding the less valuable data relating to periods of visual saccades, the amount of data to be stored is minimized and, thus, the method becomes more commercially feasible.
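The reduction of raw eye position points to fixation parameters may be illustrated by the following sketch. Ser. No. 848,154 is not quoted here as specifying any particular separation rule; the dispersion-threshold rule, the threshold values, and all names below are assumptions chosen only to show how a series of samples taken every one-thirtieth of a second could be reduced to starting time, duration and X, Y coordinates.

```python
from dataclasses import dataclass

SAMPLE_INTERVAL = 1.0 / 30.0  # seconds between eye position samples


@dataclass
class Fixation:
    start_time: float  # seconds from start of presentation
    duration: float    # seconds
    x: float           # mean X coordinate of the fixation
    y: float           # mean Y coordinate of the fixation


def _dispersion(window):
    """Horizontal plus vertical extent of a run of (x, y) samples."""
    xs = [p[0] for p in window]
    ys = [p[1] for p in window]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def detect_fixations(samples, max_dispersion=20.0, min_samples=3):
    """Keep runs of samples that stay within max_dispersion (ASSUMED
    threshold, in stimulus units); everything else is treated as
    saccadic data and discarded."""
    fixations = []
    i, n = 0, len(samples)
    while i + min_samples <= n:
        j = i + min_samples
        if _dispersion(samples[i:j]) > max_dispersion:
            i += 1  # eye is moving: skip this sample
            continue
        # Grow the window while the eye stays within the threshold.
        while j < n and _dispersion(samples[i:j + 1]) <= max_dispersion:
            j += 1
        window = samples[i:j]
        fixations.append(Fixation(
            start_time=i * SAMPLE_INTERVAL,
            duration=len(window) * SAMPLE_INTERVAL,
            x=sum(p[0] for p in window) / len(window),
            y=sum(p[1] for p in window) / len(window),
        ))
        i = j
    return fixations
```

In this sketch, fifteen raw samples containing two steady periods of gaze reduce to two stored records, which illustrates why recording only fixation parameters lessens the amount of data to be stored.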
Having obtained these visual fixation data, the visual presentation is then edited into a series of individual scenes, each scene having a real time duration. Specific areas of interest within each scene are then chosen and their boundaries defined by specific X, Y coordinates. Of course, the length of each scene is determined by the number of frames in which the various areas of interest remain substantially unchanged in their X, Y coordinates or contents. Alternatively, or additionally, the scene duration may be defined by the beginning and/or completion of an overlying audio signal.
By the method of Ser. No. 848,154, the starting time, duration and spatial boundaries of each area of interest contained in a given scene are then recorded in a data file. Each scene data storage file is then compared to the corresponding fixation point data file to produce a third data file in which the visual fixation data obtained from each test individual are correlated with the specific areas of interest contained in each scene. After repeating this process for all the individuals so subjected to this testing procedure, a group data file is produced. From these group data a mean distribution of looking time is generated for each area of interest within each scene. Such group data files are felt to contain truly objective information indicating the actual looking behavior of the subject individuals.
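The correlation of fixation data with areas of interest, and the generation of group means, may be sketched as follows. The record layouts, the rectangular boundaries, the rule of crediting only the time during which a fixation overlaps a scene, and all names are assumptions made for illustration; the actual file formats of Ser. No. 848,154 are not reproduced here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class AreaOfInterest:
    scene: str
    name: str
    start_time: float  # scene start, seconds
    duration: float    # scene duration, seconds
    x_min: float
    y_min: float
    x_max: float
    y_max: float


def looking_time(fixations, areas):
    """Total looking time per (scene, area) for one individual.

    fixations: iterable of (start_time, duration, x, y) tuples.
    """
    totals = {(a.scene, a.name): 0.0 for a in areas}
    for f_start, f_duration, x, y in fixations:
        f_end = f_start + f_duration
        for a in areas:
            inside = a.x_min <= x <= a.x_max and a.y_min <= y <= a.y_max
            # Credit only the part of the fixation that falls in the scene.
            overlap = min(f_end, a.start_time + a.duration) - max(f_start, a.start_time)
            if inside and overlap > 0:
                totals[(a.scene, a.name)] += overlap
    return totals


def group_mean(per_individual_totals):
    """Mean looking time per (scene, area) across all tested individuals."""
    sums = defaultdict(float)
    for totals in per_individual_totals:
        for key, seconds in totals.items():
            sums[key] += seconds
    count = len(per_individual_totals)
    return {key: total / count for key, total in sums.items()}
```

Because the totals are keyed to areas of interest that were designated before processing, any later sub-division of an area would require the fixation data to be processed again against the new boundaries.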
The method of Ser. No. 848,154, as summarized above, is limited to specific scenes, each of which contains defined areas of interest. Thus, the method of Ser. No. 848,154 necessitates a significant amount of film or video tape handling in order to delineate the selected areas of interest and to segment the presentation into scenes of specific duration.
In addition, the data display and method of Ser. No. 848,154 relate only to the predesignated areas of interest. If, retrospectively, one desires to sub-divide a previously defined area of interest in order to differentiate between objects contained within such area, reprocessing of the data is necessary. Thus, even by the method disclosed in Ser. No. 848,154, the evaluation of dynamic video presentations remains somewhat expensive, cumbersome and time consuming. This drawback is unfortunate in view of the highly objective nature of such eye movement data.