1. Field of the Invention
The present invention is a system and method for designing a comprehensive media audience measurement platform that can estimate the audience of a media of interest using only measurements of a subset of the actual audience sampled from a limited space in the site. The platform includes data sampling planning and a data extrapolation method based on site, display, and crowd characterization.
2. Background of the Invention
The role of digital media for advertisement in public spaces is becoming increasingly important. The task of measuring the degree of media exposure is also deemed very important, both as a guide to the equipment installation (equipment kind, position, size, and orientation) and as a rating for the content programming. As the number of such displays grows, measuring the viewing behavior of the audience using human intervention can be very costly.
Unlike the traditional broadcast media, the viewing typically occurs in public spaces where a very large number of unknown people can assemble to comprise an audience. It is therefore hard to take surveys of the audience using traditional interviewing through telephone or mail/email methods, and on-site interviews can be both very costly and potentially highly biased.
There are technologies to perform automated measurement of viewing behavior; the viewing behavior in this context is called ‘viewership’. These automatic viewership measurement systems can be maintained with little cost once installed, and can provide a continuous stream of viewership measurement data. These systems typically employ electro-optical visual sensor devices, such as video cameras or infrared cameras, and provide consistent sampling of the viewing behavior based on visual observations. However, because of the high initial installation cost, sensor placement planning is extremely important. While the equipment delivers consistent viewership measurements, each sensor is limited to measuring the view from its fixed position (and, in most cases, fixed orientation), and relocating the equipment can affect the integrity of the data.
In a typical media display scenario, it is unrealistic, if not impossible, to detect and record all instances of viewership occurring in the site using these sensors. Any optical sensor has a limited field of coverage, and its area of coverage can also depend on its position and orientation. A large venue can be covered by multiple sensors, and their individual lens focal lengths, positions, and orientations need to be determined.
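The dependence of coverage on focal length, position, and orientation can be illustrated with a minimal sketch. It assumes an ideal pinhole camera and a flat, unobstructed floor plan; the function names, the sensor width, and the maximum range are hypothetical values chosen for illustration and are not taken from the disclosure.

```python
import math

def horizontal_fov_deg(sensor_width_mm, focal_length_mm):
    """Horizontal field of view of an ideal pinhole camera, in degrees."""
    return math.degrees(2 * math.atan(sensor_width_mm / (2 * focal_length_mm)))

def in_coverage(cam_xy, heading_deg, fov_deg, max_range, point_xy):
    """Return True if a floor point lies inside the camera's horizontal
    coverage wedge (assumed: no occlusion, fixed usable range)."""
    dx = point_xy[0] - cam_xy[0]
    dy = point_xy[1] - cam_xy[1]
    dist = math.hypot(dx, dy)
    if dist > max_range:
        return False
    if dist == 0:
        return True
    bearing = math.degrees(math.atan2(dy, dx))
    # Signed angular offset from the camera heading, wrapped to [-180, 180).
    offset = (bearing - heading_deg + 180) % 360 - 180
    return abs(offset) <= fov_deg / 2

# Hypothetical camera: 6.4 mm wide sensor behind a 3.2 mm lens.
fov = horizontal_fov_deg(6.4, 3.2)
ahead = in_coverage((0.0, 0.0), 0.0, fov, 10.0, (5.0, 2.0))
to_the_side = in_coverage((0.0, 0.0), 0.0, fov, 10.0, (0.0, 5.0))
too_far = in_coverage((0.0, 0.0), 0.0, fov, 10.0, (20.0, 0.0))
```

A shorter focal length widens the wedge, while changing the position or heading moves it, which is why all three parameters have to be chosen together for each sensor.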
The data delivered from these sensors also needs to be properly interpreted, because the viewership data that each piece of equipment provides has been spatially sampled from the whole viewership at the site. The ultimate goal of the audience measurement system is to estimate the site-wide viewership for the display; it is therefore crucial to extrapolate the site-wide viewership data from the sampled viewership data in a mathematically sound way.
The present invention provides a comprehensive solution to the problem of automatic media measurement, from the problem of sensor placement for effective sampling to the method of extrapolating spatially sampled data.
Prior attempts to measure the degree of public exposure for media, including broadcast media and publicly displayed media, can be found in the following disclosures:
U.S. Pat. No. 4,858,000 of Lu, et al. (hereinafter Lu U.S. Pat. No. 4,858,000) and U.S. Pat. No. 5,771,307 of Lu, et al. (hereinafter Lu U.S. Pat. No. 5,771,307) introduce systems for measuring viewing behavior of broadcast media by identifying viewers from a predetermined set of viewers, based primarily on facial recognition. U.S. patent application Ser. No. 11/818,554 of Sharma, et al. (hereinafter Sharma) introduces a method to measure viewership of displayed objects using computer vision algorithms. The present invention also aims to measure viewing behavior of an audience using visual information; however, it focuses on publicly displayed media for a general unknown audience. The present invention utilizes an automated method similar to Sharma to measure the audience viewership by processing data from visual sensors. The present invention provides not only a method of measuring viewership, but also a solution to a much broader class of problems, including the data sampling plan and the data extrapolation method based on the site, display, and crowd analysis, to design an end-to-end comprehensive audience measurement system.
All of the systems presented in U.S. Pat. No. 6,958,710 of Zhang, et al. (hereinafter Zhang), and U.S. Pat. No. 7,176,834 of Percy, et al. (hereinafter Percy) involve portable hardware and a central communication/storage device for tracking audience and transmitting/storing the measured data. They rely on a predetermined number of survey participants to carry the devices so that their behavior can be measured based on the proximity of the devices to the displayed media. The present invention can measure a very large number of audience behaviors without relying on recruited participants or carry-on devices. It can accurately measure not only the proximity of the audience to the media display, but also the actual viewing time and duration based on the facial images of the audience. These prior inventions can only collect limited measurement data sampled from a small number of participants; the present invention, however, provides a scheme to extrapolate the measurement data sampled from the camera views, so that the whole site-wide viewership data can be estimated.
There have been attempts to design a camera platform or a placement method for the purpose of monitoring a designated area, such as U.S. Pat. No. 3,935,380 of Coutta, et al. (hereinafter Coutta), U.S. Pat. No. 6,437,819 of Loveland, et al. (hereinafter Loveland), U.S. Pat. No. 6,879,338 of Hashimoto, et al. (hereinafter Hashimoto U.S. Pat. No. 6,879,338), and U.S. Pat. Pub. No. 20100259539 of Papanikolopoulos, et al. (hereinafter Papanikolopoulos).
Coutta presented a method to place multiple cameras for monitoring a retail environment, especially the cash register area. Because the area to be monitored is highly constrained, the method does not need a sophisticated methodology to optimize the camera coverage, such as the one the present invention aims to provide.
Loveland presents a pan/tilt/zoom camera system for the purpose of monitoring an area and tracking people one by one, while the present invention aims to find an optimal placement of cameras so that the cameras have maximal concurrent coverage of the area and of multiple people at the same time.
Hashimoto employs multiple outward facing cameras to have a full coverage of the surroundings, while the present invention provides a methodology to place cameras to have maximal coverage given the constraints of the number of cameras and the constraints of the site, display, and the measurement algorithm.
Papanikolopoulos presents a method for placing cameras in optimal positions to maximize the observability of motion paths or activities of the crowd taking into account the obstacles in a site. Although it places cameras in optimal positions, its optimality is only for the crowd observability in the site, while the optimality of the present invention is for the observability of the viewers for a specific target display.
There have also been prior attempts to count or monitor people in a designated area by automated means.
U.S. Pat. No. 5,866,887 of Hashimoto, et al. (hereinafter Hashimoto U.S. Pat. No. 5,866,887) and U.S. Pat. Pub. No. 20060171570 of Brendley, et al. (hereinafter Brendley) use special sensors (distance measuring and pressure mat sensors, respectively) placed in a designated space, so that they can count the number of people passing and, in the case of Brendley, classify the kind of traffic. In the present invention, by contrast, the visual sensor based technology can measure not only the amount of traffic, but also the direction of the traffic, and over wider areas.
U.S. Pat. Pub. No. 20070032242 of Goodman, et al. (hereinafter Goodman) introduces using the tracking of active wireless devices, such as mobile phones or PDAs, so that the people carrying these devices can be detected and tracked. The crowd estimation method of the present invention can measure the crowd traffic without any requirement of the people carrying certain devices, and without introducing potential bias toward business people or bias against seniors or children.
U.S. Pat. No. 6,987,885 of Gonzalez-Banos, et al. (hereinafter Gonzalez-Banos), U.S. Pat. No. 6,697,104 of Yakobi, et al. (hereinafter Yakobi), and U.S. Pat. No. 7,203,338 of Ramaswamy, et al. (hereinafter Ramaswamy) detect and count the number of people in a scene by processing video frames to detect people. One of the exemplary embodiments of the present invention utilizes top-down view cameras so that person detection and tracking can be carried out effectively, where each individual person in the crowd is tracked so that both the crowd density and direction can be estimated. These prior inventions do not address crowd direction.
U.S. Pat. No. 7,139,409 of Paragios, et al. (hereinafter Paragios) measures the pattern of crowd motion without explicitly detecting or tracking people. One of the exemplary embodiments of the present invention also makes use of such crowd dynamics estimation; however, it does so as a part of a comprehensive system where the goal is to extrapolate the sampled viewership measurement based on the crowd dynamics.
U.S. Pat. Pub. No. 20080004953 of Ma, et al. (hereinafter Ma) characterizes the audience of a public display or advertising system using sensors and builds an audience distribution model. The audience distribution model is utilized to deliver contents to a matched audience. Although it analyzes and characterizes the audience of the target display, it is not able to estimate the site-wide audience, since it measures the audience only within the sensing range of the sensors employed. The present invention, however, estimates the site-wide viewership of a target display by viewership extrapolation.
There have been prior attempts to learn a general mapping based on available training data, such as U.S. Pat. No. 5,682,465 of Kil, et al. (hereinafter Kil) and U.S. Pat. No. 5,950,146 of Vapnik, et al. (hereinafter Vapnik).
The present invention makes use of a statistical learning method similar to Kil or Vapnik, where the input-output relation observed over a large number of data samples can be used to learn a regression function. In the present invention, the regression function is used to compute the viewership extrapolation mapping.
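The idea of learning an extrapolation mapping from paired observations can be sketched as follows. This is only an illustration under stated assumptions: it substitutes ordinary least squares on a single input for the learning methods of Kil or Vapnik, and the function names and training values are hypothetical rather than measured data.

```python
def fit_linear_extrapolation(sampled, site_wide):
    """Fit site_wide ~ slope * sampled + intercept by ordinary least squares.

    `sampled` holds viewership counts observed inside the camera views and
    `site_wide` holds the matching ground-truth counts for the whole site
    (both are assumed to be available for a training period).
    """
    n = len(sampled)
    if n != len(site_wide) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(sampled) / n
    mean_y = sum(site_wide) / n
    sxx = sum((x - mean_x) ** 2 for x in sampled)
    if sxx == 0:
        raise ValueError("sampled counts must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(sampled, site_wide))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def extrapolate(sampled_count, slope, intercept):
    """Map a sampled viewership count to an estimated site-wide count."""
    return slope * sampled_count + intercept

# Hypothetical training pairs (sampled count, site-wide count).
sampled_counts = [10, 20, 30, 40]
site_wide_counts = [35, 65, 95, 125]

slope, intercept = fit_linear_extrapolation(sampled_counts, site_wide_counts)
estimate = extrapolate(25, slope, intercept)
```

A practical mapping would likely take additional inputs, such as crowd density, crowd direction, and time of day, so that the extrapolation can vary with the crowd dynamics rather than apply one fixed ratio.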
In summary, the present invention aims to measure the media viewership using automated and unobtrusive means that employ computer vision algorithms, which is a significant departure from methods using devices that need to be carried by a potential audience. It also provides comprehensive solutions to the sensor placement issue and the data extrapolation issue, based on the site and display analysis; these features also contrast with inventions that introduce only measurement algorithms. There have been prior inventions that address the problem of sensor placement for the purpose of monitoring people's behavior, but the present invention provides an optimal solution to the issue based on the analysis of the site and the display. The present invention employs statistical machine learning approaches, similarly to some of the prior inventions, to extrapolate the sampled viewership data and thereby estimate the site-wide viewership data; the method of the present invention utilizes the learning approach to achieve time-dependent extrapolation of the viewership data based on the insights from the crowd and viewership analysis.