1. Field of Invention
The invention relates generally to the field of computer-assisted data manipulation and analysis. Specifically, in one exemplary aspect, the invention relates to methods and apparatus for collection and classification of data regarding an audience in a content-based network such as a cable television or satellite network.
2. Description of Related Technology
“Nielsen Ratings” are a well-known system of evaluating the viewing habits of cross sections of the population. When collecting Nielsen ratings, companies use statistical techniques to develop a sample population which is a cross section of a larger national population. Theoretically, the viewing habits of the sample population will mirror those of the larger population. The companies then measure the sample population's viewing habits to identify, among other things, what programs the population is watching as well as the time and frequency at which those programs are watched. This information is then extrapolated to gain insight into the viewing habits of the larger population. Historically, the Nielsen system has been the primary source of audience measurement information in the television industry. The Nielsen system, therefore, affects various aspects of television including, inter alia, advertising rates, schedules, viability of particular shows, etc., and has also recently been expanded from measuring an audience of program content to measuring an audience of advertising (i.e., Nielsen ratings may be provided for advertisements themselves).
The Nielsen system collects data regarding audiences either (i) by asking viewers of various demographics to keep a written record of the television shows they watch throughout the day and evening, or (ii) by using “set meters,” which are small devices connected to televisions in selected homes which electronically gather the viewing habits of the home and transmit the information nightly to Nielsen or a proxy entity over a connected phone line or other connection.
There are several disadvantages to the Nielsen approach. First, the sample of viewers selected may not be fairly representative of the population of viewers (or the subset of cable viewers) as a whole. For example, in a cable network comprising four million cable viewers, a sample of any 100,000 viewers may exhibit different average viewing habits than the averages associated with the other 3,900,000 cable viewers who are not in the sample.
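The sampling concern above can be illustrated with a minimal arithmetic sketch. Only the 4,000,000 and 100,000 figures come from the text; the two viewer segments, their weekly viewing hours, and the skewed makeup of the sample are hypothetical assumptions chosen purely for illustration.

```python
# Hypothetical illustration: only the 4,000,000 / 100,000 split is from the text.
# The segments, hours, and sample composition below are assumed values.
HOURS = {"heavy": 30, "light": 10}  # assumed mean weekly viewing hours per segment

population = {"heavy": 1_000_000, "light": 3_000_000}  # 4,000,000 cable viewers
sample = {"heavy": 40_000, "light": 60_000}            # 100,000 viewers, skewed toward heavy


def mean_hours(counts):
    """Average weekly viewing hours across the viewers in `counts`."""
    total_viewers = sum(counts.values())
    total_hours = sum(counts[segment] * HOURS[segment] for segment in counts)
    return total_hours / total_viewers


# The 3,900,000 viewers who are not in the sample.
remainder = {segment: population[segment] - sample[segment] for segment in population}

sample_mean = mean_hours(sample)
remainder_mean = mean_hours(remainder)

print(f"sample mean:    {sample_mean:.2f} hours/week")
print(f"remainder mean: {remainder_mean:.2f} hours/week")
```

Under these assumed numbers, the sample over-represents heavy viewers, so any average extrapolated from it overstates the viewing of the 3,900,000 viewers outside the sample.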
Second, static delivery makes it difficult to precisely target an audience that is known to be in the market. For example, suppose that the ideal target for a sports car advertisement is the set of all consumers who like and would be interested in buying sports cars. If all that is known from Nielsen data is that 10% of the sample group has watched the auto-racing channel for over three hours in the last month, this group may not perfectly correlate with the set of consumers who like sports cars. This may be the case, for example, if there are some consumers who are in the market for sports cars but who never watch the auto-racing channel, or if there are some viewers of the auto-racing channel who have no interest in buying or owning sports cars. As such, patterns based on viewership data often identify the desired audience only imprecisely.
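The mismatch described above can be sketched with simple set operations. The viewer identifiers and group memberships below are hypothetical and appear nowhere in the source; they serve only to show how a viewership-based proxy can both include uninterested viewers and miss interested consumers.

```python
# Hypothetical viewer IDs, assumed for illustration only.
watched_racing = {1, 2, 3, 4, 5}  # watched the auto-racing channel for over three hours
in_market = {4, 5, 6, 7}          # actually interested in buying a sports car

reached = watched_racing & in_market  # in-market consumers the proxy would target
wasted = watched_racing - in_market   # racing viewers with no interest in sports cars
missed = in_market - watched_racing   # in-market consumers who never watch the channel

# Share of targeted viewers who are in the market, and share of the market targeted.
precision = len(reached) / len(watched_racing)
recall = len(reached) / len(in_market)

print(f"reached={sorted(reached)} wasted={sorted(wasted)} missed={sorted(missed)}")
print(f"precision={precision:.2f} recall={recall:.2f}")
```

In this assumed example, targeting by viewership alone spends most impressions on uninterested viewers while reaching only half of the consumers who are actually in the market.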
Furthermore, the Nielsen system is disadvantageously program-specific. Program-specific audience data collection is problematic in that this program-coupled approach is only as good as the underlying demographic correlation model. For example, assuming a demographic of 18-30 year old females typically tunes in to American Idol each broadcast (e.g., Monday at 8:00 pm), this same demographic may not have any interest in watching the program immediately preceding or following American Idol, and hence may tune away (or delay tuning to that channel until the start of American Idol).
Another disability of the Nielsen approach is that it tends to aggregate data or results for given premises (e.g., households) as opposed to providing data for specific users of that premises. For example, the switching activity associated with a given set top box for a family of five represents switching activity for each member of that family (including perhaps viewing of cartoons for a child, teen-related programs for a teenager, and adult-related content for one or more adults). However, Nielsen systems are at present incapable of determining precisely which member(s) of that household viewed which programs or advertisements. Hence, the data obtained using Nielsen techniques is somewhat of an amalgam of the data for individual users, and various combinations thereof.
For media content providers such as cable and satellite companies and the like, a major issue is how to more accurately target population segments for advertising campaigns based on particular characteristics of an audience, opportunities for insertion (or replacement) of an advertisement, and other factors. Advertisers most desire that advertisements for products targeted to a particular demographic actually be viewed by that demographic.
Therefore, there is a need for improved methods and apparatus which do not require or rely solely on population sampling or trend analysis based on a sample population, in order to more accurately generate and analyze audience measurement data. Such improved methods and apparatus would ideally be able to gather audience information in real time or near-real time with associated viewership actions of actual viewers. Exemplary methods would be able to obtain audience information directly from customer premises equipment (e.g., set top boxes, cable modems, etc.), for each individual box or even on a per-user basis where possible, thereby allowing a content provider to gather specific information in large quantities across a broad geographical area. Ideally, these methods and apparatus would be able to monitor or use data from multiple sources of content to which viewership behavior relates, and also maintain subscriber anonymity or privacy (i.e., no use of personally identifiable information).
These features would also ideally be provided by leveraging substantially extant network infrastructure and components, and would be compatible with a number of different client devices and delivery systems including both wired and wireless technologies.