Given the high penetration of Internet use and the rapid growth of the on-line industry, a need has arisen for an accurate and independent Internet site rating service. Such a service should provide on-line industry users, organisations and other interested parties with a precise vehicle with which to assess vital Internet site traffic dynamics. For example, it would be advantageous for such users and organisations to have an accurate picture of the information that Internet users are viewing and interacting with on particular websites, as well as the range of sites that target markets are visiting, the advertisements being viewed and how particular sites compare statistically with competitor sites. This type of commercial information is invaluable to those in the on-line industry wishing to properly target their markets and focus their on-line presence.
Furthermore, to date there has been no product or service for on-line industry users and organisations that provides a total market rating system using site centric measurements, such as proxy and server log files, browser based measurements, and user centric measurements, such as panel data and sample survey data. Nor have site and user centric measurements been used together to derive statistics for, for example, a website that has no site centric measurement data available. Providing sites with such information would give a more accurate picture of the Internet population and of which sites that population uses or visits, regardless of whether site centric measurements are available for a particular site.
A syndicated multi media marketing data base has been used in Australia which integrates consumer demographics, product usage and media consumption for value-added marketing and media solutions. The data base enables advertising planners, buyers and users to target their advertising campaigns and to plan and evaluate integrated media campaigns based on the only official buying and selling currencies for mainstream Australian media. The data base utilises the strengths of the media industry's most widely used research tools, such as TV ratings data, radio ratings data, readership surveys and service usage questionnaires. Each reporting period, the operator of this data base uses a combination of data to integrate TV viewing data, updated each period, at the program level into a respondent single source data set which may comprise up to, say, 40,000 respondents. This is used as a more integrated method of producing data sets capable of cross-referencing television with other media and consumption variables. The approach allows viewing information from the audited television ratings to be analysed against usage, consumption and other media information. The television data base is refreshed periodically so that the most current television program data is available, with ratings consistent with the operator of the data base.
The abovementioned system does not allow the “fusion” of two data sources: one created by measuring the interactions of a sample of users with a resource (for example, their use of the Internet), and a further source pertaining to the interactions of all users of the resource, as measured at, for example, a website, or by a television station for the viewers of a program. Consequently, it cannot produce accurate estimates of traffic densities at, for example, a particular website or television program where that website or television station does not have the further source available.
Known measurement techniques include server log file analysis. In this method a log file is kept on the server, recording all files requested, the IP addresses of those visiting the site and the successful download of all resources delivered from the site server. This method, however, is not necessarily an accurate indication of the resources used and/or viewed on the site, because it cannot account for resources that are subsequently stored in proxy server caches or browser caches and re-viewed from there. For example, popular web pages may be stored on the proxy servers of various Internet Service Providers (ISPs) around the world, so that the ISPs do not need to directly access a popular site every time a user requests access to that site. The ISP simply provides access to its stored version of the site. This enables the ISPs to provide a more efficient service, but results in a less accurate measurement service because requests served from caches cannot be monitored.
Similarly, once a site is accessed, site resources are saved in the user's browser cache, while in use. While the server log file analysis may have recorded data relating to the accessed resources at the time they were accessed, if the user then returns to one or more pages, such as by hitting the “back” button on their browser, then the resource being returned to is typically accessed from their browser cache, so that once again this page request is not recorded by the server log file.
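The undercounting described above can be illustrated with a small simulation. The following is a minimal sketch only, not part of any known measurement product: the function name `simulate_server_log`, the request tuples and the assumption that cached copies never expire are all hypothetical and chosen purely for illustration. It compares the number of page views that actually occur with the number of requests that reach the site's server log once browser and ISP proxy caches intervene.

```python
from collections import Counter


def simulate_server_log(requests):
    """Compare true page views with what a server log would record.

    `requests` is a list of (user, isp, url) tuples in time order.
    Assumption (for illustration only): caches never expire, so a page
    is fetched from the site server at most once per browser and at
    most once per ISP proxy.
    """
    browser_cache = {}  # user -> set of urls held in that user's browser
    proxy_cache = {}    # isp  -> set of urls held by that ISP's proxy
    true_views = Counter()
    server_log = Counter()

    for user, isp, url in requests:
        true_views[url] += 1  # the user really did view the page

        # Browser cache hit (e.g. the "back" button): no network request.
        if url in browser_cache.setdefault(user, set()):
            continue
        browser_cache[user].add(url)

        # Proxy cache hit: the ISP serves its stored copy.
        if url in proxy_cache.setdefault(isp, set()):
            continue
        proxy_cache[isp].add(url)

        # Only now does the request reach the site server and its log.
        server_log[url] += 1

    return true_views, server_log


# Hypothetical traffic: three users, two ISPs.
requests = [
    ("alice", "isp1", "/home"),
    ("alice", "isp1", "/news"),
    ("alice", "isp1", "/home"),  # "back" button, served by browser cache
    ("bob", "isp1", "/home"),    # served by isp1's proxy cache
    ("carol", "isp2", "/home"),
]
true_views, server_log = simulate_server_log(requests)
print(dict(true_views))  # {'/home': 4, '/news': 1}
print(dict(server_log))  # {'/home': 2, '/news': 1}
```

In this hypothetical run the home page is viewed four times but logged only twice, which is the kind of discrepancy that makes server log file analysis alone an unreliable indicator of resources actually viewed.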
Another method used by some organisations is the so-called browser based measurement approach. In this method, software monitors site resources as they are viewed within a browser. This software monitors the user's actions when accessing the Internet. While this approach does not suffer the accuracy problems of server log file analysis, a complete market analysis requires all sites to agree to install the measurement code on every site page. In practice, it has proven quite difficult to obtain the cooperation of all sites.
In another method, also used by some organisations, Internet users are recruited and their individual usage of the Internet is monitored for use in statistical analysis. Usage is monitored by installing hardware and/or software on the user's computer. This hardware or software is not transparent to the user and is often quite onerous, requiring the user to log on to the software each time they use it.
An example of this method is provided in U.S. Pat. No. 5,675,510, where personal computer use is measured through the use of a hardware box physically located on the user's computer. This hardware records log files of Internet access by the user. This process is expensive due to the hardware costs, installation costs and maintenance and support costs. The process is also quite obtrusive, as users are very conscious of the tracking because they see the box every time they use their PC. Furthermore, the process does not track a monitored user's access where, for example, that user accesses the Internet at a location other than the user's home or work. Examples of locations that are not monitored are cyber cafés, educational facilities, friends' homes, etc.
There is considered to be a need for an alternative measurement approach that provides accurate results and also has improved transparency for the user.