The proliferation of data in the information age is only an advantage if the information delivered is accessible and reliable. As such, database technology has advanced considerably in recent years in an effort to keep pace with the exponential growth in the amount of data that must be handled.
An example of database use comes in the field of mobile telecommunications. Mobile devices are increasingly able to perform tasks above and beyond simple telephone calls or short message service (SMS) communications. There is now a bewildering array of mobile devices on the marketplace, each with its own unique set of capabilities. Content providers need to know what these capabilities are if they are to service the full range of devices correctly.
For example, a significant proportion of mobile devices are now able to access the world wide web. Web pages and other web resources may be specifically tailored for mobile devices, such as content coded using wireless markup language (WML) and accessible via wireless application protocol (WAP). However, mobile devices are increasingly capable of handling content generated for wider use (for instance, by personal computers), including content coded using hypertext markup language (HTML), extensible markup language (XML), and extensible hypertext markup language (XHTML).
Information about the capabilities of mobile devices is stored in a database which defines various characteristics of the device. This database will be referred to hereinafter as a “device database”. The characteristics are represented by attributes which take a particular value in dependence on the device. For example, the display capabilities of a device depend on various characteristics, such as the pixel width of the device's screen. This width is described by an integer attribute which takes a specific value for a particular device.
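By way of illustration only, a device database of this kind might be sketched as a mapping from a device identifier to a record of typed attributes. The record layout, the attribute names other than the pixel width, the device identifiers and all values below are assumptions made for the example and are not taken from any real device database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeviceRecord:
    """Hypothetical record holding the attributes of one device."""
    screen_width_px: int     # integer attribute: pixel width of the screen
    colour_depth_bits: int   # assumed attribute, for illustration
    supports_markup: bool    # assumed Boolean attribute, for illustration


# Hypothetical identifiers and values; not real device data.
DEVICE_DATABASE = {
    "example-phone-a": DeviceRecord(240, 16, True),
    "example-phone-b": DeviceRecord(128, 8, False),
}


def get_attribute(device_id, name):
    """Return the value of one attribute, or None if the device is unknown."""
    record = DEVICE_DATABASE.get(device_id)
    if record is None:
        return None
    return getattr(record, name)
```

Each attribute takes one value per device, so a query such as get_attribute("example-phone-a", "screen_width_px") returns the integer stored for that device.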
Given the differing capabilities of mobile devices, it is not appropriate to deliver the same web content to all devices. As a result, rather than providing a web resource in exactly the same way to every mobile device that requests it, web servers increasingly adapt the content according to the attributes of the requesting device. To do this effectively, the web server must have easy access to a reliable device database.
For instance, a web server might use a device database to determine the requesting device's screen width and colour depth in order to adjust an image so that it fits and displays well on the device. Once a device is detected, the device database is queried to get the device attributes needed to adapt the web site's content to meet the needs and capabilities of the requesting device.
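The adaptation step described above might be sketched as follows. This is a minimal example under stated assumptions: the device identifiers, the attribute names, the fallback width and the function fit_image_to_device are all hypothetical, and a real web server would also need to detect the device and resample the image itself.

```python
# Hypothetical device database: identifiers and values are illustrative only.
DEVICE_DATABASE = {
    "example-phone-a": {"screen_width_px": 240, "colour_depth_bits": 16},
    "example-phone-b": {"screen_width_px": 128, "colour_depth_bits": 8},
}

DEFAULT_SCREEN_WIDTH_PX = 176  # assumed fallback for unrecognised devices


def fit_image_to_device(device_id, image_width, image_height):
    """Return (width, height) scaled down to fit the device's screen width.

    The aspect ratio is preserved; images that already fit are unchanged.
    """
    attributes = DEVICE_DATABASE.get(device_id, {})
    screen_width = attributes.get("screen_width_px", DEFAULT_SCREEN_WIDTH_PX)
    if image_width <= screen_width:
        return image_width, image_height
    scale = screen_width / image_width
    return screen_width, max(1, round(image_height * scale))
```

Because the output depends entirely on the stored width, an incorrect value in the database leads directly to an image that is too large or needlessly small on the requesting device.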
Clearly, to deliver properly formatted content, the device database needs to be accurate. An error in the database could result in a poor formatting decision which could in turn leave the consumer very dissatisfied and unwilling to return to the site. As of mid-2008, there were over 23,000 different makes and models of GSM handsets, and it is therefore impractical for a web site developer to test their site against each device that might access it. As such, the quality of the device database used to deliver content is of the utmost importance.
To this end, a great deal of effort and investment has been put into the construction of accurate device databases. However, device databases still fall short of optimum accuracy and there remains a need for more reliable device databases.
In his book, “The Wisdom of Crowds: Why the Many Are Smarter Than the Few and How Collective Wisdom Shapes Business, Economies, Societies and Nations”, James Surowiecki argued that aggregating information from a variety of members of a group can result in a decision which is superior to any that could have been made by a single member of the group. A cited example is that of a crowd guessing the weight of an ox; the average result is remarkably accurate, and more accurate than the guess of any one individual, even if that individual is an expert.
This “crowdsourcing” technique is one of the principles behind the information technology movement known as “Web 2.0”. Web 2.0 applications attempt to provide superior services and information by enabling the participation of all (or at least a number of) users. For example, the well-known reference website “Wikipedia.org” relies on the combined expertise of its users to produce an encyclopaedia covering topics that could not be covered by the initial proponents of the system alone.
In order to successfully apply crowdsourcing techniques to improve the accuracy of results, it has been proposed that four criteria must be met:
1. diversity of opinion;
2. independence of members from one another;
3. decentralisation; and
4. a good method for aggregating opinions.
The first three of these criteria relate essentially to the origin of the raw data; that is to the characteristics of the members of the crowd. The fourth, “a good method for aggregating opinions”, relates to how that data is used to produce the desired result.
Returning to the example of device databases, it is to be recalled that a large number of attempts have been made to produce an adequate data set. This data has been collected in a broad variety of ways (fulfilling criterion 1), and the collection has been undertaken by independent organisations with no central control (fulfilling criteria 2 and 3).
However, no conventional technique provides an adequate method of aggregating the data for this kind of database. A simple average of results for a given variable is undesirable, as it fails to distinguish between data obtained from a reliable source and that obtained from a less reliable source. Moreover, it is inappropriate for Boolean attributes that simply take a “True” or “False” value (such as whether the device supports XHTML).
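The shortcomings of a simple average can be illustrated with a small sketch. The source names, reported values and reliability weights below are invented for the example. The reliability-weighted vote is included only as a contrast; it is an assumption of this sketch and is not put forward as the aggregation method of any particular device database.

```python
from collections import defaultdict

# Hypothetical reports of one device's screen width, as
# (source name, reported value, assumed reliability weight between 0 and 1).
WIDTH_REPORTS = [
    ("source_a", 240, 0.9),
    ("source_b", 240, 0.8),
    ("source_c", 320, 0.2),
]

# Hypothetical reports of a Boolean attribute from the same three sources.
BOOLEAN_REPORTS = [True, True, False]


def simple_average(values):
    """Plain arithmetic mean; every source counts equally."""
    values = list(values)
    return sum(values) / len(values)


def weighted_vote(reports):
    """Illustrative contrast: pick the value with the most total weight."""
    totals = defaultdict(float)
    for _source, value, weight in reports:
        totals[value] += weight
    return max(totals, key=totals.get)


# The mean of 240, 240 and 320 matches none of the reported widths.
average_width = simple_average(value for _s, value, _w in WIDTH_REPORTS)

# The mean of True, True, False is a fraction, not a "True" or "False" value.
average_flag = simple_average(BOOLEAN_REPORTS)

# The weighted vote returns one of the values that was actually reported.
voted_width = weighted_vote(WIDTH_REPORTS)
```

In this example the simple average lets the least reliable source pull the width away from the value on which the two more reliable sources agree, and for the Boolean attribute it yields a number that is neither permitted value.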