Tim Berners-Lee, the inventor of the World Wide Web, coined the phrase “Semantic Web” in the late 1990s to describe “a web of data that can be processed directly and indirectly by machines.” He saw this as the next step in the evolution of the World Wide Web beyond its current form as a “web of documents” that machines cannot process directly. However, the transition from the current World Wide Web to the Semantic Web is not guaranteed. It depends on several major development efforts, some carried out by individual parties and others coordinated among multiple entities.
The first is the creation and availability of metadata about the information contained in documents and datasets that reside in public and private locations on the internet. Markup languages currently in use, such as Hypertext Markup Language (HTML), describe how data is structured and presented but not what the data means. However, if accompanying metadata identified data as, for example, calendar data, financial data, or another specific type, machines could recognize it as such and perform tasks and queries that currently require human intelligence and action.
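A minimal sketch of this idea in Python follows. It assumes records carry a JSON-LD-style "@type" key; the type names are borrowed from the schema.org vocabulary purely for illustration, and the handler functions and sample records are hypothetical.

```python
import json

# Three records: two carry a "@type" annotation, one has no metadata.
# The vocabulary (schema.org "Event" and "MonetaryAmount") is an assumption.
DOCUMENT = """
[
  {"@type": "Event", "name": "Board meeting", "startDate": "2024-05-02T09:00"},
  {"@type": "MonetaryAmount", "value": 1250.0, "currency": "USD"},
  {"name": "Untyped text", "body": "12:30 lunch, about 20 dollars"}
]
"""

def describe_event(item):
    """Handle a record that the metadata identifies as calendar data."""
    return f"calendar entry '{item['name']}' starting {item['startDate']}"

def describe_amount(item):
    """Handle a record that the metadata identifies as financial data."""
    return f"financial amount {item['value']:.2f} {item['currency']}"

# Dispatch table: the declared type decides how a record is processed.
HANDLERS = {"Event": describe_event, "MonetaryAmount": describe_amount}

def interpret(item):
    """Process a record by its declared type, if it has one."""
    handler = HANDLERS.get(item.get("@type"))
    if handler is None:
        return "unrecognized data (no usable metadata)"
    return handler(item)

results = [interpret(item) for item in json.loads(DOCUMENT)]
for line in results:
    print(line)
```

The third record contains a time and an amount that a person would spot at once, but without metadata the program has no reliable way to treat them as calendar or financial data.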
The second requirement is the adoption of open formats and standards for marking up metadata. The vision of the Semantic Web will be achieved only if users create metadata with the same open standards, so that machines can communicate with one another easily. The third is the creation of structured hierarchies of semantic information (ontologies), together with the publication of large datasets in publicly accessible, queryable repositories.
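To make the notions of an ontology and a queryable repository more concrete, the sketch below stores facts as (subject, predicate, object) triples in plain Python and answers a query by walking a small class hierarchy. Every identifier here (the class names, the instances, and the "subClassOf" and "type" predicates) is a hypothetical stand-in, and a real deployment would rely on shared open standards rather than this toy store.

```python
# Facts as (subject, predicate, object) triples; all identifiers are hypothetical.
TRIPLES = {
    ("GolfCourse", "subClassOf", "SportsVenue"),
    ("SportsVenue", "subClassOf", "Place"),
    ("pine_hills", "type", "GolfCourse"),
    ("city_arena", "type", "SportsVenue"),
    ("town_library", "type", "Place"),
}

def match(pattern, triples=TRIPLES):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [t for t in triples
            if all(p is None or p == v for p, v in zip(pattern, t))]

def subclasses_of(cls):
    """Return cls plus every class below it in the hierarchy."""
    found = {cls}
    frontier = [cls]
    while frontier:
        parent = frontier.pop()
        for child, _, _ in match((None, "subClassOf", parent)):
            if child not in found:
                found.add(child)
                frontier.append(child)
    return found

def instances_of(cls):
    """Return all instances of cls, including instances of its subclasses."""
    classes = subclasses_of(cls)
    return sorted(s for s, _, o in match((None, "type", None)) if o in classes)

print(instances_of("SportsVenue"))  # ['city_arena', 'pine_hills']
print(instances_of("Place"))        # ['city_arena', 'pine_hills', 'town_library']
```

Because the hierarchy records that a golf course is a kind of sports venue, a query for sports venues finds the golf course even though no fact states that directly.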
If all of this were to take place on a mass scale throughout the internet, software programs could combine and analyze information, generate meaningful insights, and perform many tasks that currently require human intelligence and effort. For example, if all golf courses published their tee-time schedules in a common, open calendar standard, a user could run a single query that searched the calendars of all nearby courses for available tee times, cross-checked the results against the user's personal calendar, and automatically made a reservation.
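The tee-time scenario can be sketched in a few lines of Python once the schedules share one format. The course names, times, four-hour round length, and function names below are all assumptions made for illustration; the sketch also skips the hard part, which is every course actually publishing its schedule in a common standard.

```python
from datetime import datetime, timedelta

ROUND_LENGTH = timedelta(hours=4)  # assumed duration of a round of golf

# Hypothetical published schedules: course name -> open tee times (ISO 8601).
COURSE_SCHEDULES = {
    "Pine Hills": ["2024-06-01T08:00", "2024-06-01T13:30"],
    "Lakeside": ["2024-06-01T10:30", "2024-06-01T15:00"],
}

# Hypothetical personal calendar: (start, end) of existing commitments.
PERSONAL_CALENDAR = [
    ("2024-06-01T07:00", "2024-06-01T10:00"),
    ("2024-06-01T18:00", "2024-06-01T19:00"),
]

def parse(ts):
    return datetime.fromisoformat(ts)

def is_free(start, end, busy):
    """True if [start, end) overlaps none of the busy intervals."""
    return all(end <= b_start or start >= b_end for b_start, b_end in busy)

def available_tee_times(schedules, calendar):
    """Return (start, course) pairs that fit the calendar, earliest first."""
    busy = [(parse(s), parse(e)) for s, e in calendar]
    options = []
    for course, times in schedules.items():
        for ts in times:
            start = parse(ts)
            if is_free(start, start + ROUND_LENGTH, busy):
                options.append((start, course))
    return sorted(options)

def reserve_first(schedules, calendar):
    """Book the earliest fitting tee time by adding it to the calendar."""
    options = available_tee_times(schedules, calendar)
    if not options:
        return None
    start, course = options[0]
    end = start + ROUND_LENGTH
    calendar.append((start.isoformat(timespec="minutes"),
                     end.isoformat(timespec="minutes")))
    return course, start

my_calendar = list(PERSONAL_CALENDAR)  # copy so the sample data stays unchanged
booking = reserve_first(COURSE_SCHEDULES, my_calendar)
if booking:
    course, start = booking
    print(f"Reserved {course} at {start:%H:%M}")
```

The query logic itself is simple interval checking. What makes it possible is that every schedule and the personal calendar use the same machine-readable time format.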
There have been pockets of limited progress. Some national and local governments are making certain data repositories available in usable formats. There is also some progress in the private sector: for example, XBRL (eXtensible Business Reporting Language) is a markup language that has been widely adopted for tagging financial data.
However, for the most part, from the perspective of individual users, there has been little progress toward a semantically enabled, intelligent experience of the type described in the golf example above. Given the slow pace of evolution, such an experience is likely to be many years away.