The present invention relates generally to the Internet and electronic commerce, or “e-commerce”. More particularly, the present invention relates to a system and method for the transformation and canonicalization of semantically structured data.
The Internet has developed into a medium by which a person using a computer that is connected to the Internet can access voluminous amounts of information. The ability to access information via the Internet can be provided in a variety of different ways. Sometimes information is provided by Internet search engines, which typically search the Internet for key words or phrases and then provide a list of web sites which include the search words or phrases in the web page, such as in its text or embedded identifiers (e.g., metatags). Information is also accessible via the Internet through individual web sites. Individual web sites provide a wide variety of information and services, some of which are time-critical and some of which are not time-dependent.
The Internet is especially conducive to conducting electronic commerce. Many Internet servers have been developed through which vendors can advertise and sell their products or services. Such products or services may include items (e.g., music) that are delivered electronically to the purchaser over the Internet and items (e.g., books) that are delivered through conventional distribution channels (e.g., a common carrier). The services can include providing information (e.g., weather, traffic, movies, cost comparisons) that is available over the Internet and transactions (e.g., stock trading, restaurant reservations) that are carried out over the Internet.
Unfortunately, while the Internet provides users with the potential to access a tremendous amount of information, finding useful Internet-based information is often time-consuming and cumbersome. Further, it is difficult to find and compare the same information available at multiple individual web sites because the same information can be organized in many different ways, described in many different forms, and changed at many different times. Added to these inherent difficulties with the Internet is the simple fact that a person cannot access the information available on the Internet without having a computer or other such electronic device which is connected to the Internet via an Internet Service Provider (ISP). Furthermore, to effectively find desired Internet-based information, a person must learn how to locate information via the Internet. As such, people without computers, people without connections to ISPs, and people without experience or training in the use of the Internet are limited in their access to Internet-based information. These factors help explain why industry experts estimate that, by the end of 1999, only 30% of the United States population will have ever accessed the Internet, or “surfed the web.” (Statistics from Forrester Research, October 1999).
Hence, it is desirable to provide a system and method by which people can access Internet-based information without directly using a computer, having a personal ISP connection, or gaining experience or training in the use of the Internet. In addition, it is desirable to provide a system and method which allows people to obtain Internet-based information using convenient and readily available means, such as by voice over a public telephone. Further, it is desirable to provide a system and method which transforms and canonicalizes semantically structured data such that data can be transposed to and from Internet sources and user interface platforms, such as voice.
Many challenges have heretofore made such a system and method impossible. For example, people using such a system and method would want to have the information quickly or, at least, within some tolerable amount of time. Such speed is difficult to achieve. Even with conventionally high speed computers and fast communication connections, the delay required to access the Internet has made many people call it the “world wide wait.”
Another challenge to such a system and method is the recognition of voice communications. Conventional voice recognition technology is slow and inaccurate. Convenient and meaningful access to Internet-based information by voice would require simple, quick, and accurate voice recognition. Nevertheless, known processors and memory devices do not provide the quick access to large vocabularies or the processing speeds which would be necessary for voice recognition as done in human-to-human interaction.
Yet another challenge to such a system and method is how to provide free access to Internet-based information while financially supporting the service. Conventional advertising on the Internet requires the ability to see advertising information, such as “banners”, and make some manual selection, such as “clicking” the banner, to get more information on the advertised product or service.
Therefore, in addition to the above-mentioned capabilities, it is desirable to provide a system and method by which people can gain quick and accurate voice access to Internet-based information free of charge. It is further desirable to provide a system and method by which data can be taken from Internet sources and compared with other data from other Internet sources and then provided to users of a variety of platforms, including speech and wireless application protocol (WAP).