This invention relates generally to systems for creating, maintaining and using database information. More particularly, it relates to a system for automatically creating and maintaining a database of information utilizing user opinions. Even more particularly, it relates to an Internet system assisting a population of users to automatically maintain the database content and to improve the usefulness and quality of the database information without any substantial management by the website owner-manager.
Recently, a wide range of interactive devices has been developed to provide information to consumers via communications networks. These interactive devices include, for example, computers connected to various computer on-line services, interactive kiosks, interactive television systems and the like. In particular, the popularity of computer on-line services has grown immensely over the last decade. Computer on-line services are provided by a wide variety of different companies. In general, most computer on-line services are accessed via the Internet. The Internet is a global network of computers. One popular part of the Internet is the World Wide Web, or the “Web.” The World Wide Web contains computers that display graphical and textual information. Computers that provide information on the World Wide Web are typically called “Web sites.” A Web site is defined by an Internet address that has an associated electronic page, often called a “home page.” Generally, a home page is an electronic document that organizes the presentation of text, graphical images, audio and video into a desired display. These Web sites are operated by a wide variety of entities, which are typically called “providers.”
A user may access the Internet via a dedicated high-speed line or by using a personal computer (PC) equipped with a conventional modem. Special interface software, called “browser” software, is installed within the PC. When the user wishes to access the Internet over a normal telephone line, an attached modem is automatically instructed to dial the telephone number associated with the local Internet host server. The user can then access information at any address accessible over the Internet. Two well-known web browsers, for example, are the Netscape Navigator browser marketed by Netscape Communications Corporation and the Internet Explorer browser marketed by Microsoft Corporation.
Information exchanged over the Internet is typically encoded in HyperText Mark-up Language (HTML) format. The HTML format is a scripting language that is used to generate the home pages for different content providers. In this setting, a content provider is an individual or company that places information (content) on the Internet so that others can access it. As is well known in the art, the HTML format is a set of conventions for marking different portions of a document so that each portion appears in a distinctive format. For example, the HTML format identifies or “tags” portions of a document to identify different categories of text (e.g., the title, header, body text, etc.). When a web browser accesses an HTML document, the web browser reads the embedded tags in the document so it appears formatted in the specified manner.
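As a minimal sketch of how tags mark categories of text, the following Python example uses the standard-library html.parser module to group the text of a small page by the tag that encloses it. The sample page and the TagCollector class are hypothetical and were invented for this illustration; they are not part of any system described herein.

```python
from html.parser import HTMLParser


class TagCollector(HTMLParser):
    """Group the text of a document by the innermost tag enclosing it."""

    def __init__(self):
        super().__init__()
        self._open_tags = []    # stack of tags that are currently open
        self.text_by_tag = {}   # tag name -> list of text fragments

    def handle_starttag(self, tag, attrs):
        self._open_tags.append(tag)

    def handle_endtag(self, tag):
        # Pop back to the matching opening tag, if one is open.
        if tag in self._open_tags:
            while self._open_tags.pop() != tag:
                pass

    def handle_data(self, data):
        text = data.strip()
        if text and self._open_tags:
            self.text_by_tag.setdefault(self._open_tags[-1], []).append(text)


# Hypothetical sample page (an assumption made for this example).
SAMPLE_PAGE = (
    "<html><head><title>Sample Home Page</title></head>"
    "<body><h1>Welcome</h1><p>Body text goes here.</p></body></html>"
)

collector = TagCollector()
collector.feed(SAMPLE_PAGE)
collector.close()

if __name__ == "__main__":
    for tag, fragments in collector.text_by_tag.items():
        print(tag, fragments)
```

A browser performs a comparable pass over the embedded tags before deciding how each portion of the document is displayed.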
An HTML document can also include hyperlinks, which allow a user to move from one document to another document on the Internet. A hyperlink is an underlined or otherwise emphasized portion of text that, when selected using an input device such as a mouse, activates a software connection module which allows the user to jump between documents or pages (i.e., within the same Web site or to other Web sites). Hyperlinks are well known in the art, and have been sometimes referred to as anchors. The act of selecting the hyperlink is often referred to as “clicking on” the hyperlink.
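The way a hyperlink points from one document to another can be sketched as follows. This Python example, offered only as an illustration, collects the targets of the anchor tags in a page and resolves each one against the address of the page that contains it. The page content, the example.com and example.org addresses, and the names LinkExtractor and extract_links are assumptions made for the example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect the destination of every <a href="..."> anchor in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Relative targets stay within the same Web site;
                # absolute targets may point to another Web site.
                self.links.append(urljoin(self.base_url, href))


def extract_links(html_text, base_url):
    parser = LinkExtractor(base_url)
    parser.feed(html_text)
    parser.close()
    return parser.links


# Hypothetical page and base address (assumptions for this example).
PAGE = (
    '<p>See the <a href="/about.html">about page</a> or '
    '<a href="http://www.example.org/">another site</a>.</p>'
)
links = extract_links(PAGE, "http://www.example.com/index.html")

if __name__ == "__main__":
    print(links)
```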
Glossary of General Terms and Acronyms
The following terms and acronyms are explained below as background and are used throughout the detailed description:
Client-Server. A model of interaction in a distributed system in which a program at one site sends a request to a program at another site and waits for a response. The requesting program is called the “client,” and the program which responds to the request is called the “server.” In the context of the World Wide Web, the client is typically a “Web browser” which runs on a user's computer; the program which responds to Web browser requests at a Web site is commonly referred to as a “Web server.”
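The request-and-response pattern defined above can be sketched with Python's standard library. In this illustrative example, a small server program waits on the local machine and a client program sends it one request and waits for the reply. The handler name, the page body, and the use of the loopback address with an automatically chosen port are assumptions made for the example.

```python
import threading
from http.client import HTTPConnection
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Server side: answer every GET request with a small HTML page."""

    def do_GET(self):
        body = b"<html><body>Hello from the server</body></html>"
        self.send_response(200)
        self.send_header("Content-Type", "text/html")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def run_demo():
    # Port 0 asks the operating system for any free port.
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)
    port = server.server_address[1]
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        # Client side: send a request, then wait for the response.
        connection = HTTPConnection("127.0.0.1", port, timeout=5)
        connection.request("GET", "/")
        response = connection.getresponse()
        status, body = response.status, response.read()
        connection.close()
        return status, body
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    print(run_demo())
```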
Domain Name System (DNS). An Internet service that translates domain names (which are alphabetic identifiers) into IP addresses (which are numeric identifiers for machines on a TCP/IP network).
Internet Information Server (IIS). Microsoft Corporation's Web server that runs on Windows NT platforms.
Internet. A collection of interconnected (public and/or private) networks that are linked together by a set of standard protocols to form a distributed network. While this term is intended to refer to what is now commonly known as the Internet, it is also intended to encompass variations which may be made in the future, including changes and additions to existing standard protocols.
HyperText Markup Language (HTML). A standard coding convention and set of codes for attaching presentation and linking attributes to informational content within documents. During a document authoring stage, the HTML codes (referred to as “tags”) are embedded within the informational content of the document. When the Web document (or “HTML document”) is subsequently transferred from a Web server to a Web browser, the codes are interpreted by the Web browser and used to parse and display the document. In addition to specifying how the Web browser is to display the document, HTML tags can be used to create links to other websites and other Web documents (commonly referred to as “hyperlinks”). For more information on HTML, see Ian S. Graham, The HTML Source Book, John Wiley and Sons, Inc., 1995 (ISBN 0471-11894-4).
HyperText Transfer Protocol (HTTP). The standard World Wide Web client-server protocol used for the exchange of information (such as HTML documents, and client requests for such documents) between a Web browser and a Web server. HTTP includes a number of different types of messages that can be sent from the client to the server to request different types of server actions. For example, a “GET” message, which has the format GET <URL>, causes the server to return the document or file located at the specified Uniform Resource Locator (URL).
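As a rough illustration of the text of a GET message, the following Python sketch builds a request and splits its first line back into its parts. The HTTP/1.0 version string, the Host header, the path, and the function names are assumptions chosen for the example; the definition above specifies only that the message names the GET action and the resource to be returned.

```python
def build_get_request(url_path, host):
    """Return the text of a simple GET message for the given resource."""
    return (
        f"GET {url_path} HTTP/1.0\r\n"   # request line: action, resource, version
        f"Host: {host}\r\n"              # header naming the target machine
        "\r\n"                           # blank line ends the message
    )


def parse_request_line(message):
    """Split the first line of a request into (method, target, version)."""
    request_line = message.split("\r\n", 1)[0]
    method, target, version = request_line.split(" ")
    return method, target, version


# Hypothetical path and host name (assumptions for this example).
message = build_get_request("/docs/index.html", "www.example.com")

if __name__ == "__main__":
    print(message)
    print(parse_request_line(message))
```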
Java. A general purpose programming language developed by Sun Microsystems. Java has a number of features that make the language well-suited for use on the World Wide Web. Small Java applications are called Java applets and can be downloaded from a Web server and run on a personal computer by a Java-compatible Web browser, such as Netscape Navigator or Microsoft Internet Explorer.
Java servlet. A small Java-based program designed to perform a specific task within a Web server environment. Java servlets are analogous to Java applets except that they are designed to run only on the Web server.
Java Virtual Machine. A set of applications that create a run time environment for executing Java code.
JRun. A server-side extension that allows a Web server to execute Java servlets for the processing and display of information. JRun is a widely adopted engine for developing and deploying server-side Java applications that use Java Servlets and JavaServer Pages (JSP).
Java Database Connectivity (JDBC). A Java API developed by JavaSoft, a subsidiary of Sun Microsystems of Mountain View, Calif. JDBC enables Java programs to execute SQL statements, which allows Java programs to interact with any SQL-compliant database. Since many relational database management systems (DBMSs) support SQL, and because Java itself runs on most platforms, JDBC makes it possible to write a single database application that can run on different platforms and interact with different database management systems. JDBC is similar to ODBC but is designed specifically for Java programs, whereas ODBC is language-independent.
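The idea of a program handing SQL statements to a database through a standard interface can be sketched briefly. JDBC itself is a Java API; the example below instead uses Python's standard-library sqlite3 module, which plays a comparable role for Python programs, purely to illustrate the pattern. The table name, column names, sample rows, and function name are hypothetical.

```python
import sqlite3


def top_rated_subjects(connection, minimum_rating):
    """Run a parameterized SQL query and return the matching rows."""
    cursor = connection.execute(
        "SELECT name, rating FROM subjects "
        "WHERE rating >= ? ORDER BY rating DESC, name",
        (minimum_rating,),
    )
    return cursor.fetchall()


# An in-memory database with hypothetical sample data.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE subjects (name TEXT, rating REAL)")
connection.executemany(
    "INSERT INTO subjects VALUES (?, ?)",
    [("Harbor Cafe", 4.5), ("City Museum", 3.0), ("Lake Trail", 4.0)],
)
connection.commit()

rows = top_rated_subjects(connection, 4.0)

if __name__ == "__main__":
    print(rows)
```

Because the query is ordinary SQL, the same statement could in principle be issued against a different SQL database by changing only the connection step.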
Open DataBase Connectivity (ODBC). A database access method developed by Microsoft Corporation. ODBC allows an application to access data from a database by translating the application's data queries into commands that the database management system (DBMS) can understand.
Transmission Control Protocol/Internet Protocol (TCP/IP). A standard Internet protocol (or set of protocols) which specifies how two computers exchange data over the Internet. TCP/IP handles issues such as packetization, packet addressing, handshaking and error correction. For more information on TCP/IP, see Volumes I, II and III of Comer and Stevens, Internetworking with TCP/IP, Prentice Hall, Inc., ISBNs 0-13-468505-9 (vol. I), 0-13-125527-4 (vol. II), and 0-13-474222-2 (vol. III).
Uniform Resource Locator (URL). A unique address which fully specifies the location of a file or other resource on the Internet. The general format of a URL is protocol://machine address:port/path/filename. The port specification is optional, and if none is entered by the user, the Web browser defaults to the standard port for whatever service is specified as the protocol. For example, if HTTP is specified as the protocol, the Web browser will use the HTTP default port. The machine address in this example is the domain name for the computer or device on which the file is located.
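The parts of a URL named above can be separated with Python's standard-library urllib.parse module, as in the following sketch. The DEFAULT_PORTS table, the describe_url function, and the example.com addresses are assumptions made for illustration; the table lists the well-known default ports for a few protocols.

```python
from urllib.parse import urlsplit

# Well-known default ports, used when the URL does not specify one.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21}


def describe_url(url):
    """Break a URL into protocol, machine address, port and path."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return {
        "protocol": parts.scheme,
        "machine_address": parts.hostname,
        "port": port,
        "path": parts.path,
    }


# Hypothetical addresses (assumptions for this example).
without_port = describe_url("http://www.example.com/docs/index.html")
with_port = describe_url("http://www.example.com:8080/docs/index.html")

if __name__ == "__main__":
    print(without_port)
    print(with_port)
```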
World Wide Web (“Web”). Used herein to refer generally to both (1) a distributed collection of interlinked, user-viewable hypertext documents (commonly referred to as “Web documents”, “Web pages”, “electronic pages” or “home pages”) that are accessible via the Internet, and (2) the client and server software components that provide user access to such documents using standardized Internet protocols. Currently, the primary standard protocol for allowing applications to locate and acquire Web documents is the HyperText Transfer Protocol (HTTP), and the electronic pages are encoded using the HyperText Markup Language (HTML). However, the terms “World Wide Web” and “Web” are intended to encompass future markup languages and transport protocols which may be used in place of or in addition to the HyperText Markup Language and the HyperText Transfer Protocol.
More Specific Background
As the popularity of the Internet and the World Wide Web has continued to increase over the years, companies continue to look for ways to provide useful content, to promote their products and services in a cost-effective manner, and to get consumers to visit their Web sites. To that end, computer on-line services often offer subject search services to their users and employ narrative descriptions of their content, user ratings and user comments. Previous examples of such services include epinions.com, deja.com and travelpage.com. These prior systems present a number of limitations and drawbacks to the consumer user of the system. Specifically, a consumer cannot search for a subject based on opinions or ratings of the users of the system. Instead, the search logic is either hierarchical, based on predefined classifications such as geography, or text-based, using a search for ambiguous words or phrases contained in the subject's title or description. Users' opinions and ratings are normally neither finely detailed nor measurable; they are kept separate and unrelated, and are not included in the search processes offered to users. Therefore a user is unable to search for a subject based entirely or partially on the users' opinions or ratings.
No prior system having a database content of subjects provides for obtaining, storing, and/or searching using user-chosen-and/or-user-provided natural-language terms potentially descriptive of a subject (i.e., potentially-descriptive natural-language “tags” or indicators) to associate with particular subjects of the database. And no such prior system provides for a user to assess at least one measure of descriptive value to the assessing user of a particular such tag with respect to a particular database subject. And no such prior system provides for associating (by “indexing”, for example) among such subjects, such tags, and/or such measures to, for example, provide enhanced statistical aggregation and/or “personalized” searches/databases of value/pleasure to a particular user.
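To make the association among subjects, tags and measures more concrete, the following Python sketch shows one hypothetical way such an index could be arranged. It is offered only as an illustration under stated assumptions: the TagIndex class, the numeric rating scale, the use of a simple average as the aggregate, and the sample subjects and tags are all invented for the example and are not drawn from this description.

```python
from collections import defaultdict
from statistics import mean


class TagIndex:
    """Hypothetical index relating subjects, tags and user ratings."""

    def __init__(self):
        # (subject, tag) -> list of numeric ratings given by users
        self._ratings = defaultdict(list)

    def rate(self, subject, tag, rating):
        """Record one user's measure of how well a tag describes a subject."""
        self._ratings[(subject, tag)].append(rating)

    def search(self, tag, minimum_mean=0.0):
        """Return (subject, mean rating) pairs for a tag, best first."""
        results = []
        for (subject, rated_tag), ratings in self._ratings.items():
            if rated_tag == tag:
                average = mean(ratings)
                if average >= minimum_mean:
                    results.append((subject, average))
        return sorted(results, key=lambda item: (-item[1], item[0]))


# Hypothetical sample data (assumptions for this example).
index = TagIndex()
index.rate("Harbor Cafe", "romantic", 5)
index.rate("Harbor Cafe", "romantic", 3)
index.rate("City Diner", "romantic", 2)
index.rate("City Diner", "quick", 5)

if __name__ == "__main__":
    print(index.search("romantic"))
    print(index.search("romantic", minimum_mean=3.0))
```

A search of this kind ranks subjects by what users think of a tag's fit, rather than by words in a title or a fixed category.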
Although computer on-line information services allow guests to personalize or customize the information displayed to them on initial entry to the site, such personalization is limited because it does not allow for consideration of the guest's interests and related opinions and ratings of the other users. Rather, the personalization is based on personal preferences in specific, rigid categories of information defined by the information service provider based on the search indexes of the database. As a result, information is presented across a spectrum of subjects that are of interest, but without regard to a user's measures of importance/relevance.
From the perspective of the consumer, the above-described model presents a number of drawbacks. First, highly structured hierarchical search rules force users to search in predetermined ways, and text-based searches rely on ambiguous words or phrases and focus on names or subjects, not concise descriptions and users' evaluations, making identification and selection of the most relevant content (to a particular searcher) difficult. Second, because Internet-based searches are either very rigid or very loosely structured, it is difficult for users to compare similar subjects across the spectrum of their interests. Finally, the quality, freshness and completeness of the database of information must be raised while minimizing costs.
Present on-line information systems also present shortcomings for the system operators and managers. Specifically, they require a high degree of human intervention to maintain. On-line information service providers permit users to comment on and rate subjects within their site and routinely remove those that are out of date or inappropriate either manually or by automated means based on the age of the comment or rating. However, the current methods lack precision because of the ambiguous nature of the ratings and comments. The ambiguity requires a high level of human intervention if the information is to remain current and appropriate.
On-line information service providers use groups or “populations” or “communities” of contributors, i.e., a population of users, to input and maintain the subject content of the database. These communities may be organized geographically or by subject matter expertise. These communities require significant effort and human intervention to manage. On-line information service providers accept content from users and contributors with little or no review before it is posted. Reviews done by humans are usually completed by a limited group who are subject matter experts or geographically close to the submitter. Substantial effort is required to manage this process.
Moreover, prior on-line information systems include incentive systems that have drawbacks. On-line information service providers provide incentives in a variety of forms to encourage contributors to input and maintain subject content. Incentives may also be offered to users of the service. On-line information service providers also employ automated processes to capture, summarize and report the accumulated incentives. The granting of the incentives is based on completion of a limited number of actions that have limited influence on contributors' behavior. There is no limit on the total amount the information service provider is obligated to pay. Each contributor's incentive value is calculated using a rate per action, which makes it difficult to increase the value because doing so increases the total potential obligation; conversely, lowering the rate per action is a major disincentive to contributors. Moreover, prior on-line information service providers offer limited or no incentives for users to provide new information, ratings or opinions to the database. Conversely, users' access to the information is not restricted unless the site is a fee-based subscription site. Users' behavior is little influenced by the incentives except when attempting to “game” the system and gain unfair or improper rewards.
Therefore, there exists a need in the art for an improved system for creating, managing and searching information databases with the assistance of a population of users.