As known to those skilled in the art, the term “Internet” refers to the vast collection of computers and network devices that use various protocols to communicate with one another. A “client” computer connected to the Internet can download digital information from “server” computers connected to the Internet. Client application software executing on client computers typically accepts commands from a user and obtains data and services by sending requests to server applications running on server computers connected to the Internet. A number of protocols are used to exchange commands and data between computers connected to the Internet. The protocols include the File Transfer Protocol (FTP), the Hypertext Transfer Protocol (HTTP), the Simple Mail Transfer Protocol (SMTP), and the “Gopher” document protocol.
The HTTP protocol is used to access data on the World Wide Web, often referred to as “the Web.” The World Wide Web is an information service on the Internet providing documents and links between documents. The World Wide Web is made up of numerous Web sites located around the world that maintain and distribute electronic documents. A Web site may use one or more Web server computers that store and distribute documents in one of a number of formats including the Hypertext Markup Language (HTML). An HTML document contains text and metadata such as commands providing formatting information. HTML documents also include embedded “links” that reference other data or documents located on any Web server computers. The referenced documents may represent text, graphics, or video in respective formats.
A Web browser is a client application or operating system utility that communicates with server computers via FTP, HTTP, and Gopher protocols. Web browsers receive electronic documents from the network and present them to a user. Internet Explorer, available from Microsoft Corporation, Redmond, Wash., is an example of a popular Web browser application.
In a networked computing environment, such as the Internet described above, some computer systems are configured to maintain a number of databases having common data. For example, Web servers that transmit a substantial amount of data to client computers utilize database designs configured to store the same data on the client computers as well as on the server computer. This duplicated database configuration allows client computers to perform certain operations without having to establish a network connection with a particular server computer. A duplicated database configuration is also well suited for client computers that are connected to a network through a slow data connection or via temporary connections such as a remote telephone connection.
One illustrative example of a computer system that is configured to maintain a number of databases having common data can be found at a financial Web site, such as the one provided by Microsoft Corporation at the Web address MONEYCENTRAL.MSN.COM. The Web server for the MoneyCentral Web site utilizes a duplicated database similar to the one described above. This configuration allows a client computer to perform certain operations using the information stored in the client computer database without having to establish a network connection to the Web server. The duplicated database used by the MoneyCentral Web site requires communication between the client and server computers to synchronize the databases.
Various Web sites having large duplicated databases, such as the one at MoneyCentral, require that the client and server computers exchange a substantial amount of data. For instance, each time a client computer modifies one object in the client computer database, the client computer establishes a connection to the server computer and transmits the new data to the server database. In situations where the client computer frequently updates the server database with small data packets, both computing devices may be slowed, because transmitting many small data packets over a large number of connections is inefficient. In addition, as more data traffic moves over the Internet, there is a continuing need to improve the efficiency of data transfer between two computers having duplicated databases.
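The inefficiency described above can be illustrated with a simple cost model. The following is a minimal sketch, not a description of any actual system: the fixed per-connection overhead, the update sizes, and the function names are all hypothetical values chosen only to show how connection overhead can dominate when many small updates are each sent over their own connection.

```python
# Illustrative cost model only. The overhead figure below is an assumption,
# not a measured value for any real protocol or Web site.
CONNECTION_OVERHEAD_BYTES = 500  # assumed fixed cost of opening one connection


def bytes_with_connection_per_update(update_sizes):
    """Total bytes when every modified object is sent over its own connection."""
    return sum(CONNECTION_OVERHEAD_BYTES + size for size in update_sizes)


def bytes_with_single_connection(update_sizes):
    """Total bytes if the same updates shared one connection (for comparison)."""
    return CONNECTION_OVERHEAD_BYTES + sum(update_sizes)


# Hypothetical workload: 100 small object updates of 20 bytes each.
updates = [20] * 100

per_update_total = bytes_with_connection_per_update(updates)
single_total = bytes_with_single_connection(updates)

print("one connection per update:", per_update_total, "bytes")
print("single shared connection: ", single_total, "bytes")
```

Under these assumed numbers, the payload is the same 2,000 bytes in both cases; the difference comes entirely from the repeated connection overhead.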
Accordingly, there is a need for a method and system for efficiently managing and synchronizing a plurality of databases that have identical information and are stored on more than one computer. In addition, there is a need to provide a data hierarchy that facilitates efficient data transfer between a client computer and a server computer having an identical database. As databases become larger and more complex, there is an increasing need to optimize the synchronization process between the client and server computers. This need for an efficient and manageable synchronization process is further increased when server computers communicate with client computers over a large network such as the Internet. An inefficient database synchronization algorithm can generate unnecessary network traffic and can cause other applications running on each computer to run inefficiently or even cause system failure.