With the advent of computers, databases have become a ubiquitous feature of businesses throughout the world. Data has become a crucial commodity, and database information providers can sell a database or subsets of a database to customers who can utilize the information included in the database to operate their businesses. The size of compiled databases has grown significantly. Some compiled databases contain millions of records (rows) and hundreds of attributes (columns), and many of these databases are licensed to clients that are located across a wide geographical area.
Keeping all of these database copies current is a significant problem. The problem is compounded when different licensees want only selected rows of data (e.g., only California records) and/or selected columns of data in their copies of the database. The frequency at which licensees require database synchronization varies from every few days to once a year. The problem becomes more acute as the number of licensees increases, as new attributes (columns) are added, as the master database is updated more frequently, and as licensees seek to keep their database copies as current as possible.
Third-party clients can purchase a license for a copy of the database, and the clients can contract with the database provider to receive periodic updates to the contents of the database. Keeping all of these remote copies of the database updated can present a significant problem. This problem is compounded when different licensees receive only a selected subset of the data included in the database. Some clients may be licensed to receive only select rows or select columns of a database. For example, one licensee may contract to receive only records that include California data, while another licensee may contract to receive data from multiple states but only a subset of the columns included in the full source database. The frequency at which licensees desire to receive updates to synchronize their copies of the database with the database provider's source database can also vary greatly: some licensees may require daily updates or updates every few days, while others might require only yearly updates.
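The row and column restrictions described above can be sketched as a filter applied to the source records. This is a minimal illustration only; the `License` class, the `licensed_view` function, and the sample records are hypothetical and are not taken from any particular provider's system.

```python
from dataclasses import dataclass
from typing import Callable, Optional

Row = dict[str, object]


@dataclass
class License:
    """Hypothetical description of what one licensee may receive."""
    row_filter: Callable[[Row], bool]      # which records are licensed
    columns: Optional[list[str]] = None    # None means every column


def licensed_view(rows: list[Row], lic: License) -> list[Row]:
    """Return only the rows and columns covered by the license."""
    view = []
    for row in rows:
        if not lic.row_filter(row):
            continue
        if lic.columns is None:
            view.append(dict(row))
        else:
            view.append({col: row[col] for col in lic.columns})
    return view


# Made-up sample data standing in for the source database.
source = [
    {"id": 1, "state": "CA", "name": "Acme", "revenue": 100},
    {"id": 2, "state": "NY", "name": "Bolt", "revenue": 250},
    {"id": 3, "state": "CA", "name": "Cody", "revenue": 75},
]

# One licensee receives only California records, with all columns.
california_only = License(row_filter=lambda r: r["state"] == "CA")

# Another receives several states but only a subset of the columns.
multi_state_subset = License(
    row_filter=lambda r: r["state"] in {"CA", "NY"},
    columns=["id", "state", "name"],
)
```

Each distinct combination of row filter and column list yields a different copy of the database, which is why the number of custom copies grows with the number of licensees.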
Keeping all of these remote licensed copies of the source database updated becomes increasingly difficult as the number of licensees increases, the number of attributes (columns) included in the database increases, the frequency of updates to the master database (also referred to herein as the source database) increases, and the number of licensees requiring more frequent updates increases. In a conventional system, database copies were shipped to licensees on tangible computer-readable media at periodic intervals. For example, copies of the database could be shipped on magnetic media, such as magnetic tape or disks, or optical media, such as CD-ROM or DVD-ROM. The licensee was then responsible for installing the updated copy of the database received from the database provider.
Generating a custom copy of a database for each licensee at each update cycle and shipping it on physical media can become an arduous, time-consuming, and expensive process. Some database providers began offering delta files (also referred to as change files) that include only the changes that have been made to the source database since the last version of the database was shipped. However, the database provider must still generate numerous different permutations of the deltas, each customized to the needs of a particular licensee.
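The idea of a delta file can be illustrated by comparing two snapshots keyed by a primary key. The sketch below is an assumption-laden example, not a description of any provider's actual file format; the function names `compute_delta` and `apply_delta` and the three-part delta layout are hypothetical.

```python
def compute_delta(old: dict, new: dict) -> dict:
    """Compare two snapshots (primary key -> row) and return the changes."""
    return {
        # Rows present only in the new snapshot.
        "inserted": {k: new[k] for k in new.keys() - old.keys()},
        # Keys present only in the old snapshot.
        "deleted": sorted(old.keys() - new.keys()),
        # Rows present in both snapshots whose contents differ.
        "updated": {
            k: new[k] for k in old.keys() & new.keys() if old[k] != new[k]
        },
    }


def apply_delta(snapshot: dict, delta: dict) -> dict:
    """Return a copy of the snapshot with the delta applied."""
    result = {k: dict(v) for k, v in snapshot.items()}
    for key in delta["deleted"]:
        result.pop(key, None)
    result.update(delta["inserted"])
    result.update(delta["updated"])
    return result


# Made-up snapshots of the source database at two points in time.
previous = {
    1: {"state": "CA", "name": "Acme"},
    2: {"state": "NY", "name": "Bolt"},
    3: {"state": "CA", "name": "Cody"},
}
current = {
    1: {"state": "CA", "name": "Acme"},
    2: {"state": "NY", "name": "Bolt Inc."},
    4: {"state": "TX", "name": "Dune"},
}

delta = compute_delta(previous, current)
```

Because each licensee may hold a different subset of rows and columns, a delta of this kind would have to be computed against each licensee's own view of the data, which is the source of the many permutations mentioned above.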
As an alternative solution, some database providers have used encrypted keys for each column of the database rather than generating deltas specific to each licensee. Each licensee is provided with the keys to unlock those columns that the licensee is allowed to access. However, this solution does not solve all of the problems associated with generating database updates for a large number of licensees, and the encryption techniques used do not meet the security requirements of many database providers. If the encryption is broken, the contents of the entire source database can be accessed by unauthorized parties.
Yet another alternative solution for providing database updates to licensees is illustrated in FIG. 1. In this approach, a data file that includes the changes to the database is constructed by a source content server and provided to a file transfer protocol (FTP) server (step 910). The FTP server can then compress the file (step 920). A licensee can then download the compressed file from the FTP server to an onsite file server (step 930) across the Internet or other network connection. The file server can then be used to decompress the downloaded file (step 940), and the decompressed file can be ingested by a customer content hosting server that can use the information in the file to update a database (step 950).
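The sequence of steps 910 through 950 can be sketched as a small pipeline. The example below makes several assumptions that are not stated in the description: the change file is encoded as JSON, gzip is used for compression, and the FTP transfer is replaced by an in-memory copy so that the example is self-contained. All function names are hypothetical.

```python
import gzip
import json


def build_change_file(changes: dict) -> bytes:
    """Step 910: the source content server constructs the change file."""
    return json.dumps(changes, sort_keys=True).encode("utf-8")


def compress_file(data: bytes) -> bytes:
    """Step 920: the server compresses the file (gzip is an assumption)."""
    return gzip.compress(data)


def download(server_file: bytes) -> bytes:
    """Step 930: stand-in for the network transfer to the onsite file server."""
    return bytes(server_file)


def decompress_file(data: bytes) -> bytes:
    """Step 940: the onsite file server decompresses the download."""
    return gzip.decompress(data)


def ingest(database: dict, payload: bytes) -> None:
    """Step 950: the content hosting server applies the changes in place."""
    changes = json.loads(payload.decode("utf-8"))
    for key in changes["deleted"]:
        database.pop(key, None)
    database.update(changes["upserted"])


# Made-up client database and change set; JSON object keys are strings.
client_db = {"1": {"name": "Acme"}, "2": {"name": "Bolt"}}
changes = {
    "upserted": {"2": {"name": "Bolt Inc."}, "3": {"name": "Cody"}},
    "deleted": ["1"],
}

on_server = compress_file(build_change_file(changes))
ingest(client_db, decompress_file(download(on_server)))
```

In this sketch every step after 920 runs on the licensee's side, which reflects the point that the licensee, not the provider, is responsible for completing the update.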
As the size and complexity of the compiled databases continue to grow, providing updates to the database can become increasingly burdensome for both the database provider and the clients. As a result, database updates may be issued only periodically, which can result in the client databases being out of date until the next update is issued. Furthermore, because the onus of installing the database updates at the client's site falls on the client, the database provider cannot be sure that the clients are actually updating their copies of the database using the updates supplied by the database provider. Also, there is a chance that a client could accidentally skip installation of a delta and, due to the incremental nature of such updates, cause damage to the client database. Often, the client databases become out of sync with the source database due to the various factors described above.
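The effect of a skipped delta can be illustrated with a small example. The sketch assumes, purely for illustration, that each delta lists upserted rows and deleted keys; the `apply_delta` function and the sample data are hypothetical.

```python
def apply_delta(rows: dict, delta: dict) -> dict:
    """Return a copy of rows with one incremental delta applied."""
    result = dict(rows)
    for key in delta.get("deleted", []):
        result.pop(key, None)
    result.update(delta.get("upserted", {}))
    return result


# Made-up starting state shared by the source database and the client copy.
base = {1: "Acme, CA", 2: "Bolt, NY"}

# Two consecutive deltas issued by the provider.
delta_1 = {"upserted": {3: "Cody, CA"}, "deleted": [2]}
delta_2 = {"upserted": {3: "Cody, TX"}}

# The source database reflects both deltas, applied in order.
source_state = apply_delta(apply_delta(base, delta_1), delta_2)

# A client that accidentally skips delta_1 and installs only delta_2.
client_state = apply_delta(base, delta_2)

# Keys whose presence or contents differ between the two copies.
stale_keys = sorted(
    k for k in source_state.keys() | client_state.keys()
    if source_state.get(k) != client_state.get(k)
)
```

Here the client still holds record 2, which the source deleted in the skipped delta, and nothing in the later delta corrects it, so the two copies remain out of sync.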