FIG. 1 illustrates a conventional system for data synchronization. The system includes a server 101 that stores data in a database and a device 103 that stores a subset of the data stored on the server 101. The subset is defined by a filter 102. Occasionally, the device 103 connects to the server 101 to synchronize its copy of the subset with that stored at the server 101.
In one conventional method for data synchronization, the server 101 sends every row in the database that satisfies the filter 102 each time the device 103 requests synchronization. However, this method wastes time and network bandwidth, resources that are particularly valuable in mobile computing.
In another conventional method for data synchronization, the server 101 sends only those rows that satisfy the filter 102 and have been changed since the last synchronization. However, this method results in a data integrity problem known as “filter-scope out-of-sync”. The problem occurs in three ways: (1) a row inside the scope is updated so that it falls outside the scope, but the row is not deleted from the device upon synchronization (condition C1); (2) the scope of the filter has changed since the last synchronization, but a row that is now outside the changed filter scope is not deleted from the device upon synchronization (condition C2); and (3) the scope of the filter has changed since the last synchronization, but a row that is itself unchanged and is now within the changed filter scope is not inserted into the device upon synchronization (condition C3). These conditions are further described with reference to FIGS. 2A-4B.
FIGS. 2A-2B illustrate condition C1 of the data integrity problem. As illustrated in FIG. 2A, assume that the server 101 has a domain table (DT) with two columns: pk (primary key) and x (data value). The device 103 subscribes to DT with the filter: x>5. Initially, the device 103 is sent rows (1, 10) and (2, 18) because 10>5 and 18>5. As illustrated in FIG. 2B, assume that before the next synchronization, row (1, 10) at the server 101 is updated to (1, 2). Row (1, 2) does not satisfy the filter. In the next synchronization, row (1, 2) is therefore not sent to the device 103. However, row (1, 10) is not deleted from the device 103 either. As a result, for the same primary key, the device 103 has the data value 10 while the server 101 has the data value 2. The data on the device 103 thus becomes out-of-sync with the data on the server 101.
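By way of illustration only, condition C1 can be reproduced with a minimal in-memory sketch. The dictionaries standing in for the server 101 and the device 103, and the function names `satisfies_filter` and `incremental_sync`, are assumptions introduced for this example and are not part of any conventional system described above.

```python
# Hypothetical in-memory model of FIGS. 2A-2B; all names are illustrative.

def satisfies_filter(x):
    """The filter from FIG. 2A: x > 5."""
    return x > 5

def incremental_sync(server_rows, changed_pks, device_rows):
    """Send only rows that changed since the last sync AND satisfy the filter.

    Rows that changed but no longer satisfy the filter are simply skipped,
    which is the behavior that produces condition C1.
    """
    for pk in changed_pks:
        x = server_rows[pk]
        if satisfies_filter(x):
            device_rows[pk] = x
    return device_rows

# FIG. 2A: initial synchronization delivers rows (1, 10) and (2, 18).
server = {1: 10, 2: 18}
device = incremental_sync(server, changed_pks={1, 2}, device_rows={})

# FIG. 2B: row (1, 10) is updated to (1, 2) at the server.
server[1] = 2
device = incremental_sync(server, changed_pks={1}, device_rows=device)

# Primary keys whose device value differs from the server value.
stale = {pk: (device[pk], server[pk]) for pk in device if device[pk] != server[pk]}
print(stale)  # {1: (10, 2)}
```

The device retains the value 10 for primary key 1 while the server holds 2, because the sketch never issues a delete for a row that left the filter scope.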
FIGS. 3A-3B illustrate condition C2 of the data integrity problem. As illustrated in FIG. 3A, the database contains rows R1 and R2 in a domain table. The filter is: x IN (SELECT zipcode FROM Zipcode Table WHERE city=‘San Jose’). This filter is applied to the domain table at the server 101 to define the subset of rows that the device 103 receives. The filter references a look-up table on the server 101 (“Zipcode Table” in this example) to look up the data that defines the scope of the filter. Initially, the device 103 receives rows R1 and R2 since they satisfy the filter. As illustrated in FIG. 3B, assume that the look-up table is updated such that R2 no longer satisfies the filter. For example, the zip code in R2 is reassigned to a city other than San Jose. In the next synchronization, only R1 satisfies the filter. However, R2 is not deleted from the device 103, resulting in the data on the device 103 becoming out-of-sync with the data at the server 101.
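Condition C2 can be sketched in the same hedged fashion. The placeholder zip code strings ("zipA", "zipB"), the city name "Another City", and the function names are assumptions made for this example; only the shape of the filter is taken from the description above.

```python
# Hypothetical in-memory model of FIGS. 3A-3B; values and names are illustrative.

def filter_scope(zipcode_table):
    """Zip codes currently assigned to San Jose in the look-up table."""
    return {z for z, city in zipcode_table.items() if city == "San Jose"}

def incremental_sync(domain_table, changed_pks, zipcode_table, device_rows):
    """Send only domain rows that changed and that satisfy the filter."""
    scope = filter_scope(zipcode_table)
    for pk in changed_pks:
        if domain_table[pk] in scope:
            device_rows[pk] = domain_table[pk]
    return device_rows

zipcode_table = {"zipA": "San Jose", "zipB": "San Jose"}  # zipcode -> city
domain_table = {"R1": "zipA", "R2": "zipB"}               # row -> x

# FIG. 3A: both rows satisfy the filter and are sent to the device.
device = incremental_sync(domain_table, {"R1", "R2"}, zipcode_table, {})

# FIG. 3B: the look-up table changes; no domain row changes.
zipcode_table["zipB"] = "Another City"
device = incremental_sync(domain_table, set(), zipcode_table, device)

# Rows held by the device that are no longer inside the filter scope.
scope = filter_scope(zipcode_table)
out_of_scope = sorted(pk for pk, x in device.items() if x not in scope)
print(out_of_scope)  # ['R2']
```

Because the change occurred only in the look-up table, the set of changed domain rows is empty and nothing prompts the removal of R2 from the device.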
FIGS. 4A-4B illustrate condition C3 of the data integrity problem. In this example, the same domain table and filter as in FIGS. 3A-3B are used. As illustrated in FIG. 4A, initially, the device 103 receives rows R1 and R2 since they satisfy the filter. As illustrated in FIG. 4B, assume that the look-up table is updated such that a row R3 in the domain table, which previously did not satisfy the filter, now satisfies the filter. For example, the zip code in R3 is reassigned to the city of San Jose. However, in the next synchronization, R3 is not sent to the device 103 since the row itself has not changed. Thus, the data on the device 103 becomes out-of-sync with the data on the server 101.
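Condition C3 is the mirror image and can be sketched as follows. As before, the placeholder zip codes ("zipA", "zipB", "zipC"), the city name "Another City", and the function names are assumptions introduced solely for illustration.

```python
# Hypothetical in-memory model of FIGS. 4A-4B; values and names are illustrative.

def filter_scope(zipcode_table):
    """Zip codes currently assigned to San Jose in the look-up table."""
    return {z for z, city in zipcode_table.items() if city == "San Jose"}

def incremental_sync(domain_table, changed_pks, zipcode_table, device_rows):
    """Send only domain rows that changed and that satisfy the filter."""
    scope = filter_scope(zipcode_table)
    for pk in changed_pks:
        if domain_table[pk] in scope:
            device_rows[pk] = domain_table[pk]
    return device_rows

zipcode_table = {"zipA": "San Jose", "zipB": "San Jose", "zipC": "Another City"}
domain_table = {"R1": "zipA", "R2": "zipB", "R3": "zipC"}

# FIG. 4A: R1 and R2 satisfy the filter; R3 does not and is not sent.
device = incremental_sync(domain_table, set(domain_table), zipcode_table, {})

# FIG. 4B: the zip code in R3 is reassigned to San Jose; R3 itself is unchanged.
zipcode_table["zipC"] = "San Jose"
device = incremental_sync(domain_table, set(), zipcode_table, device)

# Rows the device should hold under the current scope, versus what it holds.
scope = filter_scope(zipcode_table)
expected = {pk: x for pk, x in domain_table.items() if x in scope}
missing = sorted(set(expected) - set(device))
print(missing)  # ['R3']
```

R3 entered the filter scope without being modified, so a synchronization driven only by changed domain rows never considers it.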
In another conventional method, the data integrity problem described above is addressed by having the device 103 apply a filter after the synchronization to find out-of-scope data and delete it. However, this approach does not eliminate condition C3. This approach also has additional drawbacks. The device 103 must subscribe to all look-up tables and all columns referenced in the filter, and none of the look-up tables can itself have a filter on it. The filter processing during synchronization thus requires additional time and network bandwidth resources. Also, the filter may not work correctly if the device 103 and the server 101 are from different vendors, since each database vendor may have its own syntactic and semantic variants for the filter process.
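The limitation of the device-side approach can be illustrated with a short sketch. It assumes, purely for the example, that the device holds an up-to-date copy of the look-up table and that a zip code leaving the scope and a zip code entering the scope happen in the same interval; the placeholder values and the function names `in_scope` and `device_side_cleanup` are hypothetical.

```python
# Hypothetical sketch of device-side cleanup after synchronization.

def in_scope(x, zipcode_table):
    """True if zip code x is assigned to San Jose in the given look-up table."""
    return zipcode_table.get(x) == "San Jose"

def device_side_cleanup(device_rows, device_zipcode_table):
    """Device applies the filter locally and drops out-of-scope rows."""
    return {pk: x for pk, x in device_rows.items()
            if in_scope(x, device_zipcode_table)}

# Server state after the look-up table was updated:
# zipB left the scope (condition C2), zipC entered it (condition C3).
server_zipcode_table = {"zipA": "San Jose", "zipB": "Another City", "zipC": "San Jose"}
server_domain_table = {"R1": "zipA", "R2": "zipB", "R3": "zipC"}

# Device state: rows from the earlier sync plus a subscribed look-up table copy.
device_rows = {"R1": "zipA", "R2": "zipB"}
device_zipcode_table = dict(server_zipcode_table)

device_rows = device_side_cleanup(device_rows, device_zipcode_table)

# Compare with what the filter selects at the server.
expected = {pk: x for pk, x in server_domain_table.items()
            if in_scope(x, server_zipcode_table)}
still_missing = sorted(set(expected) - set(device_rows))
print(device_rows)    # {'R1': 'zipA'}
print(still_missing)  # ['R3']
```

The cleanup removes R2, but it can only delete rows the device already holds; R3 was never delivered, so it remains absent.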
Accordingly, there is a need for a method and system for preserving filter scope consistency in synchronizing data. The method and system should be efficient in time and bandwidth resources while also providing consistent data integrity. The present invention addresses such a need.