1. Field of the Invention
The present invention relates to a data synchronization method, a program, a recording medium, and an apparatus for forming differential data by comparing a new file and an old file. More particularly, the present invention relates to a data synchronization method, a program, a recording medium, and an apparatus for forming differential data that is used in order to update an old file to a new file.
2. Description of the Related Arts
When software on a computer or firmware on an apparatus must be updated for an upgrade, a backup, or the like, there is a demand to perform uploading and downloading at low cost, even in a low-speed communication environment, by using differential data formed by comparing a new file and an old file.
Hitherto, in the case of updating a file of a program, data, or the like to a new file, there are two methods: a method of replacing an entire old file with an entire new file, and a method of providing differential data between an old file and a new file and updating only the information in the files that needs updating. The updating method by differential data has the advantage that the updating can be performed with less information than the method of replacing all files, and it is also advantageous in terms of cost because a smaller amount of information has to be transferred to the locations where the old files exist. As a conventional method of forming the differential data, as shown in the flowcharts of FIGS. 1A and 1B, there is a method (JP-A-4-163626) whereby the new and old files are compared from their heads, mismatching portions are classified as updated portions into three categories, “replacement”, “insertion”, and “deletion”, and values obtained after the updating according to (1) the category of the updating, or (2) the category of “replacement” or “insertion” are described as differential data. The term “portion” means data within the file no smaller than one byte and no larger than the total number of bytes in the file. The processing steps in FIGS. 1A and 1B are as follows.
    S1B: The old file and the new file are read out from a disk or the like.
    S2B: A data comparison target pointer is set to the head of the new file, and a data reference pointer is set to the head of the old file.
    S3B: The value indicated by the data comparison target pointer is compared with the value indicated by the data reference pointer.
    S4B: If the comparison result in step S3B indicates matching, step S5B follows. If the values differ, step S7B follows.
    S5B: It is determined that the value indicated by the data comparison target pointer is a copy of the old file.
    S6B: The length of the copy source is stored and step S14B follows.
    S7B: Data that matches the value indicated by the data comparison target pointer is searched for in the data located after the data reference pointer.
    S8B: If matching data is found as a result of the search, step S11B follows. If matching data is not found, step S9B follows.
    S9B: It is determined that the comparison target data is newly formed data.
    S10B: The data comparison target pointer is advanced by one position and step S7B follows.
    S11B: It is determined that the matching position has deviated due to the insertion of new data or the deletion of data of the old file.
    S12B: After a transfer code is generated, one of a replacement code, an insertion code, and a deletion code is generated in accordance with the form of the deviation.
    S13B: The data reference pointer is shifted by a distance corresponding to the deviation.
    S14B: The data comparison target pointer and the data reference pointer are each advanced by one position.
    S15B: If the data comparison target pointer has not reached the end of the new file, the processing routine returns to step S3B. If it has reached the end, the processing routine is finished.
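The steps above can be pictured with a short Python sketch. It is only an illustration under stated assumptions, not the method of JP-A-4-163626 itself: the function name `form_diff`, the record names `copy`, `replace`, `insert`, and `delete`, and the rule that a mismatch of equal length on both sides is reported as a replacement are all hypothetical choices made here. The resynchronization on a single matching byte mirrors steps S7B to S10B and is deliberately crude.

```python
def form_diff(old: bytes, new: bytes):
    """Return (code, argument) records describing how `old` becomes `new`.

    Hypothetical record set: ("copy", n), ("replace", data),
    ("insert", data), ("delete", n).
    """
    ops = []
    ref = 0  # data reference pointer into the old file
    tgt = 0  # data comparison target pointer into the new file
    while tgt < len(new):
        if ref < len(old) and old[ref] == new[tgt]:
            # S3B-S6B: values match, so count the length of the copy source.
            run = 0
            while ref < len(old) and tgt < len(new) and old[ref] == new[tgt]:
                ref += 1
                tgt += 1
                run += 1
            ops.append(("copy", run))
            continue
        # S7B-S10B: advance the target pointer until its value is found
        # somewhere in the old file at or after the reference pointer.
        start = tgt
        found = -1
        while tgt < len(new):
            found = old.find(new[tgt:tgt + 1], ref)
            if found != -1:
                break
            tgt += 1
        literal = new[start:tgt]  # bytes that exist only in the new file
        skipped = (found if found != -1 else len(old)) - ref  # old bytes passed over
        # S11B-S12B: classify the deviation (assumed classification rule).
        if literal and skipped == len(literal):
            ops.append(("replace", literal))
        else:
            if skipped:
                ops.append(("delete", skipped))
            if literal:
                ops.append(("insert", literal))
        ref += skipped  # S13B: shift the reference pointer by the deviation
    if ref < len(old):
        ops.append(("delete", len(old) - ref))  # trailing old data was removed
    return ops
```

Because the search accepts the first single matching byte, the sketch can resynchronize at the wrong place on realistic data, which is the kind of mismatch the problems discussed later in this section arise from.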
FIGS. 2A and 2B show a replacing process in the conventional data forming method. An old file 200 in FIG. 2A is constructed by data A, B, and C. A new file 202 is constructed by data A, B′, and C. The data A and C denote matching portions 204 and 208. The data B and B′ form a replacement portion 206. FIG. 2B shows a differential data file 210 formed by the flowcharts of FIGS. 1A and 1B. First, with respect to the matching portion 204 of the data A, a transfer code 212 having a transfer code number and a copy source data length (a bytes) is generated. With respect to the replacement portion 206 of the next data B and B′, a replacement code 214 having a replacement code number and a replacement data length (b bytes) is generated and differential data 216 of (B−B′) is added. Further, with respect to the matching portion 208 of the data C, a transfer code 218 having a transfer code number and a copy source data length (c bytes) is generated.
FIGS. 3A and 3B show an inserting process in the conventional data forming method. The old file 200 in FIG. 3A is constructed by data A and B and the new file 202 is constructed by the data A and B and new data C inserted therebetween. The data A and B correspond to the matching portions 204 and 208 and the data C corresponds to an inserting portion 220. FIG. 3B shows the differential data file 210 formed by the flowcharts of FIGS. 1A and 1B. With respect to the matching portions 204 and 208 of the data A and B, in a manner similar to FIG. 2B, the transfer codes 212 and 218 each having a transfer code number and a copy source data length are generated. With respect to the inserting portion 220 of the data C, an insertion code 222 having an insertion code number and an insertion data length (c bytes) is generated and insertion data 224 is added.
FIGS. 4A and 4B show a deleting process in the conventional data forming method. The old file 200 in FIG. 4A is constructed by data A, B, and C and the new file 202 is constructed by the data A and C, the data B having been deleted. The data A and C correspond to the matching portions 204 and 208 and the data B corresponds to a deleting portion 226. FIG. 4B shows the differential data file 210 formed by the flowcharts of FIGS. 1A and 1B. With respect to the matching portions 204 and 208 of the data A and C, in a manner similar to FIG. 3B, the transfer codes 212 and 218 each having a transfer code number and a copy source data length are generated. With respect to the deleting portion 226 of the data B, a deletion code 228 having a deletion code number and a deletion data length (b bytes) is generated.
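How a differential data file of this kind could be consumed on the updating side is sketched below in Python. The record representation is an assumption made for illustration: each record is a `(code, argument)` tuple, the names `transfer`, `replace`, `insert`, and `delete` stand in for the code numbers, and the replacement record is assumed to carry the new values directly rather than the difference (B−B′). The function name `apply_diff` is likewise hypothetical.

```python
def apply_diff(old: bytes, records) -> bytes:
    """Rebuild the new file from the old file and a list of records.

    Assumed record shapes (hypothetical, for illustration only):
      ("transfer", n)   copy n bytes from the current position in the old file
      ("replace", data) emit data and skip len(data) bytes of the old file
      ("insert", data)  emit data without consuming the old file
      ("delete", n)     skip n bytes of the old file
    """
    out = bytearray()
    pos = 0  # current read position in the old file
    for code, arg in records:
        if code == "transfer":
            out += old[pos:pos + arg]
            pos += arg
        elif code == "replace":
            out += arg
            pos += len(arg)
        elif code == "insert":
            out += arg
        elif code == "delete":
            pos += arg
        else:
            raise ValueError(f"unknown code: {code!r}")
    return bytes(out)


# Stand-ins for the data blocks of FIGS. 2A to 4B.
A, B, B_NEW, C = b"AAAA", b"BBB", b"bbb", b"CC"

replaced = apply_diff(A + B + C, [("transfer", 4), ("replace", B_NEW), ("transfer", 2)])
inserted = apply_diff(A + B, [("transfer", 4), ("insert", C), ("transfer", 3)])
deleted = apply_diff(A + B + C, [("transfer", 4), ("delete", 3), ("transfer", 2)])
```

Every record advances strictly forward through the old file, so data is only ever reused in the order in which it appears there.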
However, in the conventional data synchronization method described above, the updating information is expressed by the three categories “replacement”, “insertion”, and “deletion” on the assumption that the correct matching portions can always be found when the matching portions of the new and old files are searched. This gives rise to the following problems.
FIG. 5 shows an example of an inserting process. The old file 200 is constructed by data A and B. The new file 202 is constructed by the data A and B and also a second copy of the data A inserted therebetween. With respect to the data A of the old file 200, the new file 202 therefore has two matching portions 204 and 230. However, according to the conventional data synchronization method, since the second matching portion 230 does not belong to any of the categories “replacement”, “insertion”, and “deletion”, the insertion of the new data A cannot be expressed properly. Because of this problem, it is determined that all subsequent portions of the new and old files are mismatching portions, and they are generated only in the category “replacement”. Consequently, the amount of differential data increases.
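The cost of the FIG. 5 situation can be made concrete with a small Python sketch. It takes the most favourable reading, in which the second copy of A is simply shipped as an insertion, and compares it with a purely hypothetical `copy_at` record that names a source offset in the old file. Neither the `copy_at` record nor the helper names `literal_bytes` and `rebuild` come from the source; they are assumptions used only to count bytes.

```python
def literal_bytes(records):
    """Count the payload bytes that must travel inside the differential data."""
    return sum(len(arg) for code, arg in records if code in ("insert", "replace"))


def rebuild(old: bytes, records) -> bytes:
    """Apply the records; "copy_at" is a hypothetical record with (offset, length)."""
    out, pos = bytearray(), 0
    for code, arg in records:
        if code == "copy":          # in-order copy from the old file
            out += old[pos:pos + arg]
            pos += arg
        elif code == "insert":      # literal data carried in the differential file
            out += arg
        elif code == "copy_at":     # assumed: copy from an explicit old-file offset
            offset, length = arg
            out += old[offset:offset + length]
    return bytes(out)


A = b"HEADER--" * 4   # 32 bytes, stands in for data A
B = b"payload!" * 4   # 32 bytes, stands in for data B
old, new = A + B, A + A + B

# In-order records: the second A has to be carried as literal data.
in_order = [("copy", len(A)), ("insert", A), ("copy", len(B))]

# Hypothetical records that may point back at data already present in the old file.
with_offset = [("copy_at", (0, len(A))), ("copy_at", (0, len(A))), ("copy_at", (len(A), len(B)))]
```

With these stand-in blocks the in-order records carry 32 literal bytes and the offset-based records carry none. If the later portions were all emitted as replacements, as the text describes, the literal payload would be larger still.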
FIG. 6 shows a case where the correspondence relation between the new and old files is determined wrongly. It is now assumed that the old file 200 is constructed by data A, B, C, and B, and that the new file 202 is constructed by the data A, B, and C, the last data B of the old file 200 having been deleted. If, when the data B 234 of the old file 200 is compared with the data B of the new file 202, the data B of the new file is erroneously recognized as data B′ 232, that is, if the updating portion is erroneously determined, a replacement code and new data B′ are erroneously generated in the category “replacement”. In this case, even if the same data B as the erroneously determined data exists in the old file behind the position where the erroneous determination occurred, there is no way to utilize that data B. Upon updating of the old file, the data is erroneously rewritten to the erroneously determined data B′. Further, in a program file, rewriting a specific program code can cause the same value to appear repeatedly as a differential value added to the replacement information. However, the conventional data synchronization method gives no consideration to this point.
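The remark about repeated differential values can be illustrated with a short Python sketch. It assumes that the differential data (B−B′) is a byte-wise difference taken modulo 256, which the text does not state, and it uses made-up byte values in which every byte of the new portion is larger by the same constant. The helper names `byte_differences` and `run_lengths` are hypothetical.

```python
from itertools import groupby


def byte_differences(old_part: bytes, new_part: bytes):
    """Byte-wise (old - new) modulo 256; assumes both parts have equal length."""
    return [(o - n) % 256 for o, n in zip(old_part, new_part)]


def run_lengths(values):
    """Collapse consecutive equal values into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


# Made-up example: every byte of the rewritten portion is shifted by +4.
old_part = bytes([0x10, 0x20, 0x30, 0x40])
new_part = bytes((b + 4) % 256 for b in old_part)

diffs = byte_differences(old_part, new_part)  # the same value repeated
runs = run_lengths(diffs)                     # a single run
```

Here the four differences collapse into one run, which suggests why a scheme that stores each differential value separately leaves redundancy unused.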