1. Field of the Invention
The present invention relates to a computer system architecture for synchronizing changes to a package of files on a first computing system with index data, or database, on a second computing system that maintains data descriptive of the package of files. The present invention also relates to a computer system architecture for synchronizing two remote, and independent, computer servers intended for maintaining duplicates of each other's files.
2. Description of the Related Art
In computing networks where a first machine is used to maintain a database, or index data, of a plurality of files stored in one or more other machines, it is important that the data in the database accurately reflect the current state of the plurality of files stored in the other machines. That is, changes to the files stored in the other machines should be accurately reflected in the database stored in the first machine. This is especially true when the first machine is used as an interface for a network server and its job is to provide accurate information of, and access to, the files stored in the other machines. Such a network server architecture may be used, for example, to implement an image server system.
As is known in the art, an operating system often assigns an updated time-stamp to a file when the file is modified. Thus, one way of reducing the amount of mismatch between the database in the first machine and the image files stored in the other machines is to store each file's modification time-stamp in the database along with other characteristic data for that file. In this manner, the database can determine whether its stored characteristic data for a specific file is accurate by comparing the time-stamp stored in the database with the file's actual modification time-stamp. This approach, however, requires that the database perform a separate time-stamp comparison for each file, which can considerably slow down a system if the number of files is high.
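The per-file comparison described above can be illustrated with a minimal Python sketch. The function name `find_stale_entries` and the layout of `index` (a mapping from file path to recorded modification time-stamp in nanoseconds) are assumptions made for illustration and are not taken from any particular prior-art system.

```python
import os

def find_stale_entries(index):
    """Return the indexed paths whose on-disk time-stamp no longer matches.

    `index` is a hypothetical mapping of file path -> modification
    time-stamp (nanoseconds) recorded when the entry was last refreshed.
    """
    stale = []
    for path, recorded_mtime_ns in index.items():
        try:
            # One stat call per indexed file: cost grows with the index size.
            actual_mtime_ns = os.stat(path).st_mtime_ns
        except FileNotFoundError:
            # The file was erased or renamed since the entry was recorded.
            stale.append(path)
            continue
        if actual_mtime_ns != recorded_mtime_ns:
            stale.append(path)
    return stale
```

Because the loop issues one file-system query per index entry, the work grows in proportion to the number of indexed files, which is the slowdown noted above.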
Additionally, since the image files are stored in separate machines, it is possible to update, erase, or add files to the other machines without informing the first machine that maintains the database. In this case, if a new file is added, the database will have no way of knowing of the change unless it is explicitly informed of the addition. This is because the database's only method of synchronizing itself with the stored image files is through the use of each file's previously stored modification time-stamp. But if a new file is added to one of the other machines, then no data regarding the new file is yet stored in the database, and the database thus cannot discern any change. The database therefore remains ignorant of the added new file.
This is also the case with other simple changes, such as the renaming of a file. Unless the database is explicitly informed of the change, it will have no way of synchronizing itself to the change.
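The blind spot described in the two preceding paragraphs can be shown with a small in-memory model. This is a sketch only: the dictionaries stand in for the database and the storage machine, and the file names and time-stamp values are hypothetical.

```python
def index_driven_check(index, files_on_disk):
    """Compare only the paths the index already knows about.

    Both arguments map a path to a modification time-stamp. The loop is
    driven by the index, so paths absent from the index are never visited.
    """
    report = {}
    for path, recorded in index.items():
        actual = files_on_disk.get(path)
        if actual is None:
            report[path] = "missing"
        elif actual != recorded:
            report[path] = "modified"
    return report

# Hypothetical state: the index was built when a.img and b.img existed.
index = {"a.img": 100, "b.img": 200}

# Since then, b.img was renamed to c.img and d.img was added.
files_on_disk = {"a.img": 100, "c.img": 200, "d.img": 300}

report = index_driven_check(index, files_on_disk)

# Paths that exist on the storage machine but were never examined.
unseen = sorted(set(files_on_disk) - set(index))
```

In this model the check can at most notice that `b.img` is no longer found. It never learns that `c.img` is the same file under a new name, or that `d.img` exists at all, because neither path has an entry to compare against.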
The prior art thus requires that the database be explicitly informed of any changes to the stored files, on a change-by-change basis. Furthermore, the database itself has no way of identifying some types of changes, as recited above. It will thus remain ignorant of those types of changes if the communication link between the database and another machine in which a change takes place is broken, such that the other machine is not able to inform the database of the change at the time the change takes place.
This synchronization problem is exacerbated as the database itself is copied onto multiple remote servers. The database may, for example, be part of a local server, and the other machines on which the files are stored may likewise be local to the local server. The local server may provide remote users access to the stored image files through a network such as the Internet. However, in cases where the number of remote users, or the distance between the remote users and the local server, is large, access to the image files may be slow.
In such cases, it is often useful to have an additional remote database network to help service the remote users. The remote database network acts as a mirror site, and consists of a remote database having a remote index of files stored on remote servers that hold copies of the image files. However, it is possible to independently alter files on either the local server or the remote servers. Therefore, it becomes more difficult to assure continuity between the local server, which maintains the local database network, and the remote database network of the remote servers. This problem is exacerbated in cases where no direct link is maintained between the local database network and the remote database network, such that the local and remote networks cannot inform each other of changes as they occur. Additionally, since the local and remote networks do not maintain a constant communication link with each other, they each execute their computing tasks independently according to their own respective local clocks. This further complicates the synchronizing of changes among the multiple servers, since they have no common clock reference against which time-stamps of file modifications made under independent clocks can be compared. In other words, since the remote servers function independently, and each may be in a different time zone, the modification time-stamps associated with file modifications on different servers cannot be directly compared to determine which server has the most recent version of a file.
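The clock-reference problem described above can be made concrete with a short Python sketch. The two zone offsets and the edit times are hypothetical values chosen for illustration, and the sketch assumes each server records only its local wall-clock reading with no zone information attached.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical zone offsets for the two independent servers.
LOCAL_TZ = timezone(timedelta(hours=-8))
REMOTE_TZ = timezone(timedelta(hours=9))

# In absolute terms, the local copy was edited one hour AFTER the remote copy.
remote_edit = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
local_edit = remote_edit + timedelta(hours=1)

# Each server keeps only its own wall-clock reading (zone information dropped).
remote_stamp = remote_edit.astimezone(REMOTE_TZ).replace(tzinfo=None)
local_stamp = local_edit.astimezone(LOCAL_TZ).replace(tzinfo=None)

# A direct comparison of the recorded stamps picks the wrong server.
naive_says_remote_newer = remote_stamp > local_stamp
actually_local_newer = local_edit > remote_edit
```

Here the remote stamp reads 21:00 and the local stamp reads 05:00 on the same date, so a direct comparison reports the remote copy as newer even though the local copy was modified later. Clocks that drift independently can produce the same kind of misordering even when both servers share a time zone.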