1. Field of the Invention
This invention relates generally to the management of sharable files, and more particularly to a file consistency protocol for maintaining consistency among sharable files.
2. Description of the Related Art
In today's computing environment, computers are increasingly used in a networked environment. As an example, a majority of workplace users are connected to an internal network or system where each user works from a station, such as a computer, connected to a server or servers that are part of the network. Each user's computer contains various applications that operate on files which are stored on the server or on the user's own station, which may offer analogous services. Therefore, a first user may access the server or analogous device to use a file by way of an application on the first user's machine. In addition, a second user may also access the same file and use the contents of the file with an application on the second user's machine. A situation therefore arises in this shared environment where the first user changes data in the file while the file is also being used by the second user. As such, file consistency is necessary to ensure that all copies or instances of the file in use contain the same information as updates are performed to each instance of the file.
Currently, the methods available to maintain file consistency among multiple copies of a file include making an independent copy of the file and then manually or explicitly programmatically merging the changes back into the original file; a UNIX type file consistency scheme where an original file is locked after changes are made to a copy; and a distributed file system approach. Each of these prior art techniques will now be discussed in more detail.
As mentioned above, one method of ensuring file consistency requires that an independent copy of the file with a different file name be made. When the user writes changes to the independent copy, the user must manually integrate those changes back into the original file, which may be stored on a server. Of course, the changes cannot be merged back into the file while the original copy is in use by another user. As can be appreciated, integrating the changes between files requires significant effort on the part of the users accessing the file, decreases efficiency, and increases the amount of disk space necessary to store the various file copies. Furthermore, when the user makes an independent copy, the independent copy is required to be saved under a different file name (or in a different place, e.g., a different directory, disk, or system). Thus, a subsequent user may not know the new file name or location of the independent copy and may inadvertently access an outdated, invalid copy.
The UNIX operating system also has a method for maintaining file consistency among different copies of files accessible by many users. If multiple copies of a file are being accessed by different users, the UNIX operating system, or an application using the UNIX file system, may automatically lock the file so that all users are prevented from saving changes to the file. In order to save changes, the UNIX operating system, file system, or application will then require the user to save the changes to an independent file having a different file name, thereby creating a copy. Thus, a user must wait until all others using the file have closed the file (i.e., until the file is no longer shared), or the user must save the file under a different name, which is undesirable for the previously mentioned reasons.
In addition, UNIX allows a user to change the file attributes of the file such that the file is writable by everyone at all times, with the result that copies of the file containing different data exist on the UNIX system. When a file is writable in this way, any modifications made by a first user are independent from other modifications made by a second user. Therefore, the modifications made by the first user do not appear on the copy in use by the second user. By the same token, any modifications made by the second user do not appear on the copy of the file being used by the first user. Thus, all the changes made by individual users are saved to individual copies and are not saved to the original file. Consequently, neither the original file nor any copy of the original file contains the modifications made by the first user, the second user, and any other users.
As also mentioned, another approach used to facilitate data consistency of a shared file is to use a distributed file system. Under this approach, the distributed file system distributes file data over multiple computers, thereby allowing the data to reside at multiple locations simultaneously. With this approach, a user A may have a file A stored on a hard drive of user A's computer. If a user B wants to access the file on user A's computer, the distributed file system copies the file A to user B's memory or hard drive. In this situation, when user B is accessing the file A, a flag is set in the file A which informs the distributed file system that the user B is making changes to the file A. The flag enables the locking of the file A such that when user A attempts to use the file A in a manner that conflicts with user B's usage (i.e., changing the data in the file A), this setting prevents user A from changing the data. When the user B closes the file A, and the user A attempts to write to the file A, the distributed file system informs the user A that changes have been made and asks the user A if the user A wants a copy of the file A containing the changes. However, this scheme is inefficient because the distributed file system must track who is modifying the file and must ask individual users, such as user A, when they are trying to access the file A, if they want a fresh copy of the file A containing the changes. Additionally, the inefficiency is further compounded since the distributed file system must provide each user with a new copy in each instance when a separate user attempts access to the file. Furthermore, the distributed file system is not always capable of globally providing a new copy of the file containing the changes to all users which, in some situations, may number hundreds of users.
In summary, under the distributed file system approach, the user A and the user B may not simultaneously write to the file A. Instead, if the user A is working with the file A in an exclusive manner, the user B must wait for user A to close the file A before user B may access file A. The file A can still be opened; however, the file will be read-only (i.e., no writing is allowed), and its contents are not guaranteed to be valid or current.
An example of a distributed file system is the Andrew file system. The Andrew file system allows a user to share files over a network and a subsystem. However, the Andrew file system requires an undesirable amount of overhead, such as CPU cycling time and increased network traffic. In addition, overhead is further increased due to the amount of user intervention involved with the Andrew file system. The Andrew file system requires user applications to communicate with one another before certain actions, such as data modifications, are commenced. Therefore, as the number of users sharing files on the network through the Andrew file system increases, the complexity of operating the Andrew file system increases.
Another problem with shared environment platforms where users access files on a shared storage medium relates to file locking and file sharing. Examples of the current locking/sharing mechanisms available are a revision control system (RCS) and a source code control system (SCCS). With these approaches, a user C can open a file C in a shared mode and a user D may also access the file C. However, when the file C is opened in the shared mode, neither user may modify existing data in the file C or add data to the file C. Therefore, the file C (i.e., while in shared use) becomes a read-only file with the current locking/sharing mechanisms.
As may be seen, none of the prior art techniques previously described enable a real time sharing environment where multiple users may simultaneously read the same file and write to the same file.
In view of the foregoing, there is a need for methods that enable users to simultaneously open files and work with those open files at the same time. These methods should also enable other instances of a file to be updated with modifications to maintain consistency among file copies with minimal overhead and without prior knowledge by the user, application, or file system.
Broadly speaking, the present invention fills the aforementioned needs by providing methods for maintaining file consistency among instances (i.e., sharable copies) of a file. A file is broadly defined to include, for example, a data file, a disk volume, a directory, a special file (e.g., a UNIX device node), etc. It should be appreciated that the present invention can be implemented in numerous ways, including as a process, an apparatus, a system, a device, a method, or a computer readable medium. Several embodiments of the present invention are described below.
In one embodiment, a method for maintaining consistent file data for shared files is disclosed. The method includes defining a set of consistency bits. Once the consistency bits are defined and implemented by a file system managing the shared files, the method includes changing a file associated with the file system. The method then sets a bit of the set of consistency bits on an instance file to reflect that a change was performed to the file. An update representing the bit change(s) is then communicated to the instance file such that the software using the instance file is notified of the change made to the file. The appropriate action is then taken to update the other instances to match the instance of the file which was modified. The data transfer can be performed by way of a pull protocol or a push protocol, which transfers the updates to one or more instances to maintain consistency for the shared file.
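By way of illustration only, the set-bit, notify, and transfer steps of this embodiment may be sketched as follows. The sketch is a simplified in-memory model and not the disclosed implementation: the bit name DATA_CHANGED, the classes Instance and SharedFile, and the push parameter are all hypothetical names introduced here, since the summary does not enumerate the bits or specify an interface.

```python
from dataclasses import dataclass, field

# Hypothetical consistency bit; the source does not name individual bits.
DATA_CHANGED = 0b01


@dataclass
class Instance:
    """One sharable copy of a file, carrying its own consistency bits."""
    data: str
    bits: int = 0

    def pull(self, source: "SharedFile") -> None:
        # Pull protocol: the instance notices its set bit and fetches the update.
        if self.bits & DATA_CHANGED:
            self.data = source.data
            self.bits &= ~DATA_CHANGED


@dataclass
class SharedFile:
    """Stand-in for the file system's view of one shared file."""
    data: str
    instances: list = field(default_factory=list)

    def open_instance(self) -> Instance:
        inst = Instance(self.data)
        self.instances.append(inst)
        return inst

    def write(self, writer: Instance, new_data: str, push: bool = False) -> None:
        # Apply the change to the writer's instance and to the file itself.
        writer.data = new_data
        self.data = new_data
        for inst in self.instances:
            if inst is writer:
                continue
            # Set the bit on every other instance to reflect the change.
            inst.bits |= DATA_CHANGED
            if push:
                # Push protocol: transfer the update immediately, then clear the bit.
                inst.data = new_data
                inst.bits &= ~DATA_CHANGED
```

In this model a pull leaves the stale data in place until the instance asks for it, whereas a push updates every other instance at write time; which is preferable would depend on how often instances actually read the changed data.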
In another embodiment, a method for maintaining file consistency between file data associated with a file managed by a file system is disclosed. The method includes integrating a file consistency protocol with an operating system (O/S). The file consistency protocol maintains file consistency of copies of the file associated with the file system with a set of consistency bits. As attributes of the file or data within the file are changed, a bit within the set of consistency bits is set on an instance file to reflect the change performed to the file. The change is made in accordance with the file consistency protocol that operates based on the settings of the set of consistency bits. After the changes are made to the file, the instance file is updated using the file consistency protocol such that the instance file obtains the change made to the file.
In yet another embodiment, a method for maintaining file consistency between multiple copies of a file is disclosed. The method includes associating a file consistency protocol with an operating system (O/S). The file consistency protocol maintains file consistency between the multiple copies of the file associated with a file system using a set of file consistency protocol bits. The method further includes performing an action on the file to change the file and determining the type of action performed on the file. A bit in the set of file consistency protocol bits in the multiple copies of the file is set such that the set bits reflect the action performed on the file. The method then proceeds to communicate an update to the multiple copies of the file having the set bit such that the multiple copies of the file contain the change performed on the file.
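As a non-limiting sketch, the step of determining the type of action and setting a corresponding bit in each copy might look like the following. The action names ("write", "append", "chmod", "chown"), the two bit values, and the helper functions classify and mark_copies are assumptions made purely for illustration; the summary does not specify which actions map to which bits.

```python
# Hypothetical bit assignments; not taken from the source.
DATA_BIT = 0b01  # data within the file changed
ATTR_BIT = 0b10  # attributes of the file changed

# Hypothetical mapping from the type of action to the bit that records it.
ACTION_BITS = {
    "write": DATA_BIT,
    "append": DATA_BIT,
    "chmod": ATTR_BIT,
    "chown": ATTR_BIT,
}


def classify(action: str) -> int:
    """Determine the type of action by returning the bit that records it."""
    try:
        return ACTION_BITS[action]
    except KeyError:
        raise ValueError(f"unknown action: {action!r}") from None


def mark_copies(copy_bits: dict, actor: str, action: str) -> None:
    """Set the bit for `action` on every copy except the one that performed it."""
    bit = classify(action)
    for name in copy_bits:
        if name != actor:
            copy_bits[name] |= bit
```

Here copy_bits maps a copy's identifier to its current bit set, so a copy that has missed both a data change and an attribute change ends up with both bits set and can tell which kinds of update it still needs.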
In still another embodiment, a method for maintaining file consistency between multiple copies of a file is disclosed. The method includes integrating a set of file consistency protocol bits into a file system that manages the file and the multiple copies of the file. An action is then performed on the file to change the file. The method sets a bit in the set of file consistency protocol bits in the multiple copies of the file such that the set bits reflect the action performed on the file. An update is communicated to the multiple copies of the file having the set bit such that the multiple copies of the file contain the change performed on the file.
In a further embodiment of the present invention, a file consistency protocol, which is integrated with an operating system, maintains file consistency of files managed by a file system. The file consistency protocol includes a set of file consistency bits that correspond to actions performed on a file managed by the file system. The set of file consistency bits is configured to be set when one of the actions is performed on the file. Furthermore, the file consistency protocol includes a plurality of instances of the file where the set of file consistency bits is set for each of the plurality of instances of the file such that each of the instances is updated when the actions change the file.
The advantages of the present invention are numerous. Most notably, embodiments of the present invention enable consistency of file content between shared files (e.g., file instances) being managed by a file system. Thus, when multiple copies of a file are in use and one of the copies is changed, the file consistency protocol ensures that the other copies of the file will also be updated (in accordance with a defined bit map code). In addition, the present invention allows for an open shared mode to enable multiple users to view copies of the file simultaneously, while at the same time ensuring that consistency is maintained as changes are made by users of instances of the shared file.