1. Field of the Invention
The present invention generally relates to data storage systems and methods, and, more particularly, to methodologies for obtaining an internally consistent system-wide file system image in a distributed object-based data storage system.
2. Description of Related Art
With increasing reliance on electronic means of data communication, different models to efficiently and economically store a large amount of data have been proposed. A data storage mechanism requires not only a sufficient amount of physical disk space to store data, but also various levels of fault tolerance or redundancy (depending on how critical the data is) to preserve data integrity in the event of one or more disk failures. One way of providing fault tolerance is to periodically take images or copies of various files stored in the data storage system, so that the file data can be recovered if a disk failure occurs in the system. Thus, imaging is useful in facilitating system backups and related data integrity maintenance activity.
The term “image”, as used hereinbelow, refers to an immutable image or “copy” of some or all content of the file system at some point in time. Further, an image is said to be “internally consistent” or “crash consistent” if it logically occurs at a point in time at which no write activity is occurring anywhere in the data storage system. This guarantees that no files are left in an inconsistent state because of in-flight writes. On the other hand, an image is said to be “externally consistent” if the file system interacts with an external application program to assure that, prior to taking the image, the external program is at a point from which it can be restarted and has flushed all of its buffers to storage. Both internal and external consistency are desirable in order to guarantee that an image represents data from which an application can be reliably restarted.
The term “time smear” as used herein refers to an event where an image of the files in a distributed file processing system is not consistent with regard to the time that each piece of data was copied. In other words, “time smear” is the name for the effect of having two files in the image, wherein the contents of file A represent file A's state at some time T0, and the contents of file B represent file B's state at some other time T1≠T0. Such time smear in the images of different data files may occur when the data files are stored at different storage locations (or disks) and the server taking the image accesses, for example, separate data storage facilities consecutively over a period of time. For example, an image of a first data file in a first data storage facility may be taken at midnight, whereas an image of a second data file in a second data storage facility may be taken at one second after midnight, an image of a third data file in a third data storage facility may be taken at two seconds after midnight, and so on.
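By way of illustration only, the time smear effect described above can be sketched in a few lines of Python. The sketch assumes a hypothetical in-memory stand-in for two storage facilities (the dictionary `stores`) and a hypothetical helper `take_sequential_image`; none of these names comes from any actual system. File A is copied at tick 0, a write then updates both files, and file B is copied at tick 1.

```python
def take_sequential_image(stores, writes_between_copies):
    """Copy each store one after another (hypothetical helper).

    writes_between_copies maps a tick to the writes that land after the
    copy made at that tick and before the next copy.
    """
    image = {}
    for tick, name in enumerate(sorted(stores)):
        # Record when this file was copied and what it contained then.
        image[name] = {"copied_at": tick, "content": stores[name]}
        # Live writes continue while the image is still being taken.
        for target, new_content in writes_between_copies.get(tick, []):
            stores[target] = new_content
    return image


# Two files on two facilities, both initially at version v1.
stores = {"A": "v1", "B": "v1"}
# One update rewrites both files after A is copied but before B is copied.
writes = {0: [("A", "v2"), ("B", "v2")]}

image = take_sequential_image(stores, writes)

# The image is smeared if its files were copied at different times.
smeared = len({entry["copied_at"] for entry in image.values()}) > 1

# The live system only ever held (v1, v1) and then (v2, v2).
live_states = [("v1", "v1"), ("v2", "v2")]
image_state = (image["A"]["content"], image["B"]["content"])
never_existed = image_state not in live_states
```

In this sketch the image pairs file A at version v1 with file B at version v2, a combination that the live system never held at any single point in time.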
Another adverse result of time smear occurs when a single data file is saved on multiple machines or storage disks. In such a situation, if a save operation occurs nearly simultaneously with the image-taking operation, portions of the information contained in the image may correspond to different saved versions of the same file. If that file is then recovered from the image, the recovered file may not be usable because it contains data from different saves, resulting in inconsistent data and causing the file to be potentially corrupt.
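The single-file case can be sketched in the same hedged manner. The example below assumes a hypothetical layout in which one file is split into two stripes on two disks and each stripe is tagged with the version number of the save that wrote it; the helpers `image_during_save` and `is_mixed` are illustrative names only.

```python
from typing import List, Tuple

# Hypothetical stripe layout: (save_version, data).
Stripe = Tuple[int, bytes]


def image_during_save(disks: List[Stripe], new_version: int,
                      new_data: List[bytes]) -> List[Stripe]:
    """Copy stripe 0, let a save overwrite every stripe, then copy the rest."""
    image = [disks[0]]
    for i, chunk in enumerate(new_data):
        disks[i] = (new_version, chunk)
    image.extend(disks[1:])
    return image


def is_mixed(image: List[Stripe]) -> bool:
    """True when the imaged stripes come from more than one save."""
    return len({version for version, _ in image}) > 1


# One file striped across two disks, both stripes written by save 1.
disks = [(1, b"old-head "), (1, b"old-tail")]
image = image_during_save(disks, 2, [b"new-head ", b"new-tail"])

# Reassembling the file from the image joins stripes from different saves.
recovered = b"".join(data for _, data in image)
mixed = is_mixed(image)
```

Here the recovered file joins the first stripe from save 1 with the second stripe from save 2, which is the mixed-version condition described in the preceding paragraph.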
Therefore, it is desirable to devise an image-taking methodology that substantially eliminates the time smear problem associated with prior art imaging mechanisms. To that end, it is desirable to obtain a time-wise consistent image of the entire file system in a distributed file processing environment. It is also desirable to simultaneously store multiple images online and to delete any image without affecting the content or availability of other images stored in the system.