This invention relates to a method and system for providing a backup of data. More particularly, this invention relates to a method and system for providing a backup of data of a memory, in compressed form, to at least one storage medium using a pipelined compress-and-write operation.
A backup or dump in a computing system is the printing or the copying of contents of a volatile memory, such as a random access memory (RAM), to a more permanent storage medium, such as a hard disk. Such a dump is made for purposes such as debugging a program or for providing a backup of operational data during a system crash. For the latter purpose, the length of time taken to perform the dump determines the downtime of the computing system. It is therefore desirable to reduce the time it takes for such a dump so as to reduce the system downtime.
Some computing systems, such as high-end servers, require the backup of a large amount of data, which may be in the range of about 256 Gbytes. With an average dump speed to a hard disk of about 4 Mbytes per second, it takes approximately 18 hours to complete a dump of the data. An 18-hour downtime for a computing system is considered by many to be too long and unacceptable by today's system availability standards. A dump of a selected portion of the data instead of all the data is also not an option in many systems.
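By way of illustration only, the 18-hour figure above can be reproduced with a short calculation. The sketch below assumes that 1 Gbyte equals 1024 Mbytes and that the dump rate is constant; the function name `dump_hours` is hypothetical and is introduced solely for this example.

```python
def dump_hours(size_gbytes: float, rate_mbytes_per_s: float) -> float:
    """Return the hours needed to dump size_gbytes at a constant rate."""
    size_mbytes = size_gbytes * 1024          # assumption: 1 Gbyte = 1024 Mbytes
    seconds = size_mbytes / rate_mbytes_per_s
    return seconds / 3600


hours = dump_hours(256, 4)
print(f"{hours:.1f} hours")  # prints "18.2 hours"
```

Under the same assumptions, the dump time scales linearly with the amount of data written, which is why writing a smaller, compressed image shortens the dump.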
Compressing the data in the memory and dumping the smaller-sized compressed data is one technique that is used for providing a backup of data of the memory. An example of where such a technique is used is disclosed in U.S. Pat. No. 5,734,892 to Chu, entitled “Efficient Method and Apparatus for Access and Storage of Compressed Data.” According to the patent, portions of a data file are compressed until they reach a logical block size which matches a given block size on a storage medium, such as a sector or segment of a hard disk. The portion of compressed data is stored into a sector allocated to it, and a table is built correlating the range of original data to the sector storing the compressed data. In this way, data is initially compressed into a block size which matches the characteristics of the particular storage medium used. Thus the method efficiently stores compressed data by filling allocated sectors. However, such a method requires that compression of data and writing of compressed data to the storage medium occur sequentially. Writing of compressed data commences only after the compressed data reaches the block size. Further compression of data in the data file to produce another similar-sized block of compressed data begins only after the earlier compressed data has been stored to an allocated sector, as shown in FIG. 6 of the patent. Concurrency cannot be exploited using this technique.
According to an aspect of the present invention, there is provided a method of providing a backup of data of a memory portion, by at least one compressor and writer pair, to at least one storage medium having a plurality of segments. The method includes partitioning the memory portion into a number of memory blocks. The compressor compresses data, block by block, to produce compressed data for each block. The writer writes the compressed data for each block to an associated segment of the storage medium. Compressing and writing are synchronized and occur in a pipelined manner with the compressor being able to compress data of a next block without having to wait for the completion of writing of compressed data of an earlier block to the storage medium.
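The pipelined operation of one compressor and writer pair may be sketched as follows. This is a minimal illustration under stated assumptions and not a definitive implementation: the storage medium is modeled as an in-memory dictionary keyed by segment index, zlib stands in for whatever compression scheme is actually used, a bounded queue provides the synchronization between compressor and writer, and the names `pipelined_backup`, `block_size` and `depth` are hypothetical.

```python
import queue
import threading
import zlib


def pipelined_backup(memory: bytes, block_size: int, depth: int = 2) -> dict:
    """Back up `memory` block by block; return {segment index: compressed data}."""
    # Partition the memory portion into fixed-size memory blocks.
    blocks = [memory[i:i + block_size] for i in range(0, len(memory), block_size)]
    segments = {}                         # assumption: stand-in for the storage medium
    handoff = queue.Queue(maxsize=depth)  # synchronizes the compressor and the writer

    def compressor() -> None:
        for index, block in enumerate(blocks):
            # put() waits only when `depth` compressed blocks are already queued,
            # so the next block can be compressed while an earlier one is written.
            handoff.put((index, zlib.compress(block)))
        handoff.put(None)                 # sentinel: no more blocks

    def writer() -> None:
        while (item := handoff.get()) is not None:
            index, compressed = item
            segments[index] = compressed  # write to the segment associated with the block

    threads = [threading.Thread(target=compressor), threading.Thread(target=writer)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return segments


data = bytes(range(256)) * 64             # 16384 bytes of sample data
backup = pipelined_backup(data, block_size=4096)
restored = b"".join(zlib.decompress(backup[i]) for i in sorted(backup))
print(len(backup), restored == data)      # prints "4 True"
```

In this sketch the queue depth bounds how far the compressor may run ahead of the writer; a depth of one or more is sufficient for compression of a next block to overlap with writing of an earlier block.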
According to another aspect of the present invention, there is a program storage device readable by a computing device, tangibly embodying a program of instructions, executable by the computing device to perform the above method for providing a backup of data of a memory portion, by at least one compressor and writer pair, to at least one storage medium including a plurality of segments.
According to yet another aspect of the present invention, there is a system for providing a backup of data. The system includes a memory portion that is partitioned into a plurality of memory blocks, at least one storage medium having a plurality of segments defined thereon, and mapping means for associating each of the plurality of memory blocks with one of the plurality of segments. The system also includes at least one compressor for block-by-block compression of data of the memory blocks and at least one writer associated with the compressor to define a compressor and writer pair. The writer operates in synchronization with the compressor for writing compressed data of each block to an associated segment in a pipelined manner.
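One possible form of the mapping means is a table that associates each memory block with its own segment. The sketch below is hypothetical: it assumes a one-to-one association, contiguous segments of equal size, and a segment large enough to hold the compressed data of its block. The names `Segment` and `build_mapping` are introduced only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    index: int
    offset: int  # byte offset of the segment on the storage medium
    size: int    # capacity of the segment in bytes


def build_mapping(memory_size: int, block_size: int, segment_size: int) -> dict:
    """Associate each memory block index with one segment (one-to-one, by assumption)."""
    block_count = -(-memory_size // block_size)  # ceiling division
    return {
        block: Segment(index=block, offset=block * segment_size, size=segment_size)
        for block in range(block_count)
    }


mapping = build_mapping(memory_size=10_000, block_size=4096, segment_size=4096)
print(len(mapping), mapping[2].offset)  # prints "3 8192"
```

Because each block has its own segment in this sketch, the writer can place compressed data for a block without knowing the compressed sizes of the other blocks.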