The present invention relates to data compression and more particularly to embedding meta-data in a compression stream.
Data compression systems are known in the prior art that encode a stream of digital data signals into compressed digital code signals and decode the compressed digital code signals back into the original data. Data compression refers to any process that attempts to convert data in a given format into an alternative format requiring less space than the original. The objective of data compression systems is to effect a savings in the amount of storage required to hold or the amount of time required to transmit a given body of digital information.
To be of practical utility, a general purpose digital data compression system should satisfy certain criteria. The system should have reciprocity. In order for a data compression system to possess the property of reciprocity it must be possible to re-expand or decode the compressed data back into its original form without any alteration or loss of information. The decoded and original data must be identical and indistinguishable with respect to each other. The property of reciprocity is synonymous with that of strict noiselessness used in information theory. Some applications do not require strict adherence to the property of reciprocity. One such application in particular is the handling of graphical data. Because the human eye is not very sensitive to noise, some alteration or loss of information during the compression and de-compression process is acceptable.
The system should provide sufficient performance with respect to the data rates provided by and accepted by the devices with which the data compression and de-compression systems are communicating. The rate at which data can be compressed is determined by the input data processing rate into the compression system, typically in millions of bytes per second (megabytes/sec). Sufficient performance is necessary to maintain the data rates achieved in present day disk, tape and communication systems, which typically exceed one megabyte/sec. Thus, the data compression and de-compression system must have enough data bandwidth so as to not adversely affect the overall system. The performance of data compression and de-compression systems is typically limited by the computations necessary to compress and de-compress and by the speed of the system components, such as random access memory (RAM) and the like, utilized to store statistical data and guide the compression and de-compression process. Performance for a compression device is characterized by the number of processor cycles required per input character processed by the compressor. The fewer the number of cycles, the higher the performance.
Another important criterion in the design of data compression and decompression systems is compression effectiveness, which is characterized by the compression ratio. The compression ratio is the ratio of data size in uncompressed form divided by the size in compressed form. In order for data to be compressible, the data must contain redundancy. Compression effectiveness is determined by how effectively the compression procedure uses the redundancy in the input data. In typical computer stored data, redundancy occurs both in the nonuniform usage of individual symbols, for example digits, bytes, or characters, and in frequent recurrence of symbol sequences, such as common words, blank record fields and the like.
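As an illustration of the compression ratio defined above, the following minimal Python sketch measures it for a highly redundant input. The use of the standard zlib module (a DEFLATE implementation) and the function name compression_ratio are assumptions made for the example only; the text does not prescribe any particular compressor.

```python
import zlib


def compression_ratio(data: bytes) -> float:
    """Size in uncompressed form divided by size in compressed form."""
    compressed = zlib.compress(data)
    return len(data) / len(compressed)


# A recurring symbol sequence, standing in for blank record fields.
redundant = b"blank field " * 1000

# Redundant input yields a ratio well above 1.
print(round(compression_ratio(redundant), 1))
```

A ratio greater than 1 means the compressed form is smaller; a ratio below 1 means the compressor expanded the data.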
General purpose data compression procedures are also known in the prior art, three relevant procedures being the Huffman method, the Tunstall method and the Lempel-Ziv method. The Huffman method is widely known and used; reference thereto may be found in the article of D. A. Huffman entitled "A Method For Construction Of Minimum Redundancy Codes", Proceedings IRE, 40, 10, pages 1098-1100 (September 1952). Reference to the Tunstall algorithm may be found in the Doctoral thesis of B. P. Tunstall entitled "Synthesis of Noiseless Compression Codes", Georgia Institute of Technology (September 1967). Reference may be had to the Lempel-Ziv procedure in a paper authored by J. Ziv and A. Lempel entitled "A Universal Algorithm For Sequential Data Compression", IEEE Transactions on Information Theory, IT-23, 3, pages 337-343 (May, 1977).
Redundant arrays of inexpensive or independent data storage devices (RAID) are being employed by the mass storage industry to provide variable capacity data storage. RAID systems use interconnected disk drives to achieve the desired capacity of mass storage. With this approach, a disk drive of one capacity may be manufactured and packaged with the same or different capacity drives to provide the required storage capacity. RAID systems eliminate the need to manufacture disk drives individually designed to meet specific storage requirements. Each disk drive in a RAID system is usually housed in an individual module for handling and installation. The modules slide into and out of a larger enclosure that houses the array of disk drives and provides the sockets, plug-ins and other connections for the electrical interconnection of the drives. Controllers orchestrate the interconnection and control access to selected disk drives for data reading and writing operations.
A RAID system is an organization of data in an array of data storage devices, such as hard disk drives, to achieve varying levels of data availability and system performance. Data availability refers to the ability of the RAID system to provide data stored in the array of data storage devices even in the event of the failure of one or more of the individual data storage devices in the array. A measurement of system performance is the rate at which data can be sent to or received from the RAID system.
Of the five basic architectures developed for RAID systems, RAID 1 and RAID 5 architectures are most commonly used. A RAID 1 architecture involves an array having a first set of data storage devices with a second set of data storage devices which duplicates the data on the first set. In the event of the failure of a data storage device, the information is available from the duplicate device. The obvious drawback of this RAID system implementation is the necessity of doubling the storage space.
A RAID 5 architecture provides for redundancy of data by generating parity data. Each of the data storage devices is segmented into a plurality of units of data, known as blocks, containing equal numbers of data words. Blocks from each data storage device in the array covering the same data storage device address range form what are referred to as "stripes". A parity block is associated with each stripe. The parity block is generated by performing successive exclusive OR operations between corresponding data words in each of the data blocks. Changes to data blocks in a stripe necessitate re-computation of the parity block associated with the stripe. In a RAID 4 system, all parity blocks are stored on a single unit in the array. As a result, the data storage device containing the parity blocks is accessed disproportionately relative to the other data storage devices in the array. To eliminate the resulting constriction of data flow in a RAID 4 system, a RAID 5 architecture distributes the parity blocks across all of the data storage devices in the array. Typically in a RAID 5 system, a set of N+1 data storage devices forms the array. Each stripe has N blocks of data and one block of parity data. The block of parity data is stored in one of the N+1 data storage devices. The parity blocks corresponding to the remaining stripes of the RAID system are stored across the data storage devices in the array. For example, in a RAID 5 system using five data storage devices, the parity block for the first stripe of blocks may be written to the fifth device; the parity block for the second stripe of blocks may be written to the fourth drive; the parity block for the third stripe of blocks may be written to the third drive; etc. Typically, the location of the parity block in the array for succeeding blocks shifts to the succeeding logical device in the array, although other patterns may be used.
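The parity generation and rotation just described can be sketched in a few lines of Python. The function names xor_blocks and parity_device, the zero-based device numbering, and the specific rotation formula are assumptions chosen to reproduce the five-device example above; other rotation patterns are equally possible.

```python
from functools import reduce


def xor_blocks(blocks):
    """Exclusive OR of corresponding bytes across equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_device(stripe: int, n_devices: int) -> int:
    """Zero-based device holding the parity block of a stripe.

    Hypothetical rotation: stripe 0 uses the last device, stripe 1 the
    one before it, and so on, wrapping around.
    """
    return (n_devices - 1 - stripe) % n_devices


# One stripe with N = 4 data blocks (a fifth device holds the parity).
data = [b"\x01\x02", b"\x10\x20", b"\xff\x00", b"\x0f\xf0"]
parity = xor_blocks(data)

# If the device holding data[0] fails, XOR of the survivors rebuilds it.
rebuilt = xor_blocks([parity] + data[1:])
print(rebuilt == data[0])
print([parity_device(s, 5) for s in range(3)])
```

Because exclusive OR is its own inverse, the same routine serves both to generate parity and to reconstruct a missing block.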
More information detailing the architecture and performance of RAID systems can be found in the RAID Book: A Source Book for RAID Technology, by the RAID Advisory Board, 1993, the disclosure of which is incorporated herein by reference.
When data stored in the N+1 storage devices of the RAID 5 array is modified, the parity block for the stripe in which the data is located must also be modified. This modification process can occur through what is known as a "read-modify-write" sequence or a "write in place" sequence. In a read-modify-write sequence, the parity block is recomputed through a process of performing the exclusive OR operation between corresponding words of the data blocks forming the stripe.
A write in place sequence recomputes the parity block by removing the effect of the data currently contained in the storage locations which will be modified from the parity block and then adding the effect of the new data to the parity block. To perform a write in place sequence, the data presently stored in the data blocks having the storage locations which will be modified is read. The corresponding portion of the parity block of the stripe containing the storage locations which will be modified is read. The exclusive OR operation is performed between the data presently stored in the data blocks and the corresponding portion of the parity block to remove the effect of the presently stored data on the parity block. The exclusive OR operation is then performed between the new data and the result of the previous exclusive OR operation. This result is then stored on the data storage devices in the corresponding locations from which the portion of the parity block was loaded and the new data is stored in the stripe.
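The two parity update sequences can be compared with a short Python sketch. The helper names below are hypothetical, and the sequence names follow the usage in the text above: the first function recomputes parity from every data block of the stripe, while the second removes the effect of the old data from the parity and then adds the effect of the new data.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Exclusive OR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def recompute_parity(stripe_blocks):
    """Full recomputation: XOR together all data blocks of the stripe."""
    parity = bytes(len(stripe_blocks[0]))
    for block in stripe_blocks:
        parity = xor_bytes(parity, block)
    return parity


def update_parity_in_place(old_parity, old_data, new_data):
    """Remove the old data's effect from the parity, then add the new data's."""
    without_old = xor_bytes(old_parity, old_data)
    return xor_bytes(without_old, new_data)


stripe = [b"\x01\x02\x03", b"\x10\x20\x30", b"\xaa\xbb\xcc"]
old_parity = recompute_parity(stripe)
new_block = b"\x00\xff\x0f"

# Both sequences must arrive at the same parity block.
incremental = update_parity_in_place(old_parity, stripe[1], new_block)
full = recompute_parity([stripe[0], new_block, stripe[2]])
print(incremental == full)
```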
Efficiency considerations determine which one of these methods of parity block computation will be used. The factors used to determine which of these methods of parity block generation is most efficient vary with the configuration of the RAID system and the data blocks which are being modified. For example, if there are a large number of data storage devices which store portions of the stripe and changes have been made to data blocks which involve only a few of the data storage devices, the most efficient parity block re-computation method may be write in place. However, if a relatively large fraction of data storage devices are involved in the changes to the data blocks, the most efficient parity block re-computation method may be read-modify-write. The firmware controlling the operation of the RAID system determines the most efficient parity block re-computation method for each data transfer to the array.
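A minimal sketch of such a decision is given below. It assumes a deliberately simplified cost model in which only the number of blocks that must be read from the array is counted; real firmware would weigh further factors not described here. The function name choose_parity_method and the tie-breaking rule are assumptions.

```python
def choose_parity_method(n_data_blocks: int, n_modified: int) -> str:
    """Pick the parity re-computation method needing fewer block reads.

    Assumed cost model: only reads from the array are counted.
    """
    if not 1 <= n_modified <= n_data_blocks:
        raise ValueError("n_modified must be between 1 and n_data_blocks")
    # Write in place: read each old data block being replaced, plus the old parity.
    in_place_reads = n_modified + 1
    # Read-modify-write: read the unmodified blocks; the new data is already in hand.
    recompute_reads = n_data_blocks - n_modified
    if in_place_reads <= recompute_reads:
        return "write in place"
    return "read-modify-write"


print(choose_parity_method(8, 1))
print(choose_parity_method(8, 7))
```

Under this model, a change touching one of eight data blocks favors write in place, while a change touching seven of eight favors read-modify-write, in line with the example in the preceding paragraph.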
Low cost RAID systems can be implemented by using software installed in the host computer system to perform the RAID system management functions. For this type of RAID system the host computer system manages the distribution of data blocks and parity blocks across an array of data storage devices and performs the parity block computation. As expected, this low cost RAID system implementation results in a significant reduction in the ability of the host computer system to perform its other data processing operations. High performance RAID systems use a dedicated controller to manage the data block and parity block storage in the array of data storage devices. For these high performance RAID systems, the host computer is able to interact with the RAID system as a single data storage unit.
Within the category of high performance RAID systems using controllers, there are low cost controllers and high performance controllers. Low cost controllers use the microprocessor on the controller to perform many of the data manipulation tasks. This implementation trades off performance of the RAID system controller to reduce the cost of the controller. High performance controllers utilize dedicated hardware, such as a state machine, a dedicated microprocessor, or data compression engines, to more rapidly perform many data manipulation tasks.
There are many problems that arise when attempting to compress data stored on a storage system. Perhaps one of the most significant contributors to the problems is the unpredictable compression ratio from lossless algorithms. To illustrate this issue, FIG. 1 shows several blocks 121 on the left, none of which is compressed. The right side shows their compressed result 122.
Note that in one case (124), the compression operation actually expanded the block. This is one fundamental aspect of compressing data in a lossless fashion: the result can take more space than the original. When designing a system, this condition must be dealt with for optimal compression performance.
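The expansion case is easy to reproduce. The sketch below, which assumes the standard Python zlib module as a stand-in for any lossless compressor, compresses one 512-byte block of pseudo-random bytes and one 512-byte block of zeros. The helper name stored_size and the 512-byte block size are assumptions for the example.

```python
import random
import zlib


def stored_size(block: bytes) -> int:
    """Number of bytes the block occupies after lossless compression."""
    return len(zlib.compress(block))


rng = random.Random(0)  # fixed seed so the run is repeatable
noise = bytes(rng.getrandbits(8) for _ in range(512))  # no redundancy to exploit
zeros = bytes(512)                                     # maximal redundancy

# The incompressible block grows; the redundant block shrinks.
print(stored_size(noise) > len(noise))
print(stored_size(zeros) < len(zeros))
```

A block with no redundancy still has to carry the compressor's framing overhead, which is why its compressed form is larger than the original.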
Now, envision each of the blocks being some set number of sectors on the disk drive or tape drive. In order to optimize the space consumed by the compressed data, the compressed blocks should be placed back to back on the disk storage regardless of the sector boundaries.
Assume that each of the uncompressed blocks represents a sector on a disk drive. There would be a total of 12 sectors used to store 12 sectors worth of data. The compressed data is stored back to back, virtually ignoring all disk sector boundaries, as shown in FIG. 2. In a RAID (Redundant Array of Independent Disks) system, it actually wouldn't be out of the question to view each vertical column as a disk. The compressed image could span the boundary between two disk drives. Remember that one block that didn't compress? That block (124) has been stored in a different region since the uncompressed version required less disk space.
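The effect of back-to-back placement can be sketched as follows. The function name pack_back_to_back, the 512-byte sector size and the sample sizes are assumptions for illustration; the sketch only computes where each compressed record would start and how many sectors the packed image would occupy.

```python
SECTOR_SIZE = 512  # assumed sector size in bytes


def pack_back_to_back(compressed_sizes):
    """Return (byte offset of each record, sectors consumed) for packed storage."""
    offsets = []
    position = 0
    for size in compressed_sizes:
        offsets.append(position)  # record starts wherever the previous one ended
        position += size
    sectors = -(-position // SECTOR_SIZE)  # ceiling division
    return offsets, sectors


offsets, sectors = pack_back_to_back([300, 200, 512, 100])
print(offsets)   # [0, 300, 500, 1012]
print(sectors)   # 3
```

Four records that would occupy four sectors when sector-aligned fit in three when packed, but most of them no longer begin on a sector boundary, so their starting byte addresses must be known or discoverable.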
While compression ratio is maximized by storing the data in such a compact fashion, this type of organization may cause several problems. First, a compression engine as known in the art, whether hardware (used exclusively if performance is even a remote concern) or software, will create the compressed image. To reverse this process, a decompression engine must be given the compressed record. In fact, prior to the present invention, the decompression engine must be given the compressed record and nothing but the compressed record. That implies that there is some knowledge of the exact starting location of the compressed record. Storing the byte address of each and every compressed record in the system can be costly if done in SDRAM and very slow if done on disk.
Another problem shows up when users start to modify the data already stored on the disks. If a user modifies only a portion of the record, it is very desirable to only modify the portion of the record that was affected. The update of the data in place presents a problem when looking at traditional compression algorithms. Lossless adaptive compression algorithms will build a history of the data as the data is compressed. When information is gathered during the compression process, that information is used to improve compression of the remainder of the record. Therefore, each byte of data compressed depends on all of the previous bytes of data in the record. Given this attribute of lossless adaptive compression techniques, the modification of the compressed data record in place needs a different mode of operation.
As mentioned earlier, the compressed data records will not have a predictable compression ratio. This will present a problem for modifications that are made in place. There is no space if the new compressed data is a little larger than the previously compressed data. If it shrinks, there is wasted space.
In order to accomplish the present invention there is provided a method for compressing data. The uncompressed data is received. A first meta-data marker is placed in the compressed data. The first meta-data marker indicates the beginning of a data record. Compression of the uncompressed data is started. Compression may be by any algorithm. After a predefined amount of the uncompressed data is compressed, a second meta-data marker is inserted in the compressed data. The second meta-data marker indicates the beginning of a decompressible data block. After a predefined amount of the compressed data is created, a third meta-data marker is inserted in the compressed data, where the third meta-data marker identifies an expansion joint in the compressed data. There is also a fourth meta-data marker that indicates to the decompressor that the identified data is to be ignored.
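The compression method summarized above can be illustrated with a simplified Python sketch. Everything concrete in it is an assumption, since the summary does not specify an encoding: the one-byte tag values, the framing of each marker as a tag followed by a four-byte length, the use of zlib for each block, and the parameter names block_size, joint_every and joint_size. Because the length-prefixed framing makes marker boundaries unambiguous, the sketch does not need the fourth (ignore) marker.

```python
import struct
import zlib

# Hypothetical one-byte tags; the text does not define any marker encoding.
RECORD_START, BLOCK_START, EXPANSION_JOINT = 0x01, 0x02, 0x03


def chunk(tag: int, payload: bytes = b"") -> bytes:
    """Frame a marker as tag + 4-byte big-endian payload length + payload."""
    return struct.pack(">BI", tag, len(payload)) + payload


def compress_record(data: bytes, block_size: int = 4096,
                    joint_every: int = 1024, joint_size: int = 64) -> bytes:
    """Compress one record, embedding the three kinds of markers."""
    out = bytearray(chunk(RECORD_START))  # first marker: beginning of the record
    since_joint = 0
    for start in range(0, len(data), block_size):
        # Each block is compressed on its own so it can be decompressed on its own.
        packed = zlib.compress(data[start:start + block_size])
        out += chunk(BLOCK_START, packed)  # second marker: decompressible block
        since_joint += len(packed)
        if since_joint >= joint_every:
            # Third marker: reserve slack for later in-place modification.
            out += chunk(EXPANSION_JOINT, bytes(joint_size))
            since_joint = 0
    return bytes(out)


stream = compress_record(b"abc" * 5000)
print(len(stream) < 15000)
```

Compressing each block independently costs some compression ratio, since no history is carried across the block boundary, but it is what allows decompression to begin at any second marker.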
There is also provided a method to merge new data with already compressed data. First, the location of the additional data in the compressed data is estimated. Then the compressed data is searched for a second meta-data marker. The additional data is merged with the present compressed data. If the merged data requires more space, then the size of the expansion joint is decreased; alternatively, if the merged data requires less space, then the size of the expansion joint is increased.
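The expansion joint bookkeeping in the last step can be sketched as below. The function name resize_joint is hypothetical, and raising an error when the joint is exhausted is an assumption; the summary does not say how that case is handled.

```python
def resize_joint(old_len: int, new_len: int, joint_size: int) -> int:
    """Return the expansion joint size after a merge.

    old_len and new_len are the compressed sizes of the affected block
    before and after the merge. The joint absorbs the difference.
    """
    new_joint = joint_size - (new_len - old_len)
    if new_joint < 0:
        # Assumed behavior: the merged data does not fit in the reserved space.
        raise ValueError("merged data exceeds the available expansion joint")
    return new_joint


print(resize_joint(100, 110, 64))  # merged data grew by 10: joint shrinks to 54
print(resize_joint(100, 90, 64))   # merged data shrank by 10: joint grows to 74
```

Because the joint takes up or gives back exactly the change in size, the data that follows it never has to move.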
There is further provided an ignore meta-data marker used to instruct the decompressor that the data following it, which would otherwise be read as meta-data, is actual compressed data and not meta-data.
There is also provided a decompressor to decompress the compressed data and extract all meta-data. The compressed data is searched for the first meta-data marker. Starting at the first meta-data marker, the compressed data is decompressed by a decompression engine. If a second meta-data marker is found in the compressed data, then the decompression engine is reset. If a third meta-data marker is found in the compressed data, then the decompression engine is instructed to skip the third meta-data marker.
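A decompressor of this kind might be sketched as follows. The tag values, the tag-plus-length framing, and the use of zlib are assumptions made for the example and are not taken from the text; in particular, the framing lets the sketch step from marker to marker instead of scanning raw bytes.

```python
import struct
import zlib

# Hypothetical one-byte tags; the text does not define any marker encoding.
RECORD_START, BLOCK_START, EXPANSION_JOINT = 0x01, 0x02, 0x03


def chunk(tag: int, payload: bytes = b"") -> bytes:
    """Frame a marker as tag + 4-byte big-endian payload length + payload."""
    return struct.pack(">BI", tag, len(payload)) + payload


def decompress_record(stream: bytes) -> bytes:
    """Walk the markers in a stream and return the decompressed record."""
    out = bytearray()
    pos, in_record = 0, False
    while pos < len(stream):
        tag, length = struct.unpack_from(">BI", stream, pos)
        payload = stream[pos + 5:pos + 5 + length]
        pos += 5 + length
        if tag == RECORD_START:
            in_record = True               # first marker found: decoding starts here
        elif not in_record:
            continue                       # still searching for the record start
        elif tag == BLOCK_START:
            engine = zlib.decompressobj()  # second marker: reset the engine
            out += engine.decompress(payload)
        elif tag == EXPANSION_JOINT:
            continue                       # third marker: skip the reserved space
        else:
            raise ValueError(f"unknown marker tag {tag:#x}")
    return bytes(out)


stream = (chunk(RECORD_START)
          + chunk(BLOCK_START, zlib.compress(b"hello "))
          + chunk(EXPANSION_JOINT, bytes(16))
          + chunk(BLOCK_START, zlib.compress(b"world")))
print(decompress_record(stream))  # b'hello world'
```

Creating a fresh decompression object at each block marker plays the role of resetting the engine, so no history from an earlier block is needed to decode a later one.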