Data compression, for example for archiving to tape, is increasingly important to users. It is vital to ensure that compression and storage operations, and retrieval and decompression operations, are all carried out with the highest integrity, so that the original data presented to the system for archival storage can be retrieved correctly. In other words, the information written to the storage medium should be a correct compressed representation of the original data. Similarly, when this stored information is retrieved and decompressed, no additional errors should be introduced during the decompression. Likewise, any transfer of information from one part of the system to another, or any buffering of information within the system, should not introduce errors.
Providing high data integrity is a problem particularly for "high end" systems, where the data being compressed and stored is valuable or fundamental to a user's operation. Examples of such systems are large scale financial data processing applications such as banking or credit card processing. Users who must store such data are willing to purchase more expensive systems in order to ensure greater integrity of their data.
Although error correcting codes are a conventional method of ensuring that errors are not introduced into data, these codes are ill-suited to providing integrity during compression and decompression. Instead, such codes are primarily used to account for other sources of error, such as defects in the storage medium. Consequently, most error correction codes are merely appended to, and based on, the data input to the storage medium. As a result, if an error occurs in the data compression process, the error correction code is formed using the incorrect compressed information. The error correction code then preserves this error as it is written to and subsequently retrieved from the storage medium.
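The limitation described above can be illustrated with a minimal sketch. It is an illustration under stated assumptions, not a description of any particular system: CRC-32 from Python's `zlib` module stands in for the media-level code (a CRC only detects errors, whereas a true error correcting code also corrects them), and `faulty_compress`, `write_to_medium` and `read_from_medium` are hypothetical names introduced here.

```python
import zlib


def faulty_compress(data: bytes) -> bytes:
    """Hypothetical compressor with a fault: one bit of the last input byte
    is flipped before compression (assumes non-empty input)."""
    corrupted = data[:-1] + bytes([data[-1] ^ 0x01])
    return zlib.compress(corrupted)


def write_to_medium(compressed: bytes):
    """Append a check code computed from the compressed stream only.
    CRC-32 is an assumed stand-in for a media-level error correction code."""
    return compressed, zlib.crc32(compressed)


def read_from_medium(record):
    """Verify the media-level code and return the compressed stream."""
    compressed, code = record
    if zlib.crc32(compressed) != code:
        raise ValueError("media error detected")
    return compressed


original = b"account=12345;balance=1000.00"
record = write_to_medium(faulty_compress(original))

# The media-level check passes, because the code was built on the
# already-incorrect compressed data.
retrieved = zlib.decompress(read_from_medium(record))

# The retrieved data nevertheless differs from what the host supplied.
data_intact = retrieved == original
```

In this sketch the record reads back without any media error being reported, yet `data_intact` is false: the code faithfully protects whatever the compressor produced, including its mistake.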
One conventional method for providing high data integrity is to build a separate data compressor and decompressor, which can then operate at the same time. The data to be stored is provided to the compressor by a host system. A sumcheck is built on this data as it passes into the compressor. As the compressed result is output from the compressor, it is provided both to the decompressor and to the storage medium, such as a tape. The decompressor reconstructs the original data. The reconstructed data is discarded as it is output from the decompressor, but a sumcheck is built on it first. This decompressor sumcheck is then compared with the sumcheck built on the original data fed into the compressor. If the sumchecks agree, there is a very high probability that the data integrity has been preserved throughout the compression/decompression process, and the system can proceed to process more data.
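The conventional scheme just described can be sketched in software as follows. This is only a rough model under assumptions: the text does not name a sumcheck algorithm or a compression method, so CRC-32 and `zlib` are used as stand-ins, the hardware's simultaneous operation is modeled sequentially, and `sumcheck` and `compress_and_verify` are hypothetical names.

```python
import zlib


def sumcheck(data: bytes) -> int:
    """Assumed stand-in for the sumcheck: CRC-32 over the data."""
    return zlib.crc32(data) & 0xFFFFFFFF


def compress_and_verify(original: bytes,
                        compressor=zlib.compress,
                        decompressor=zlib.decompress):
    """Compress `original`, decompress the result, and compare sumchecks.

    Returns (compressed, ok). `compressed` is what would be sent to the
    storage medium; `ok` is True only if the sumcheck of the reconstructed
    data matches the sumcheck of the data fed into the compressor.
    """
    input_check = sumcheck(original)        # built as data enters the compressor
    compressed = compressor(original)       # also goes to the storage medium
    try:
        reconstructed = decompressor(compressed)
    except zlib.error:
        return compressed, False            # stream could not be decompressed
    output_check = sumcheck(reconstructed)  # built on the reconstructed data
    del reconstructed                       # the reconstructed data is discarded
    return compressed, input_check == output_check


# Usage: a correct compressor passes the check.
compressed, ok = compress_and_verify(b"ledger entry " * 50)
```

A compressor that corrupts its input before compressing would produce a stream that decompresses cleanly but yields a different sumcheck, so `ok` would be false and the system could stop before processing more data.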
Note that this conventional system could also be implemented using a buffer. Different sections of this buffer can be used to store data in different states. For example, one section might store uncompressed data as received from the host over the interface, and another section might store compressed data that is waiting to be written to the storage medium. However, such a scheme usually includes error checking to ensure that the data is not lost or corrupted during buffer storage/retrieval operations. In addition, checks must still be performed on compression/decompression operations to ensure their integrity.
Although the conventional system as described above does ensure data integrity, it is rather expensive and only a relatively small number of such systems are expected to be made. A simultaneous compression/decompression capability is only required by this class of product. The same is true for the dual host interface capability. As a result, a highly integrated and specialized chip is required for a relatively low volume product. Most integrated circuit manufacturers are reluctant to produce such components in relatively small quantities.
At the "low end" of the range of such storage systems, a different situation prevails. Although data integrity is still desirable, the prices of such storage systems are much lower. The number of systems sold is orders of magnitude greater than for high end systems. Because of the cost of the conventional chip described above, it cannot be used in these products. In addition, the low end systems typically require only one interface because only one host is attached. Accordingly, what is needed is a system and method for providing low end systems with a compressor, decompressor and single interface, using simpler and less expensive circuitry, while ensuring the data integrity of a high end system. Ideally, this same component could also be used in the high end systems to provide an equivalent or better data integrity level than the conventional solution. It would be desirable for the component in high end systems to include an option to provide multiple interface capabilities. Low end systems could then benefit from the low cost of such a component, with the high end data integrity as an additional competitive advantage. High end systems could also benefit from the lower costs associated with manufacturing economies of scale for the much higher low end system chip volumes. The present invention addresses such a need.