This invention relates to a method and apparatus for data compression and, more generally, to a system in which data is compressed and decompressed.
Various data compression techniques are known in the art. Some techniques are described as lossless because the compressed form of the data can be expanded to a perfect duplicate of the original data. Other techniques, however, are described as lossy because the reconstructed data may differ slightly from the original data.
Lossless data compression techniques are typically used for textual computer files, such as documents composed in English, and for files that may be executed as computer programs, because even the smallest deviation in such a file can corrupt a document or lead to unpredictable results when the program is executed. Lossy techniques are typically used for digitized data, such as digitized photographic images or digitized sound, where small deviations can go unnoticed by a human observer. By permitting some degree of loss in the reconstructed data, a much higher degree of compression is normally achieved.
Many commercial software products employ lossless compression techniques based on a data compression technique described by Jacob Ziv and Abraham Lempel in the IEEE Transactions on Information Theory, Vol. IT-23, No. 3, May 1977. U.S. Pat. No. 4,701,745, issued Oct. 20, 1987 to Waterworth and U.S. Pat. No. 5,016,009, issued May 14, 1991 to Whiting disclose efficient implementations of this technique for use as computer programs. Basically, the Ziv-Lempel technique operates by maintaining a history buffer used to hold a copy of the most recently processed data and by searching the history buffer to locate repetitions of data. Such repetitions are encoded and provided as compressed output data.
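The history-buffer operation described above can be sketched as follows. This is a minimal illustration of an LZ77-style scheme only; the token format, window size, and minimum match length are assumptions for the sketch and are not details taken from the Ziv-Lempel paper or the cited patents.

```python
# Illustrative sketch of an LZ77-style (Ziv-Lempel) compressor.
# WINDOW and MIN_MATCH are assumed values, not taken from the cited patents.
WINDOW = 4096    # size of the history buffer
MIN_MATCH = 3    # shortest repetition worth encoding as a reference

def compress(data: bytes):
    """Encode data as ('lit', byte) and ('ref', distance, length) tokens."""
    tokens, i = [], 0
    while i < len(data):
        best_len, best_dist = 0, 0
        # Search the history buffer for the longest exact repetition.
        for j in range(max(0, i - WINDOW), i):
            length = 0
            while i + length < len(data) and data[j + length] == data[i + length]:
                length += 1
            if length > best_len:
                best_len, best_dist = length, i - j
        if best_len >= MIN_MATCH:
            tokens.append(('ref', best_dist, best_len))  # repetition found
            i += best_len
        else:
            tokens.append(('lit', data[i]))              # no useful repetition
            i += 1
    return tokens

def decompress(tokens) -> bytes:
    """Rebuild the data; byte-by-byte copying handles overlapping matches."""
    out = bytearray()
    for tok in tokens:
        if tok[0] == 'lit':
            out.append(tok[1])
        else:
            _, dist, length = tok
            for _ in range(length):
                out.append(out[-dist])
    return bytes(out)
```

Because the comparison in the inner loop is an exact byte equality test, the round trip is lossless by construction; this exactness requirement is precisely what the following paragraphs identify as the obstacle for digitized analogue data.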
It would be possible to apply a lossless data compression technique, such as the aforesaid Ziv-Lempel technique, to, for example, a digitized electrocardiogram (ECG) waveform; however, very little compression would be achieved because exact repetitions of data would seldom be found. Electrocardiogram signals generally have a recurring shape, but their amplitude may vary slightly over time, within tolerances, and noise accompanying such signals can cause amplitude distortion. Generally, any data that has been obtained from an analogue source and then digitized, such as digitized ECG signals, necessarily contains a noise component: some signal noise is inherent in every analogue recording device, and further noise may be introduced by the digitization process. Variations in amplitude and the presence of noise imply that exact repetitions of sequences of data values are unlikely to occur. Consequently, data obtained from analogue sources, in particular ECG data, cannot be compressed effectively using known Ziv-Lempel techniques, which rely on exact repetitions of data values in the history buffer.
For some applications it may be desirable to modify the Ziv-Lempel technique to enable lossy compression of data from analogue sources by allowing for minor variances in signal amplitude.
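One way such a modification might look, as a hedged sketch only: the exact equality test is replaced by a per-sample amplitude tolerance, and the compressor checks match candidates against its own reconstruction of the history so that the deviation of every decompressed sample remains bounded by the tolerance. The tolerance value, window size, minimum match length, and token format below are all illustrative assumptions, not the claimed method.

```python
# Hypothetical sketch of a lossy, tolerance-based Ziv-Lempel variant.
# WINDOW, MIN_MATCH, and tol are illustrative assumptions.
WINDOW = 4096
MIN_MATCH = 3

def lossy_compress(samples, tol=2):
    """Encode samples as ('lit', value) and ('ref', distance, length) tokens,
    treating history values within +/- tol of the input as repetitions."""
    tokens, recon, i = [], [], 0    # recon mirrors the decompressor's output
    while i < len(samples):
        best_len, best_dist, hist = 0, 0, len(recon)
        for j in range(max(0, hist - WINDOW), hist):
            length = 0
            while (i + length < len(samples)
                   and j + length < hist       # disallow overlap, for simplicity
                   and abs(recon[j + length] - samples[i + length]) <= tol):
                length += 1
            if length > best_len:
                best_len, best_dist = length, hist - j
        if best_len >= MIN_MATCH:
            tokens.append(('ref', best_dist, best_len))
            recon.extend(recon[hist - best_dist : hist - best_dist + best_len])
            i += best_len
        else:
            tokens.append(('lit', samples[i]))
            recon.append(samples[i])
            i += 1
    return tokens

def lossy_decompress(tokens):
    out = []
    for tok in tokens:
        if tok[0] == 'lit':
            out.append(tok[1])
        else:
            _, dist, length = tok
            start = len(out) - dist
            out.extend(out[start : start + length])
    return out
```

Because every referenced history value was checked against the corresponding input sample at the moment it was emitted, each reconstructed sample deviates from the original by at most `tol`, while literals are stored exactly; this is what makes the scheme lossy yet bounded.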