Information storage systems typically store large quantities of information as cheaply and with as fast an access time as possible. However, data storage devices that provide the most rapid access to stored information are also the most expensive per byte stored. Consequently, most information storage systems include a hierarchical multilevel storage system. FIG. 1 shows an exemplary information storage system 1 in which the computer 5 is connected to the multilevel disk store 9. The multilevel disk store is an example of a multilevel storage system. The computer includes the central processing unit (CPU) 3, the random-access memory (RAM) 7 and the I/O interface 11 interconnected by the bus 13. The multilevel disk store is connected to the I/O interface 11 in the computer by the data link 15.
In the information storage system shown in FIG. 1, the computer 5 stores a relatively small fraction of the total amount of the data on which it operates in the fast, expensive RAM 7. The computer stores a majority of the data on which it operates in the multilevel disk store 9, and transfers data back and forth between the multilevel disk store and the RAM as it performs computations and saves the results of such computations. The multilevel disk store accesses data considerably more slowly than the RAM 7, but the cost of storing data in the multilevel disk store is much less than storing data in RAM.
FIG. 2 shows an example of the structure of the multilevel disk store 9 of the above-described information storage system 1. In the multilevel disk store, data are stored by storage devices at two different levels. The disk drives 21₁, 21₂ ... 21ₙ constitute the lower-level storage device 21. Alternatively, a single disk drive may be used as the lower-level storage device. A cache memory constitutes the upper-level storage device 27. The cache memory increases the rate at which data stored in the multilevel disk store 9 can be transferred to the computer 5, and increases the rate at which the multilevel disk store can accept data transferred from the computer. The upper-level storage device is interposed between the I/O interface 29 and the lower-level storage device. The multilevel disk store transfers data to and receives data transferred from the computer 5 via the data link 15. The data link is connected between the I/O interface 29 and the I/O interface 11 in the computer.
The storage level at which a given block of data is stored in the multilevel disk store 9 depends on the frequency with which the computer demands access to read the data or to update the data. The more frequently-accessed data blocks are stored in the cache memory constituting the upper-level storage device 27, while the less frequently-accessed data blocks are stored in the disk drives 21₁, 21₂ ... 21ₙ constituting the lower-level storage device 21. Since the frequency with which the computer 5 demands access to at least some of the data blocks changes with time, the multilevel disk store changes the level at which such data blocks are stored in response to information indicating the frequency of access for the blocks.
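The frequency-based placement described above can be illustrated by the following minimal Python sketch. It is not taken from any conventional system. The class name `TwoLevelStore`, the parameter `cache_capacity` and the promotion rule (swap a block into the cache when it has been accessed more often than the least-accessed cached block) are assumptions made only for illustration.

```python
from collections import Counter


class TwoLevelStore:
    """Toy two-level store: a small cache (upper level) above a dict
    standing in for the disk drives (lower level). Hypothetical model."""

    def __init__(self, cache_capacity):
        self.cache_capacity = cache_capacity  # number of blocks the cache can hold
        self.cache = {}                       # upper-level storage device
        self.disk = {}                        # lower-level storage device
        self.access_count = Counter()         # per-block access frequency

    def write(self, block_id, data):
        self.access_count[block_id] += 1
        if block_id in self.cache:
            self.cache[block_id] = data
        else:
            self.disk[block_id] = data
            self._rebalance(block_id)

    def read(self, block_id):
        self.access_count[block_id] += 1
        if block_id in self.cache:
            return self.cache[block_id]
        data = self.disk[block_id]
        self._rebalance(block_id)
        return data

    def _rebalance(self, block_id):
        """Move block_id to the cache if there is room, or if it is accessed
        more often than the least-accessed block currently in the cache."""
        if self.cache_capacity == 0:
            return
        if len(self.cache) < self.cache_capacity:
            self.cache[block_id] = self.disk.pop(block_id)
            return
        coldest = min(self.cache, key=lambda b: self.access_count[b])
        if self.access_count[block_id] > self.access_count[coldest]:
            self.disk[coldest] = self.cache.pop(coldest)    # demote
            self.cache[block_id] = self.disk.pop(block_id)  # promote
```

In this sketch a block changes level only when its access count overtakes that of a cached block, which mirrors the statement that the level of a block changes as its frequency of access changes over time.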
Data compression is conventionally used in multilevel storage systems to achieve two purposes, namely, to decrease the data traffic at the lower-level storage device, and to increase the amount of data that can be stored in the system. In conventional multilevel storage systems that use data compression, the blocks of data are compressed at a particular level in the system. Once compressed, the blocks of data are kept in their compressed state at all storage levels below and including the storage level at which they were compressed. Typically, data compression is not applied to the blocks of data stored in the RAM 7 of the computer 5 since the time required to compress and expand the blocks of data would significantly increase the time required to store data in, and retrieve data from, the RAM. However, because the access times between the computer 5 and the multilevel storage system are longer than the computer's internal access times, and because the data transfer rates between the computer and the multilevel storage system are lower than the computer's internal data transfer rates, applying data compression to the multilevel storage system does not significantly impair the average time required to transfer data to and from the multilevel storage system.
When data compression is applied to the data stored in the multilevel disk store 9 in the information storage system 1, the compression/expansion engine 31 is interposed between the upper-level storage device 27 and the I/O interface 29. The compression/expansion engine compresses the blocks of data entering the multilevel disk store 9 through the I/O interface 29, and expands the blocks of compressed data read from the upper-level storage device 27 or the lower-level storage device 21 prior to such blocks being transferred to the computer 5. Consequently, the compression applied to the data stored in the multilevel storage system is transparent to the computer and to clients connected to the computer. Blocks of data are stored in the compressed state in the upper-level storage device 27. Moreover, the blocks of data are kept in their compressed state when they are moved to the lower-level storage device 21. Compressing the blocks of data helps achieve the two above-stated goals of minimizing data traffic and maximizing the amount of data that can be stored in the multilevel storage system.
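The transparency of this arrangement can be illustrated by the following minimal Python sketch. The class `CompressedTwoLevelStore` and its method names are hypothetical, and the standard-library `zlib` codec merely stands in for the compression/expansion engine 31, whose actual algorithm is not specified here.

```python
import zlib


class CompressedTwoLevelStore:
    """Toy model: blocks are compressed on entry and stay compressed
    at both storage levels. Names are hypothetical."""

    def __init__(self):
        self.upper = {}  # upper-level storage device (cache), compressed blocks
        self.lower = {}  # lower-level storage device (disk), compressed blocks

    def write(self, block_id, data):
        # Compress once, on the way in through the I/O interface.
        self.lower.pop(block_id, None)  # discard any stale lower-level copy
        self.upper[block_id] = zlib.compress(data)

    def demote(self, block_id):
        # The block moves down a level in its compressed state, unchanged.
        self.lower[block_id] = self.upper.pop(block_id)

    def read(self, block_id):
        # Expand on the way out, whichever level holds the block.
        compressed = self.upper.get(block_id)
        if compressed is None:
            compressed = self.lower[block_id]
        return zlib.decompress(compressed)
```

A caller of `write` and `read` sees only uncompressed blocks, which corresponds to the compression being transparent to the computer and its clients.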
The ability to apply data compression to the data stored in the multilevel storage system without impairing the data transfer rate between the computer 5 and the multilevel storage system is predicated on a wide disparity between the internal data transfer rate of the computer and the data transfer rate between the computer and the multilevel disk store 9. Moreover, using data compression requires the use of complex space management algorithms in the lower-level storage device of the multilevel storage system to cope with the inherent variability in the size of the blocks of compressed data that result from compressing fixed-size blocks of uncompressed data of different data content. Examples of such algorithms are described by F. Douglis in The Compression Cache: Using On-Line Compression to Extend Physical Memory, WINTER 1993 USENIX CONFERENCE, pp. 519-529 (January 1993).
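The variability in compressed block size can be seen in the following short Python sketch. The block size of 4096 bytes, the three sample blocks and the use of the standard-library `zlib` codec are assumptions chosen only for illustration.

```python
import os
import zlib

BLOCK_SIZE = 4096  # hypothetical fixed size of an uncompressed block, in bytes


def compressed_sizes(blocks):
    """Return the compressed size in bytes of each fixed-size block."""
    return [len(zlib.compress(block)) for block in blocks]


blocks = [
    bytes(BLOCK_SIZE),                  # all zeros: highly compressible
    (b"storage " * 512)[:BLOCK_SIZE],   # repetitive text: compressible
    os.urandom(BLOCK_SIZE),             # random bytes: essentially incompressible
]

sizes = compressed_sizes(blocks)
for block, size in zip(blocks, sizes):
    print(f"{len(block)} bytes uncompressed -> {size} bytes compressed")
```

Every input block has the same size, yet the compressed sizes differ with the data content, so a space manager built around fixed-size blocks must be adapted to place blocks of unpredictable length.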
Recent increases in the operational speeds of storage devices such as disk drives suitable for use as the lower-level storage device 21, and recent reductions in the costs of such storage devices, make the case for using data compression at all levels below a certain level in a multilevel storage system less compelling. Latency, and not the data transfer rate, is becoming the main limitation on the data transfer performance of modern disk drives. Further limitations are imposed by the complexity of space management systems required to manage variable-sized blocks in systems that are designed to operate with fixed-size blocks.
What is needed is a multilevel data storage system in which data compression is applied more effectively than the data compression arrangements of conventional multilevel storage systems.