1. Technical Field
The present invention relates generally to computer systems and more particularly to memory compression in computer systems.
2. Background Art
A computer system typically includes a processor and volatile memory, which stores data for computer programs (programs) being executed by the computer system. The volatile memory includes a main memory typically using dynamic random-access memory (DRAM) technology. The volatile memory might additionally include caches, typically using static random-access memory (SRAM) technology. The computer system also includes one or more persistent storage devices, such as hard disks, optical storage disks, and magnetic tape drives, which have high storage capacities.
In operation, a computer program reads and/or writes data stored in the persistent storage devices using an Operating System (OS) running on the computer system. The OS invokes appropriate software modules to identify the locations on the persistent storage device where data being accessed is stored. Since access to the persistent storage device is very slow, the OS reads data in large units, known as “data blocks”, and stores them in the main memory. The part of the main memory reserved by the OS for storing these data blocks is known as “buffer cache”. The OS then passes the requested part of the data to the program by copying it into memory spaces of the computer program. The computer program then performs its read and write operations on the data in the main memory. If the computer program subsequently accesses data items in the same data block that were fetched earlier from the persistent storage device and that are still present in the buffer cache, the OS can provide these data items very quickly when compared to having to access the persistent storage device.
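By way of illustration only, the read path described above can be sketched as follows. This is a minimal sketch under stated assumptions, not an actual OS implementation: the 4096-byte block size, the class name BufferCache, and the use of an in-memory bytes object to stand in for the persistent storage device are all hypothetical.

```python
BLOCK_SIZE = 4096  # hypothetical size of one data block, in bytes


class BufferCache:
    """Toy read-through buffer cache (illustrative only)."""

    def __init__(self, device):
        self.device = device      # bytes object standing in for the slow device
        self.blocks = {}          # block number -> cached data block
        self.device_reads = 0     # counts accesses to the slow device

    def _load_block(self, block_no):
        # Fetch one whole data block from the (simulated) persistent device.
        start = block_no * BLOCK_SIZE
        self.device_reads += 1
        return bytes(self.device[start:start + BLOCK_SIZE])

    def read(self, offset, length):
        """Return `length` bytes starting at `offset`, copied out of cached blocks."""
        out = bytearray()
        end = offset + length
        while offset < end:
            block_no, within = divmod(offset, BLOCK_SIZE)
            if block_no not in self.blocks:
                # Block not cached yet: bring the whole block into the cache.
                self.blocks[block_no] = self._load_block(block_no)
            # Copy only the requested part of the block to the caller.
            chunk = self.blocks[block_no][within:within + (end - offset)]
            if not chunk:
                break  # request runs past the end of the device
            out += chunk
            offset += len(chunk)
        return bytes(out)
```

In this sketch, a second read that falls within an already cached data block is served from memory and leaves device_reads unchanged, which corresponds to the fast path described above.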
There are two typical approaches in which the buffer cache can identify the data blocks it stores. The first approach involves using the location of the data blocks on the persistent storage device, such as the physical addresses on a hard disk. The second approach involves using an abstraction provided by a file system. In this case, the identity of the data blocks will be logical addresses, namely the offset of the data blocks from the beginning of the file system.
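The two identification approaches can be pictured as two kinds of lookup key. The sketch below is illustrative only; the class names, field names, and the sample device and file system names are hypothetical and do not come from any particular OS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhysicalBlockId:
    """First approach: identify a block by its location on the device."""
    device: str          # hypothetical device name
    block_address: int   # physical address of the block on that device


@dataclass(frozen=True)
class LogicalBlockId:
    """Second approach: identify a block through the file system abstraction."""
    file_system: str     # hypothetical file system name
    offset: int          # offset of the block from the beginning of the file system


# Frozen dataclasses are hashable, so either kind of identity can serve as a key.
buffer_cache = {
    PhysicalBlockId("disk0", 81920): b"block found by physical address",
    LogicalBlockId("fs0", 4096): b"block found by logical address",
}


def lookup(block_id):
    """Return the cached data block for this identity, or None if it is absent."""
    return buffer_cache.get(block_id)
```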
The buffer cache has limited storage capacity compared to the large amounts of data that the computer programs can operate on for two main reasons. First, the buffer cache is part of the main memory, which is more expensive and hence has less storage capacity available than the persistent storage device. Second, the OS allocates only a part of the main memory to the buffer cache. Therefore, when working with large amounts of data, a program often finds that the data blocks it needs to read or write are not present in the buffer cache. This occurrence is referred to as a “miss” in the buffer cache as opposed to a “hit” when the data block is actually present. The miss forces the OS to temporarily stall the program while the data blocks are brought into the buffer cache from the persistent storage device.
Since the buffer cache has only a small storage capacity compared to the amount of data in the persistent storage device that will be accessed by the program, the buffer cache often runs out of storage space. Therefore, a replacement algorithm must be run to decide which data blocks in the buffer cache must be replaced to bring in the required data blocks from the persistent storage device. A simple replacement policy is to replace the least-recently-used data blocks. Some of the data blocks selected for replacement might contain data changes written by the program. Hence, they must be written back to the persistent storage device before they are replaced. The space these data blocks occupy cannot be reused for data read from the persistent storage device until the changed data has been written out, which delays the execution of the program further. To avoid this scenario, the OS periodically selects data blocks containing data written by the program and schedules a single write operation to flush their data out to the persistent storage device. This allows the subsequent selection of these data blocks by the replacement algorithm without having to wait for the data to be written out. In most cases, however, the program will still be delayed because it needs to wait for the data it requires to be read from the persistent storage device.
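The replacement and write-back behavior described above can be sketched as follows. This is an illustrative sketch only, assuming a least-recently-used policy, a capacity of at least one block, and a Python dictionary standing in for the persistent storage device; the class name LRUBufferCache and its counters are hypothetical.

```python
from collections import OrderedDict


class LRUBufferCache:
    """Toy buffer cache with least-recently-used replacement (illustrative only)."""

    def __init__(self, capacity, backing):
        self.capacity = capacity      # maximum number of cached blocks (assumed >= 1)
        self.backing = backing        # dict: block number -> data, stands in for the device
        self.blocks = OrderedDict()   # cached blocks, least recently used first
        self.dirty = set()            # blocks changed by the program, not yet written out
        self.eviction_writes = 0      # write-backs that had to happen during replacement

    def _make_room(self):
        # Replace least-recently-used blocks until there is space for one more.
        while len(self.blocks) >= self.capacity:
            victim, data = self.blocks.popitem(last=False)
            if victim in self.dirty:
                # A changed block must be written back before its space is reused.
                self.backing[victim] = data
                self.dirty.discard(victim)
                self.eviction_writes += 1

    def read(self, block_no):
        if block_no in self.blocks:
            self.blocks.move_to_end(block_no)   # hit: mark as most recently used
        else:
            self._make_room()                   # miss: fetch from the device
            self.blocks[block_no] = self.backing[block_no]
        return self.blocks[block_no]

    def write(self, block_no, data):
        if block_no in self.blocks:
            self.blocks.move_to_end(block_no)
        else:
            self._make_room()
        self.blocks[block_no] = data
        self.dirty.add(block_no)

    def flush(self):
        # Periodic flush: write all changed blocks out in one pass so that
        # later replacement does not have to wait for them.
        for block_no in sorted(self.dirty):
            self.backing[block_no] = self.blocks[block_no]
        self.dirty.clear()
```

In this sketch, calling flush() before the cache fills means that a later replacement finds no changed blocks and eviction_writes stays at zero, which mirrors the periodic flushing described above.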
It should be noted that the latency of accessing a persistent storage device, such as a hard disk, may be several orders of magnitude greater than that of accessing memory such as the main memory or the buffer cache. With current technology, a hard disk access might have a latency of several milliseconds, while a memory (main memory or buffer cache) access may take only several tens of nanoseconds. To hide the latency of accessing the persistent storage device, the OS can schedule some other task to run on the processor. However, this may not be possible for computer systems dedicated to one main task, such as computationally demanding computer simulations. The latency can be minimized if the access pattern can be predicted, allowing the OS to schedule the transfer of data blocks from the persistent storage device into the buffer cache ahead of time. However, such prediction is not always successful.
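To illustrate why even a small miss rate matters, the average access time can be estimated as a weighted sum of the memory latency and the device latency. The sketch below is a simplified model under assumed figures: 50 nanoseconds for a memory access and 5 milliseconds for a hard disk access are hypothetical values chosen only to be consistent with "several tens of nanoseconds" and "several milliseconds".

```python
def effective_access_ns(hit_ratio, mem_ns=50.0, disk_ns=5_000_000.0):
    """Average access time in nanoseconds for a given buffer cache hit ratio.

    mem_ns and disk_ns are assumed latencies, not measured values.
    """
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    # Hits are served at memory speed; misses pay the device latency.
    return hit_ratio * mem_ns + (1.0 - hit_ratio) * disk_ns
```

Under these assumed figures, a hit ratio of 0.99 still gives an average of about 50 microseconds per access, roughly a thousand times the latency of a memory access, because the rare misses dominate the total.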
Thus, programs run on computer systems have performance problems when accessing large amounts of data stored on persistent storage devices because access to persistent storage devices is much slower than access to primary storage devices such as the main memory. It is not always desirable to add extra memory due to factors such as cost or limitations imposed by the physical system design. The lack of adequate main memory can severely degrade the performance of these programs.
One prior solution for speeding up access to data stored on persistent storage devices involves modifying the OS to improve the performance of the buffer cache. However, such a solution is both costly and time-consuming.
Another prior solution involves adding DRAM or SRAM to the controller for caching data blocks in persistent storage devices like hard disks. Compression techniques are applied to the hard disk cache using hardware or software running on a coprocessor. However, accessing these persistent storage devices is still quite slow.
Thus, there is a need for a system and a method of operation that would improve the performance of computer systems that execute programs requiring access to large amounts of data from persistent storage devices. This has been a long-term need and solutions have long eluded those skilled in the art.