This invention relates to digital data storage and retrieval, and more particularly to page-oriented storing of compressed or uncompressed data in randomly-accessed locations of fixed sizes in partitioned storage devices. The invention is particularly adapted for storing fixed-size pages swapped with main memory in a computer system using a virtual memory management scheme.
A computer implementing a virtual memory system typically employs a certain amount of "physical" memory composed of relatively fast semiconductor RAM devices, along with a much larger amount of "virtual" memory composed of hard disk, where the access time of the hard disk is perhaps several hundred times that of the RAM devices. The physical memory or "main memory" in a virtual memory system is addressed as words, while the virtual "disk memory" is addressed as pages. The virtual memory management scheme uses an operating system such as UNIX™ along with hardware including a translation buffer, as is well known. In multi-tasking operation where more than one program runs at the same time, each running in a time slice of its own, each program appears to have an entire memory space to itself. To make room in the physical memory to run a new program, or to allocate more memory for an already-running program, the memory management mechanism either "swaps" out an entire program (process) to disk memory or "pages" out a portion (page) of an existing process to disk. A typical page size is 4 Kbytes.
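By way of illustration only, the relationship between a byte address and a fixed-size page can be sketched as follows. The sketch assumes the 4 Kbyte page size mentioned above; the function name `page_and_offset` and the sample address are hypothetical and are not taken from any particular operating system.

```python
PAGE_SIZE = 4096  # assumed: the typical 4 Kbyte page size


def page_and_offset(byte_address: int) -> tuple[int, int]:
    """Split a byte address into (page number, offset within that page)."""
    if byte_address < 0:
        raise ValueError("address must be non-negative")
    # divmod returns the quotient (page number) and remainder (offset).
    return divmod(byte_address, PAGE_SIZE)


# Hypothetical sample address, chosen only for illustration.
page, offset = page_and_offset(0x12345)
print(f"page {page}, offset {offset}")
```

The disk memory is then read or written a whole page at a time, using only the page number.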
Transferring data to and from disk memory is very slow compared to the transfer time to main memory, and so "solid state disks" (composed of semiconductor RAMs like the main memory) have been used as a substitute for magnetic disk to improve system performance. This is at a much higher cost per megabyte of storage, however, due to the cost of semiconductor RAMs. Data compression has not been used because of the variable-length record problem as discussed below, i.e., compressed data blocks are of variable size, making random access of compressed "pages" of data impractical.
As explained in application Ser. No. 627,722, now U.S. Pat. No. 5,237,460, data compression encoding algorithms are commonly applied to data which is to be archived or stored at the tertiary storage level. In a hierarchy of data storage, a RAM directly accessed by a CPU is often referred to as the primary level, the hard disk as the secondary level, and tape (backup) as the tertiary level. The characteristic of tertiary level storage as commonly implemented which supports use of compression is that the data access is largely sequential. Data is stored in variable-length units, sequentially, without boundaries or constraints on the number of bytes or words in a storage unit. Thus, if a file or page being stored compresses to some arbitrary number of bytes, this can be stored as such, without unused memory due to fixed sizes of storage units. Compression can be easily applied in any such case where the data is not randomly accessed but instead is sequentially accessed. For this reason, data compression works well for data streaming devices such as magnetic tape. It has been applied to databases holding very large records on magnetic and optical disks.
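The difference between sequential storage and storage in fixed-size units can be made concrete with a small calculation. The following is a minimal sketch under stated assumptions: the compressed sizes and the 4096-byte slot size are hypothetical values chosen for illustration, and the function names are not drawn from the referenced patent.

```python
def sequential_bytes(sizes: list[int]) -> int:
    """Bytes used when variable-length units are stored back to back."""
    return sum(sizes)


def fixed_slot_bytes(sizes: list[int], slot_size: int) -> int:
    """Bytes used when every unit must occupy a whole number of fixed slots."""
    # -(-s // slot_size) is integer ceiling division: slots needed per unit.
    return sum(-(-s // slot_size) * slot_size for s in sizes)


# Hypothetical compressed sizes, in bytes, of four 4 Kbyte pages.
compressed_sizes = [1300, 2750, 900, 4096]
SLOT_SIZE = 4096  # assumed fixed storage unit

used_sequential = sequential_bytes(compressed_sizes)
used_fixed = fixed_slot_bytes(compressed_sizes, SLOT_SIZE)
unused = used_fixed - used_sequential
print(used_sequential, used_fixed, unused)
```

With these assumed values, sequential storage occupies 9046 bytes, while one-slot-per-unit storage occupies 16384 bytes, leaving 7338 bytes unused.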
Data compression is not readily adaptable for use with random access storage devices such as hard disks or solid-state disks, although in many cases it would be desirable to do so. The reason for this lack of use of data compression is that algorithms for data compression produce compressed data units which are of variable size. Blocks of data of fixed size compress to differing sizes depending upon the patterns of characters in the blocks; data with large numbers of repeating patterns compress to a greater degree than a more random distribution of characters. Text files and spreadsheet files compress to smaller units than executable code or graphics files. This problem of variable-length records has made random access of compressed data records, as managed by operating systems and controllers in computer systems, impractical.
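The variable-size behavior described above can be observed with any general-purpose compressor. The following is a minimal sketch that uses Python's standard `zlib` module as a stand-in; it is not the compression algorithm of this invention, and the sample page contents are assumptions chosen to contrast repetitive data with random data.

```python
import os
import zlib

PAGE_SIZE = 4096  # assumed: one 4 Kbyte page


def compressed_size(page: bytes) -> int:
    """Return the number of bytes zlib produces for one fixed-size page."""
    if len(page) != PAGE_SIZE:
        raise ValueError("page must be exactly PAGE_SIZE bytes")
    return len(zlib.compress(page))


# Three pages of identical length but very different content (illustrative).
repetitive_page = (b"the quick brown fox " * 205)[:PAGE_SIZE]
random_page = os.urandom(PAGE_SIZE)
zero_page = bytes(PAGE_SIZE)

sizes = {
    "repetitive": compressed_size(repetitive_page),
    "random": compressed_size(random_page),
    "zeros": compressed_size(zero_page),
}
for name, size in sizes.items():
    print(f"{name:>10}: {PAGE_SIZE} -> {size} bytes")
```

Every input is exactly one page long, yet the outputs differ in length: the repetitive and all-zero pages shrink to a small fraction of a page, while the random page does not shrink at all.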
It is the principal object of this invention to provide a low-cost, high-speed, semiconductor memory device useful in a computer implementing page swapping, as required in virtual memory computer architecture, particularly a device employing data compression to reduce cost, and using error detecting and correcting techniques to increase reliability. Another object is to provide an improved method of storing data in a computer system or the like, and particularly to provide a method of compressing data pages for storage in a storage medium having an access capability for storing data units of fixed size. Another object is to provide an improved data compression arrangement using a random-access type of storage device, where the data units to be stored and recalled are of fixed length and the storage device is accessed in fixed-length increments, where the length is small enough for this to be considered random access of data. A further object is to reduce the amount of unused storage space in a storage device when compressed data units are stored, and therefore increase the storage density. An additional object is to provide an improvement in the cost per byte of storage capacity in a storage device.