1. Field of the Invention
This invention relates to the field of computer system information retrieval, and more specifically to computer system cache memories.
2. Background Art
Loading information into a computer processor memory from a persistent storage device, such as a hard drive, introduces significant delays into computer system operation. Persistent storage caches typically contain information most recently loaded from a persistent storage device. These caches often do not contain requested information thereby requiring the information to be retrieved from a persistent storage device and delaying system operation. The following background information is presented to provide a better understanding of this problem.
A typical computer system consists of a number of modules or components. Computer systems typically include a central processing unit (CPU) such as a microprocessor. The microprocessor is a program-controlled device that obtains, decodes and executes instructions. A computer system also includes storage components for storing system operating software, application program instructions and data. These storage components may be read only memory (ROM), random access memory (RAM), disk or tape storage, or any other suitable storage means.
A computer system typically also includes input/output (I/O) components for connecting external devices to the microprocessor. Special purpose components, such as memory management units or co-processors, may also be part of the computer system.
Computers are used to process data. To allow processing of data, input data must be stored until it is to be used by the central processing unit (CPU). Also, output data must be stored after it has been processed. During some processing operations, the CPU may also require the storage of data temporarily while instructions are executed on that data. In addition, the application program that controls the processing and the operating system under which the program runs must be accessible to the CPU. This information is made available to the CPU by storing it either in a resource known as "main memory," or in a "cache" memory.
The memory component known as main memory is dynamically allocated to users, data, programs or processes. Main memory is typically a silicon-based memory such as a RAM. In many applications, dynamic random access memory (DRAM) is used as the main memory. The operating speed of CPUs often exceeds the speed at which DRAM can be accessed. For example, a CPU may operate at 100 MHz, and therefore have a 10 ns cycle period. By contrast, the DRAM main memory may have a standard access time of 60 ns. To improve the system operating speed, a small, high-speed secondary memory called cache memory is used. Cache memory is typically Static Random Access Memory (SRAM) with a typical access time of 15-25 ns. Because SRAM devices typically require larger device packages and are considerably more expensive than DRAM devices, cache memory is typically between ten and a thousand times smaller than the main memory. Cache memory combines the advantages of fast SRAMs with the low cost of DRAMs to maximize the efficiency of the memory system. Cache memory that is incorporated into the microprocessor chip is called a primary cache. A secondary cache is a cache memory that supports the main memory and is located outside of the microprocessor chip.
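The timing figures above can be checked with a short calculation. The following is a minimal sketch only; the function names `cycle_time_ns` and `cycles_for_access` are hypothetical, and the model assumes that an access occupies a whole number of CPU cycles.

```python
import math


def cycle_time_ns(clock_mhz: float) -> float:
    """Return the CPU cycle period in nanoseconds for a clock given in MHz."""
    return 1000.0 / clock_mhz


def cycles_for_access(access_ns: float, clock_mhz: float) -> int:
    """Return the whole CPU cycles spanned by one memory access.

    Assumption (not from the text): a partial cycle counts as a full one.
    """
    return math.ceil(access_ns / cycle_time_ns(clock_mhz))


if __name__ == "__main__":
    clock = 100.0  # MHz, the example CPU speed from the text
    print("cycle period (ns):", cycle_time_ns(clock))
    print("cycles per 60 ns DRAM access:", cycles_for_access(60, clock))
    print("cycles per 15 ns SRAM access:", cycles_for_access(15, clock))
    print("cycles per 25 ns SRAM access:", cycles_for_access(25, clock))
```

Under these assumptions a 60 ns DRAM access spans six cycles of a 100 MHz CPU, while a 15-25 ns SRAM access spans two to three.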
Where information is stored in a computer system generally depends on the amount of information to be stored, the importance of rapid access to the information, and the amount of time that the information is to be stored. The different storage resources in a computer system are each optimized to serve different storage functions. As discussed above, with regard to the access speed of the different memory resources, cache memory provides the fastest information access, followed by DRAM main memory, then a hard disk, and finally a floppy disk or an optical disk such as a CD ROM. With regard to storage capacity, hard disk drives generally provide the largest storage capacity. For example, a personal computer system may have a two gigabyte ("Gbyte") hard drive, a 1.4 megabyte ("Mbyte") floppy disk, a 32 Mbyte main memory RAM, a 16 kilobyte ("Kbyte") primary cache, and a 256 Kbyte secondary cache. An application program, such as a word processor, may initially be stored on a hard disk. When the word processor program is executed, the primary functional blocks of the program may be loaded into main memory so that the CPU can execute the primary functions of the program rapidly without having to access the hard disk.
A small subset of the program instructions stored in main memory are also stored in the cache memory to maximize the program operating speed. When the CPU reads code, it sends out the corresponding memory address. A cache controller is located between the CPU and main memory. The cache controller determines whether the requested code is available in the cache. When the requested code is found in the cache it is called a cache hit. When the requested code is not found in the cache it is called a cache miss. When the requested code is stored in the cache, the cache controller reads the code from the cache and passes it on to the CPU. The read access is intercepted by the cache and the main memory is not accessed.
When a cache miss occurs, the code is obtained from main memory. This slows down the operation of the program because of the slower access speed of main memory. Typically when a cache miss occurs, in addition to the requested bytes, the entire cache line which includes the requested bytes is read from the main memory into the cache. This is called a cache line fill. Cache lines are typically 16 or 32 bytes in size. One approach to loading code into the cache is to replace the cache line that has not been accessed for the longest time. Another approach to loading the cache is to randomly select the cache line to be replaced.
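The hit, miss, and cache line fill behavior described above can be sketched as a small simulation. This is an illustrative model only, not a description of any particular cache controller: the class name `LineCache`, the fully associative organization, and the use of an ordered dictionary to track recency are all assumptions. It uses the least-recently-accessed replacement approach mentioned above.

```python
from collections import OrderedDict

LINE_SIZE = 16  # bytes per cache line; the text cites 16 or 32 bytes


class LineCache:
    """Toy cache that fills whole lines and evicts the least recently used."""

    def __init__(self, main_memory: bytes, num_lines: int):
        self.memory = main_memory
        self.num_lines = num_lines
        self.lines = OrderedDict()  # line number -> bytes, oldest access first
        self.hits = 0
        self.misses = 0

    def read_byte(self, address: int) -> int:
        line_no, offset = divmod(address, LINE_SIZE)
        if line_no in self.lines:
            # Cache hit: main memory is not accessed.
            self.hits += 1
            self.lines.move_to_end(line_no)
        else:
            # Cache miss: make room if needed, then fill the entire line.
            self.misses += 1
            if len(self.lines) >= self.num_lines:
                self.lines.popitem(last=False)  # drop least recently used line
            start = line_no * LINE_SIZE
            self.lines[line_no] = self.memory[start:start + LINE_SIZE]
        return self.lines[line_no][offset]


if __name__ == "__main__":
    cache = LineCache(bytes(range(64)), num_lines=2)
    for addr in (5, 6, 20, 5, 40, 20):
        cache.read_byte(addr)
    print("hits:", cache.hits, "misses:", cache.misses)
```

Replacing the `popitem(last=False)` call with removal of a randomly chosen key would model the random replacement approach instead.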
When an application program is loaded it is transferred from persistent storage, such as a hard drive, to main memory. Microprocessors with the Windows® 95 operating system use 32-bit physical addresses to address physical memory. Memory addresses that application programs use are called "virtual" addresses. Virtual addresses are translated to physical addresses through a "page table."
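The translation step can be illustrated with a simplified sketch. The single-level dictionary used as the page table, the function name `translate`, and the example mappings are all hypothetical; an actual page table has a more elaborate structure. A page size of 4 Kbytes is assumed here.

```python
PAGE_SIZE = 4096  # bytes; assumed page size for this sketch


def translate(virtual_address: int, page_table: dict) -> int:
    """Map a virtual address to a physical address.

    page_table maps a virtual page number to a physical frame number.
    Raises KeyError if the virtual page has no mapping.
    """
    page, offset = divmod(virtual_address, PAGE_SIZE)
    frame = page_table[page]
    return frame * PAGE_SIZE + offset


if __name__ == "__main__":
    # Hypothetical mappings: virtual page 0 -> frame 7, virtual page 1 -> frame 3.
    table = {0: 7, 1: 3}
    print(hex(translate(0x1004, table)))
    print(hex(translate(0x0010, table)))
```

The offset within the page is unchanged by translation; only the page number is replaced by a frame number.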
Memory shortages can occur if more memory is needed than the physical memory available. A paging system is used to, in effect, extend the amount of physical memory available on a system. A paging system can be used to free allocated memory, thereby making it available for use by another memory requester. The contents of allocated memory are typically written to a storage medium (e.g., a hard disk drive). This process is referred to as "swapping out" the contents of memory. Once the contents of allocated memory are "swapped out," the memory can be freed for allocation to another requester. Contents that were "swapped out" can be reloaded into memory from storage as they are needed.
Each page of memory includes control bits. One bit indicates whether a particular page has been accessed; another indicates whether the page has been written to; a third bit indicates whether the page has been swapped out to disk and must be reloaded into memory. The operating system uses the control bits to determine whether a page can be swapped to a disk file to obtain more free physical memory. Windows® 95 uses a least-recently-used (LRU) algorithm to determine what pages to swap out to disk.
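The role of the control bits in selecting a page to swap out can be sketched as follows. This is a simplified illustration, not the operating system's actual algorithm: the `Page` record, the explicit `last_access` counter, and the function names are assumptions made for the example. The text states only that an LRU algorithm and per-page control bits are used.

```python
from dataclasses import dataclass


@dataclass
class Page:
    number: int
    accessed: bool = False     # page has been accessed
    dirty: bool = False        # page has been written to
    swapped_out: bool = False  # page is on disk and must be reloaded
    last_access: int = 0       # hypothetical tick of most recent access


def touch(page: Page, tick: int, write: bool = False) -> None:
    """Record an access to a resident page at the given tick."""
    page.accessed = True
    page.dirty = page.dirty or write
    page.last_access = tick


def swap_out_lru(pages: list) -> tuple:
    """Swap out the least recently used resident page.

    Returns (page number, needs_write). needs_write is True when the page
    was written to, so its contents must be saved to disk before the memory
    is freed. Raises ValueError if no page is resident.
    """
    resident = [p for p in pages if not p.swapped_out]
    victim = min(resident, key=lambda p: p.last_access)
    needs_write = victim.dirty
    victim.swapped_out = True
    victim.accessed = False
    victim.dirty = False
    return victim.number, needs_write


if __name__ == "__main__":
    pages = [Page(0), Page(1), Page(2)]
    touch(pages[0], tick=1)
    touch(pages[1], tick=2, write=True)
    touch(pages[2], tick=3)
    touch(pages[0], tick=4)
    print(swap_out_lru(pages))
    print(swap_out_lru(pages))
```

In this run page 1 is chosen first because it has gone longest without access, and it must be written to disk because it was modified.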
In the Windows® 95 operating system, application programs get loaded into memory in 4 Kbyte pages. The use of relatively small independent pages provides the multi-tasking advantages described above; however, the page system typically reduces the efficiency of the memory loading process. The pages are generally loaded in an essentially arbitrary order. Therefore, the pages are not read from disk in an optimal way. Furthermore, the process of loading files from a hard disk is slow compared to accessing files stored in main memory. One primary factor that contributes to the relatively slow information retrieval speed for a hard disk is the time required to physically move the head relative to the disk to read the requested information. This disk seeking time is compounded for files that are composed of sections stored at different locations on the disk. This is often the case because as a hard disk begins to fill up with files, the size of the free contiguous disk sections decreases. A new large file may therefore be stored in numerous small fragments. As files are repeatedly written to and deleted from the disk, the remaining free space on the disk becomes highly fragmented. Highly fragmented files combined with the already slow hard disk information retrieval speed can create significant delays in computer system operation.
A major problem that significantly reduces performance is that memory pages or disk sectors are loaded from the hard disk in the order in which they are executed, not the order in which they are stored on the hard disk. Thus, even if adequate contiguous storage is available on the hard disk and files are stored in consecutive order in a contiguous area of the hard disk, the files may be accessed in a different order, thereby requiring the read head of the hard disk to perform many seeks to read the data stored on the hard disk. Each seek typically requires the head to be moved to the track or cylinder in which the desired sector is located, to be precisely positioned over the track or cylinder, to locate the desired sector within the track or cylinder, to wait for the desired sector to pass by the read head, and to read the data stored in the desired sector. Thus, the loading of memory pages or disk sectors from the hard disk in an order different from the order in which they were stored significantly reduces system performance.
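The cost of reading sectors in execution order rather than storage order can be illustrated with a simple head-travel model. This is a sketch under stated assumptions: the function name `head_travel` and the track numbers are hypothetical, and seek cost is modeled only as the number of tracks crossed, ignoring settling time and rotational delay.

```python
def head_travel(track_sequence, start_track: int = 0) -> int:
    """Return the total number of tracks the head crosses for a read sequence."""
    travel = 0
    position = start_track
    for track in track_sequence:
        travel += abs(track - position)
        position = track
    return travel


if __name__ == "__main__":
    # Hypothetical tracks holding five pages, listed in the order executed.
    execution_order = [40, 5, 32, 10, 38]
    storage_order = sorted(execution_order)
    print("execution order travel:", head_travel(execution_order))
    print("storage order travel:", head_travel(storage_order))
```

For these example tracks, reading in execution order moves the head across 152 tracks, compared with 40 tracks when the same pages are read in storage order.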
To reduce the delays caused by slow hard disk information retrieval speeds many operating systems include a software disk cache. The software disk cache is a section of main memory that is allocated to store information retrieved from the hard disk. The relationship between the software disk cache and the hard disk is conceptually similar to the relationship between the main memory and the secondary cache. In a personal computer system the software disk cache typically ranges in size from 512 Kbytes to 16 Mbytes. Generally the software disk cache stores the information most recently loaded from the hard disk. Some prior art cache loading systems load the software disk cache with information from the hard disk following a simple sequential access pattern. For operating systems that do not provide a software disk cache, programs, such as SmartDrive, can be purchased to add this feature. SmartDrive essentially stores executable code after a program has been launched. SmartDrive does not anticipate what code will be required by a program before the program is launched.
The prior art disk cache loading systems that load the software disk cache with information from the hard disk using a simple sequential access pattern are generally ineffective with files accessed in a non-contiguous manner because the software disk cache does not contain the non-contiguous segments when they are requested. Prior art software disk cache systems that merely retain the most recently loaded information from the hard disk often fail to contain a requested memory page the first time the page is requested. As a result significant delays due to slow hard disk information retrieval speeds persist in prior art software disk cache systems. Thus, an improved method of loading a software disk cache is needed to increase the software disk cache hit rate and, therefore, to reduce system delays due to slow hard disk information retrieval speeds.
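The limitation described above can be demonstrated with a small model of a software disk cache that loads sequentially following sectors on each miss. This is an illustrative sketch, not a reproduction of any particular prior art product: the function name `hit_rate`, the read-ahead depth, the cache capacity, and the request sequences are all assumptions.

```python
from collections import OrderedDict


def hit_rate(requests, read_ahead: int, capacity: int) -> float:
    """Return the fraction of sector requests served from the cache.

    On a miss, the requested sector and the next `read_ahead` sectors are
    loaded. When the cache exceeds `capacity` sectors, the least recently
    used sector is discarded.
    """
    cache = OrderedDict()  # sector number -> True, oldest first
    hits = 0
    for sector in requests:
        if sector in cache:
            hits += 1
            cache.move_to_end(sector)
            continue
        for s in range(sector, sector + read_ahead + 1):
            cache[s] = True
            cache.move_to_end(s)
            if len(cache) > capacity:
                cache.popitem(last=False)
    return hits / len(requests)


if __name__ == "__main__":
    contiguous = list(range(8))
    scattered = [0, 50, 7, 93, 21, 64, 12, 80]
    print("contiguous:", hit_rate(contiguous, read_ahead=3, capacity=16))
    print("scattered:", hit_rate(scattered, read_ahead=3, capacity=16))
```

With these example inputs the contiguous sequence is served from the cache three times out of four, while none of the scattered requests is found in the cache.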