1. Field of the Invention
This invention relates to more efficient execution of software in computer systems. More particularly, the present invention relates to computer systems for executing software stored in cache memory subsystems. Still more particularly, the present invention relates to cache subsystems that load real-time event processing software for more efficient execution.
2. Description of the Relevant Art
Software to be executed by a microprocessor typically is stored on a floppy or fixed disk medium. Once a request is made by a user to execute a program, the program is loaded into the computer's system memory which usually comprises dynamic random access memory devices (DRAM). The processor then executes the code by fetching an instruction from system memory, receiving the instruction over a system bus, performing the function dictated by the instruction, fetching the next instruction, and so on.
Generally, whenever system memory is accessed, there is a potential for delay between the time the request to memory is made (either to read or write data) and the time when the memory access is completed. This delay is referred to as "latency" and can limit the performance of the computer.
There are many sources of latency. For example, operational constraints with respect to DRAM devices cause latency. Specifically, the speed of memory circuits is based upon two timing parameters. The first parameter is memory access time, which is the minimum time required by the memory circuit to set up a memory address and produce or capture data on or from the data bus. The second parameter is memory cycle time, which is the minimum time required between two consecutive accesses to a memory circuit. For DRAM circuits, the cycle time typically is approximately twice the access time. DRAM circuits today generally have access times in the approximate range of 60-100 nanoseconds, with cycle times of 120-200 nanoseconds. The extra time required for consecutive memory accesses in a DRAM circuit is necessary because the internal memory circuits require additional time to recharge (or "precharge") to accurately produce data signals. Thus, even a processor running as slowly as 10 MHz cannot execute two memory accesses in immediate succession (i.e., with adjacent clock pulses) to the same 100 nanosecond DRAM chip, despite the fact that a clock pulse in such a microprocessor is generated every 100 nanoseconds. A DRAM chip requires time to stabilize before the next address in that chip can be accessed. Consequently, in such a situation the processor must wait for one or more additional clock cycles before it can again access data in the DRAM circuit. Typically, a memory controller unit ("MCU") is provided as part of the computer system to regulate accesses to the DRAM main memory. Latency caused by long memory cycle times relative to processor speeds has become a particularly acute problem today, as processor speeds in excess of 100 MHz are commonplace. Instead of waiting one or two clock cycles to again access a 100 nanosecond DRAM device, today's "486" and "Pentium" processors must wait 20 or more clock cycles.
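By way of illustration only, the arithmetic behind the foregoing figures may be sketched as follows. The function name `clocks_per_dram_cycle` and the use of Python are assumptions made for this example and form no part of any described system. The sketch assumes a 100 nanosecond DRAM device whose cycle time is twice its access time, i.e., 200 nanoseconds.

```python
import math

def clocks_per_dram_cycle(freq_mhz: int, cycle_time_ns: int) -> int:
    """Return the number of whole processor clocks spanned by one DRAM cycle time.

    One clock period is 1000 / freq_mhz nanoseconds, so the number of
    clocks in a DRAM cycle is cycle_time_ns * freq_mhz / 1000, rounded up.
    """
    return math.ceil(cycle_time_ns * freq_mhz / 1000)

# Assumed example values: a 200 ns DRAM cycle time at two processor speeds.
slow = clocks_per_dram_cycle(10, 200)    # 10 MHz processor, 100 ns clock period
fast = clocks_per_dram_cycle(100, 200)   # 100 MHz processor, 10 ns clock period
print(slow, fast)
```

Under these assumptions, the 10 MHz processor must allow two clock periods between consecutive accesses to the same device, while the 100 MHz processor must allow twenty.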
In addition to the delays caused by access and cycle times, DRAM circuits also require periodic refresh cycles to protect the integrity of the stored data. These cycles consume approximately 5 to 10% of the time available for memory accesses, and typically are required approximately every 4 milliseconds. If the DRAM circuit is not refreshed periodically, the data stored in the DRAM circuit will be lost. Thus, memory accesses may be halted while a refresh cycle is performed.
Further, most, if not all, computer architectures today include multiple bus masters. Any one of a number of bus masters may obtain ownership or control of the system bus and thereby access system memory. Normally, the grant of bus ownership to one bus master device from among competing requests is based on a predetermined hierarchy. In a hierarchy scheme, one bus master device may have a higher position in the hierarchy than another bus master device. Accordingly, the former device would be granted ownership of the system bus if the two devices contemporaneously sought control of the bus. Although hierarchy schemes are valuable for resolving conflicts between multiple bus master devices requesting control of the bus to access system memory, such schemes force a bus master that must yield to a higher priority bus master to wait while the other device executes its memory transaction, thereby causing latency with respect to the waiting device.
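A fixed-hierarchy arbitration of the kind described above may be sketched, purely by way of example, as follows. The function `arbitrate` and the bus master names `dma_controller` and `cpu` are hypothetical and are chosen only for illustration; actual arbiters are implemented in hardware and may differ substantially.

```python
def arbitrate(requests, priority_order):
    """Grant the bus to the highest-priority requesting master.

    requests:       set of master names currently requesting the bus
    priority_order: list of all master names, highest priority first
    Returns (granted_master, waiting_masters); granted_master is None
    when no master is requesting the bus.
    """
    for master in priority_order:
        if master in requests:
            # Every other requester must wait for this transaction to finish.
            waiting = [m for m in priority_order
                       if m in requests and m != master]
            return master, waiting
    return None, []

# Hypothetical hierarchy: the DMA controller outranks the CPU.
hierarchy = ["dma_controller", "cpu"]
granted, waiting = arbitrate({"cpu", "dma_controller"}, hierarchy)
print(granted, waiting)
```

In this sketch the lower-priority master appears in the waiting list whenever a higher-priority master requests the bus at the same time, which is the source of the latency noted above.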
The latency associated with memory accesses may be different and unpredictable from one memory access to the next. For many software applications, unpredictable latency is not a significant problem. However, for code sequences, especially those related to real-time event processing such as music synthesis that implements digital signal processing, unpredictable latency can greatly interfere with proper performance and produce undesirable results.
To expedite memory transfers, most computer systems today incorporate cache memory subsystems. Cache memory is a high-speed memory unit interposed between a slower system DRAM memory and a processor. Cache memory devices usually have speeds comparable to the speed of the processor and are much faster than system DRAM memory. The cache concept anticipates the likely reuse by the microprocessor of selected data in system memory by storing a copy of the selected data in the cache memory. When a read request is initiated by the processor for data, a cache controller determines whether the requested information resides in the cache memory. If the information is not in the cache, then the system memory is accessed for the data and a copy of the data may be written to the cache for possible subsequent use. If, however, the information resides in the cache, it is retrieved from the cache and given to the processor. Retrieving data from cache advantageously is faster than retrieving data from system memory, involving both less latency and more predictable latency.
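The read path just described may be illustrated with the following minimal sketch. The class `SimpleCache`, its dictionary-based storage, and the hit and miss counters are assumptions made for this example only; the sketch ignores cache capacity, line size, and write handling.

```python
class SimpleCache:
    """Toy model of a cache placed between a processor and system memory."""

    def __init__(self, system_memory):
        self.system_memory = system_memory  # dict standing in for slow DRAM
        self.lines = {}                     # address -> cached copy of the data
        self.hits = 0
        self.misses = 0

    def read(self, address):
        """Return the data at address, using the cached copy when present."""
        if address in self.lines:
            self.hits += 1                  # fast path: data supplied by the cache
            return self.lines[address]
        self.misses += 1                    # slow path: system memory is accessed
        value = self.system_memory[address]
        self.lines[address] = value         # keep a copy for possible reuse
        return value

memory = {0x10: 42}
cache = SimpleCache(memory)
first = cache.read(0x10)    # miss: fetched from system memory, then cached
second = cache.read(0x10)   # hit: served from the cache
print(first, second, cache.hits, cache.misses)
```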
Code, as well as data, is subject to being stored in cache. Cache memory size, however, is generally much smaller than system memory and is used only to store the most recently used data or code anticipating the reuse of that information. Because the cache is relatively small and is used only for storing the most recently accessed code or data, old code or data (i.e., less recently used code or data in cache) is at risk of being overwritten by new code or data. Although replacement generally causes no problem for many types of data and code, replacement of real-time code can detrimentally affect the predictability of the latency of accesses to the real-time code or data and thus may cause improper or poor multimedia performance.
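The replacement hazard described above may be illustrated with the following sketch, which assumes a least-recently-used replacement policy and a cache that holds only two lines. The class `LRUCache` and the tags `rt_code`, `a`, and `b` are hypothetical; actual caches may use other replacement policies and organizations.

```python
from collections import OrderedDict

class LRUCache:
    """Toy cache that overwrites the least recently used line when full."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # oldest entry first, newest entry last

    def access(self, tag):
        """Return True on a hit; on a miss, load the line and return False."""
        if tag in self.lines:
            self.lines.move_to_end(tag)      # mark as most recently used
            return True
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)   # overwrite the least recently used line
        self.lines[tag] = True
        return False

# Real-time code is loaded, then displaced by two unrelated accesses.
cache = LRUCache(capacity=2)
results = [cache.access(tag) for tag in ["rt_code", "a", "b", "rt_code"]]
print(results)
```

In this sketch the final access to `rt_code` misses because the intervening accesses to `a` and `b` displaced it, so the real-time code must again be fetched from system memory with its attendant unpredictable latency.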