1. Field of the Invention
This invention relates to electronic digital data processing systems and, more particularly, to electronic digital data processing systems which include a cache memory as well as a main memory.
2. Description of Related Art
Improvements in data processing systems have generally been directed at the reduction of either the average time required to execute a given instruction or the cost of the equipment required to execute such an instruction. One design tradeoff which has typically been made is that of cost versus speed for units of memory for the storage of data. For example, tape memory is traditionally slower and less expensive than disk memory. Disk memory in turn is available in several types with the selection of any one type over another involving a cost/speed tradeoff. Disk memory is slower but less expensive than solid-state memory which itself is available in several types, the selection of which again involves a cost/speed tradeoff. Thus, it continues to be a need of the art to provide cheaper, faster memories or, failing that, to improve the efficiency of presently existing memory types. The present invention relates to an improvement of the second type. In particular, the invention involves apparatus and methods of operation for reducing the average time necessary for a host central processing unit (CPU) having an associated cache memory and a main memory to obtain stored data from either memory.
By way of background, it should be appreciated that computer systems are generally provided with more than one type of memory. Recognizing that the cost of a single fast memory would be prohibitive, computer designers have heretofore employed a variety of devices to hold data and instructions, the repository for each piece of information being selected based upon how urgently the information might be needed by the CPU. That is, in general, fast but expensive memories are used to store information the CPU might need immediately, and slower but less expensive devices are used to retain information for future use.
A multitude of memory and storage devices have heretofore been used in computer systems. Long-term storage is generally effected using disk and tape storage. Disk and tape storage devices are presently the slowest of all of the memory and storage devices in common use, and they are generally used to hold data and programs that are not in actual use by the processor. Moving information stored on disks and tape into the main memory requires a relatively long period of time, but this slowness is tolerable since the movement of data and instructions from disk and tape storage is infrequent and can be done without the full attention of the CPU.
Another memory device is a read-only memory or ROM. A ROM, with typical access times between 50 and 200 nanoseconds, retains its contents when the computer is turned off. The ROM typically holds start-up programs that prepare the machine for use.
Another memory device, most commonly used for a system main memory, is random access memory or RAM, which is employed for storage of data and program instructions brought from disk or tape for immediate use by the CPU. The main memory usually comprises a number of dynamic RAM ("DRAM") chips. The processor can retrieve the contents of these DRAMs in about 100 nanoseconds, placing this type of memory alongside ROM in speed.
Yet another type of memory device is cache memory. Cache memory usually comprises a number of static RAM ("SRAM") chips. Cache memory is up to ten times faster than main memory and is designed to hold the operating instructions and data most likely to be needed next by the CPU, thereby speeding computer operation.
Finally, small amounts of memory within the CPU are called CPU memory or registers. Made of static RAM circuits optimized for speed, data registers within the processors are the fastest memory of all. A program register stores the location in memory of the next program instruction while an instruction register holds the instruction being executed and a general purpose register briefly stores data during processing.
Based upon the foregoing, it should be appreciated that it is known to those skilled in the art to include a cache memory configuration in a computer system to provide a place for fast local storage of frequently accessed data. A cache system intercepts each one of the microprocessor memory references to see if the address of the required data resides in the cache. If the data does reside in the cache (a "hit"), it is immediately returned to the microprocessor without incurring the wait states necessary to access main system memory. If the data does not reside in the cache (a "miss"), the memory address reference is forwarded to the main memory controller and the data is retrieved from main memory. Since cache hits are serviced locally, a processor operating out of its local cache memory has a much lower "bus utilization", which reduces system bus bandwidth requirements, making more bus bandwidth available to other bus masters. This is significant because, as is well known to those skilled in the art, the bus in the computer, that is, the communications channel between the CPU and the system's memory and storage devices, is a principal bottleneck. Virtually all instructions and all data to be processed must travel this route at least once. To maximize system performance, it is essential that the bus be used efficiently.
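By way of illustration only, the hit and miss behavior described above may be sketched in software as follows. The sketch assumes a direct-mapped organization with four lines, which the foregoing description does not specify, and the names LookThroughCache, read, tags, and data are hypothetical labels chosen for this example rather than elements of any particular prior art system.

```python
class LookThroughCache:
    """Toy model of a cache that intercepts every memory reference.

    Assumption: direct-mapped placement (address modulo number of lines).
    """

    def __init__(self, main_memory, num_lines=4):
        self.main_memory = main_memory    # backing store, indexed by address
        self.num_lines = num_lines
        self.tags = [None] * num_lines    # address currently held in each line
        self.data = [None] * num_lines    # cached copy of the data for that address
        self.hits = 0
        self.misses = 0

    def read(self, address):
        line = address % self.num_lines
        if self.tags[line] == address:
            # Hit: the data is returned locally, with no main memory access.
            self.hits += 1
            return self.data[line]
        # Miss: the reference is forwarded to main memory, and the
        # retrieved data is kept in the cache for later references.
        self.misses += 1
        value = self.main_memory[address]
        self.tags[line] = address
        self.data[line] = value
        return value


memory = [10 * address for address in range(16)]
cache = LookThroughCache(memory)
values = [cache.read(address) for address in (3, 3, 7, 3)]
print(values, cache.hits, cache.misses)
```

In this sketch the second reference to address 3 is serviced from the cache, while the reference to address 7 maps to the same line and displaces it, so the final reference to address 3 must again be forwarded to main memory.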
As should be fully appreciated by those skilled in the art, a cache controller is added to a computer system in such a way as to separate the microprocessor bus into two distinct buses: the actual microprocessor bus and the cache controller local bus. The cache controller local bus is designed to look like the front end of a microprocessor by providing a cache controller local bus equivalent for each appropriate microprocessor signal. The system interconnects to this "microprocessor-like" front end just as it would to an actual microprocessor. The microprocessor simply sees a fast system bus, and the system sees a microprocessor front end with a low bus bandwidth requirement. The cache subsystem is transparent to both. Transparency, in the data communications field, refers to the capability of a communications medium to pass, within specified limits, a range of signals having one or more defined properties. It should be noted that in such systems the cache controller local bus is not simply a buffered version of the microprocessor bus, but rather, is distinct from, and able to operate in parallel with, the microprocessor bus. Thus, other bus masters, that is, supervisory systems of one kind or another residing on either the cache controller local bus or the system bus, are free to manage the other system resources while the microprocessor operates out of its cache.
As previously stated, a cache memory system intercepts memory references and forwards them to system memory only if they "miss" in the cache. Many prior art U.S. patents are directed to various aspects of cache memories and to methods of accessing memories which include a cache memory section, including U.S. Pat. No. 4,794,521 to Ziegler et al., U.S. Pat. No. 4,646,233 to Weatherford et al., U.S. Pat. No. 4,780,808 to Moreno et al., U.S. Pat. No. 4,783,736 to Ziegler et al., U.S. Pat. No. 4,195,342 to Joyce et al., U.S. Pat. No. 4,370,710 to Kroft, U.S. Pat. No. 4,476,526 to Dodd, U.S. Pat. No. 4,070,706 to Scheuneman, U.S. Pat. No. 4,669,043 to Kaplinsky, U.S. Pat. No. 4,811,203 to Hamstra, U.S. Pat. No. 4,785,398 to Joyce et al., U.S. Pat. No. 4,189,770 to Gannon et al., and U.S. Pat. No. 3,896,419 to Lange et al. The last-mentioned patent, U.S. Pat. No. 3,896,419 to Lange et al., entitled "Cache Memory Store in a Processor of a Data Processing System", discusses the "parallel" operation of a cache store and other requests for data information from the main memory. The patent specifically teaches, however, checking the cache store while signals are "readied" for the backup memory store. Further, Lange et al. specifically teach making the cache directory, the cache store, and the control logic therefor part of the central processor. With this type of structure, cache checking is completed before the regular main memory cycle is started, so that if a "hit" is made in the cache, the main memory cycle never leaves the processor. This type of system is wholly different from systems in which a main memory access signal is actually sent out on a bus "in parallel" with a cache memory access signal.
Based upon the foregoing, it should be appreciated that in computer systems heretofore constructed which include a cache, when a memory reference occurs the access is "looked up" in the cache, and only if the reference is not found (or "misses") in the cache is it turned into a signal sent out on a bus to system memory. This causes at least two problems. First, cache misses incur a cache look-up latency, so that an access cycle which misses in the cache takes at least one clock period longer than the same access cycle in a system without a cache. Moreover, it should be noted that with a poor hit rate a system with a cache could run slower than a system without a cache. Second, cache controller complexity and pin requirements are increased because the cache controller has to recreate a processor bus for the memory controller. Such complexity also slows operation, since re-creating the bus adds further latency to the memory access on cache misses.
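The first of these problems can be illustrated with a simple average access time model. The following is a minimal sketch under stated assumptions: the cycle counts (one clock period for a cache hit, four for a main memory access, one for the look-up latency added to each miss) are illustrative values not taken from any particular system, and the function names are hypothetical.

```python
def average_access_cycles(hit_rate, cache_cycles, memory_cycles, lookup_penalty):
    """Average clock periods per access when misses pay the look-up first."""
    hit_cost = cache_cycles
    miss_cost = lookup_penalty + memory_cycles   # look-up completes before memory starts
    return hit_rate * hit_cost + (1.0 - hit_rate) * miss_cost


def break_even_hit_rate(cache_cycles, memory_cycles, lookup_penalty):
    """Hit rate at which the cached system merely matches a cacheless one."""
    return lookup_penalty / (lookup_penalty + memory_cycles - cache_cycles)


# Illustrative (assumed) cycle counts, not measurements.
CACHE, MEMORY, LOOKUP = 1, 4, 1

no_cache = MEMORY                                   # every access goes to main memory
poor = average_access_cycles(0.10, CACHE, MEMORY, LOOKUP)
good = average_access_cycles(0.90, CACHE, MEMORY, LOOKUP)
threshold = break_even_hit_rate(CACHE, MEMORY, LOOKUP)

print(no_cache, poor, good, threshold)
```

Under these assumed values, any hit rate below the break-even threshold yields an average access time longer than that of a system with no cache at all, because each miss costs the look-up latency in addition to the full main memory access.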