This invention relates generally to data processing systems having a main memory and a high speed cache memory and is particularly directed to increasing the operating speed of a data processing system having main and cache memories by reducing the time required for main memory access when the required instruction is not in the cache memory.
A cache memory is an expensive, very high speed buffer memory which holds a copy of the most commonly (with respect to time) accessed program and data elements from a main memory in the data processing system. The cache memory provides a buffer storage for the portion of the main memory that is currently being used. It also contains a translation of the main memory address into this buffer storage. For this reason, cache memories are said to exploit "temporal" locality. This is a primary benefit of cache memories since previous data processing system studies have shown that operating programs generally exhibit a great deal of temporal locality. An example of a commonly used program structure with temporal locality is a loop. Because cache memories take advantage of temporal locality in this manner, a small amount of cache memory, relative to the size of the main memory, can be used to dramatically reduce the average access time from the data processing system's central processor unit (CPU) to the main memory while only slightly impacting system cost.
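The effect of a loop on cache behavior can be illustrated with a minimal sketch. The cache model below (fully associative, least-recently-used replacement), the function name `hit_ratio`, and the sizes chosen are assumptions made purely for illustration and are not taken from any particular cache design.

```python
from collections import OrderedDict

def hit_ratio(addresses, cache_lines):
    """Fraction of accesses served by a small cache.

    Assumed model: fully associative cache with LRU replacement,
    one address per cache line.
    """
    cache = OrderedDict()  # keys are cached addresses, oldest first
    hits = 0
    for addr in addresses:
        if addr in cache:
            hits += 1
            cache.move_to_end(addr)        # mark as most recently used
        else:
            if len(cache) >= cache_lines:
                cache.popitem(last=False)  # evict least recently used
            cache[addr] = None             # update the cache after the miss
    return hits / len(addresses)

# Hypothetical workload: a 16-instruction loop body executed 100 times.
loop_trace = [addr for _ in range(100) for addr in range(16)]
loop_ratio = hit_ratio(loop_trace, cache_lines=32)

# Contrast: 1,600 accesses that never revisit an address.
straight_ratio = hit_ratio(list(range(1600)), cache_lines=32)
```

Under these assumed sizes, only the first pass through the loop misses, so 1,584 of the 1,600 loop accesses are served from the cache, whereas the non-repeating trace never hits.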
When the desired data is not found in the cache memory, the CPU must then access the main memory to retrieve the required data and store this current data within the cache memory as an update to maintain the desired temporal locality. This operation requires accessing the main memory, which typically operates at a characteristically slower speed. For example, while the CPU may be capable of accessing the cache memory in 35-70 nanoseconds, dynamic RAM access times are generally 70-120 nanoseconds. This problem is particularly evident at such critical operating times as power-up, when the cache memory is "empty", or when it is necessary to switch from one task to another during operation of the data processing system. At such critical times, the contents of the cache memory are not relevant to the current program accesses and thus it is possible that a large number of slow main memory accesses will be necessary until the cache memory is sufficiently updated. Since the cache memory must operate at the slower main memory speeds during these updates, its high speed effectiveness is essentially lost during these periods.
One prior art attempt to solve this problem, which has been used in mainframe computers, is to increase the block size of the cache memory. The block size is the number of bytes actually read into the cache memory during the update process. Since this update of the cache memory is generally larger than the current access requirement, the cache memory will be updated with bytes which most likely will be accessed in the near future due to the operating characteristic of spatial locality exhibited by most computer programs. This characteristic relates to the tendency to execute instructions which are stored at addresses that are physically or spatially close together, as is the case in most main memories. However, this approach has some fundamental limitations. For example, as the block size increases, the size of the bus between the cache memory and the main memory must proportionately increase if the update is to occur in a single main memory access. This limitation lies primarily in the physical space required by additional buses as well as in the increased system cost for the buses, buffers and additional control logic. Another fundamental limitation of this approach, which is referred to as "fragmentation", is that spatial locality will increase performance of the cache memory only up to a point as the block size is increased. After that point is reached, additional increases in the block size will actually result in a decrease in cache memory performance because unneeded code will be read and stored in the cache memory during an update. This unneeded code will displace code already residing within the cache memory which will be needed for an access in the near future.
A problem is encountered when the desired data is not found in the cache memory, necessitating accessing of the main memory. In past designs, this delayed access to the main memory has placed a limitation on system operating speed because the cache memory must first be accessed, followed by a determination that the required data or instructions are not in the cache, before the main memory is accessed. Thus, cache memory misses actually result in an access time of greater duration than if only the main memory had initially been accessed. This "longer" memory access occurs only when the required data or instructions are not in the cache memory, which is the case for on the order of 10% to 30% of all memory accesses. Thus, in prior art memory accessing schemes, although the use of a cache memory reduces the average time for memory access, actual access to the slower main memory has been relatively slow and certainly less than optimal.
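The cost of this sequential arrangement can be sketched with a simple timing model. The function names and the specific figures used below (a 50 nanosecond cache access, a 100 nanosecond main memory access, and a 20% miss ratio) are illustrative assumptions chosen from within the ranges stated above, and the model ignores any additional control or decision overhead.

```python
def sequential_read_time(hit, t_cache_ns, t_main_ns):
    """Prior-art style read: the cache is probed first, and the main
    memory access is started only after the miss has been determined."""
    if hit:
        return t_cache_ns
    return t_cache_ns + t_main_ns  # cache probe time is added to every miss

def average_read_time(miss_ratio, t_cache_ns, t_main_ns):
    """Average read time weighted by the hit and miss probabilities."""
    hit_time = sequential_read_time(True, t_cache_ns, t_main_ns)
    miss_time = sequential_read_time(False, t_cache_ns, t_main_ns)
    return (1 - miss_ratio) * hit_time + miss_ratio * miss_time

# Assumed example figures (not measured values).
T_CACHE_NS = 50
T_MAIN_NS = 100
MISS_RATIO = 0.2

miss_time = sequential_read_time(False, T_CACHE_NS, T_MAIN_NS)
avg_time = average_read_time(MISS_RATIO, T_CACHE_NS, T_MAIN_NS)
```

With these assumed figures, a miss costs 150 nanoseconds, which is longer than the 100 nanoseconds a direct main memory access would have taken, even though the average read time of about 70 nanoseconds is still better than main memory alone.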
The present invention addresses and overcomes the aforementioned limitations of the prior art by simultaneously accessing the cache and main memories during a memory read cycle. Cache memory and main memory dynamic RAM access times are matched in such a way that very fast cache memory accesses occur during the time that the dynamic RAM would be unable to start an access, e.g., during its precharge time. Memory access time is thus more optimally employed, since cache memory accesses occur more transparently relative to the main memory.
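The benefit of starting both accesses together can be sketched with a simple timing model. The function names, the example figures (50 nanosecond cache access, 100 nanosecond main memory access, 20% miss ratio), and the assumption that a cache hit simply causes the pending main memory access to be abandoned are all hypothetical and serve only to illustrate the principle. The model does not represent the precharge timing itself.

```python
def parallel_read_time(hit, t_cache_ns, t_main_ns):
    """Read in which cache and main memory accesses start at the same time.

    Assumption: on a hit the data is taken from the cache and the
    main memory access already under way is abandoned.
    """
    if hit:
        return t_cache_ns
    # On a miss the main memory access is already in progress, so the
    # cache probe adds nothing unless it is the slower of the two.
    return max(t_cache_ns, t_main_ns)

def average_parallel_read_time(miss_ratio, t_cache_ns, t_main_ns):
    """Average read time weighted by the hit and miss probabilities."""
    hit_time = parallel_read_time(True, t_cache_ns, t_main_ns)
    miss_time = parallel_read_time(False, t_cache_ns, t_main_ns)
    return (1 - miss_ratio) * hit_time + miss_ratio * miss_time

# Assumed example figures (not measured values).
parallel_miss = parallel_read_time(False, 50, 100)
parallel_avg = average_parallel_read_time(0.2, 50, 100)
```

Under these assumptions a miss costs only the 100 nanosecond main memory access rather than the 150 nanoseconds of a cache probe followed by a main memory access, and the average read time falls to about 60 nanoseconds.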