1. Field of the Invention
The present invention generally relates to computer systems that include components subject to cycles in which data is read, modified and written back by a central processing unit (CPU) or other system device. More particularly, the present invention relates to a computer system implementation in which non-cacheable data or data in block-oriented devices can be selectively read in relatively large blocks and temporarily stored in a stream read buffer during a special mode of operation of the CPU.
2. Description of the Relevant Art
For most computer systems, the number of clock cycles required for a data access to a memory device depends upon the component accessing the memory and the speed of the memory unit. Most of the memory devices in a computer system are slow compared to the clock speed of the central processing unit (CPU). As a result, the CPU is forced to enter wait states when seeking data from the slower memory devices. Because of the relative slowness of most memory devices, the efficiency of the CPU can be severely compromised. As the operating speed of processors increases and as new generations of processors evolve, it is advantageous to minimize wait states in memory transactions to fully exploit the capabilities of these new processors.
In an effort to reduce wait states, it has become commonplace to include one or more cache memory devices in a computer system. A cache memory is a high-speed memory unit interposed in the memory hierarchy of a computer system, generally between a slower system memory (and/or external memory) and a processor, to improve effective memory transfer rates and accordingly improve system performance. The cache memory unit is essentially hidden and appears transparent to the user, who is aware only of a larger system memory. The cache memory usually is implemented by semiconductor memory devices having access times that are comparable to the clock period of the processor, while the system and other external memories are implemented using less costly, lower-speed technology.
The cache concept is based on the locality principle, which anticipates that the microprocessor will tend to repeatedly access the same group of memory locations. To minimize access times of this frequently used data, it is stored in the cache memory, which has much faster access times than system memory. Accordingly, the cache memory may contain, at any point in time, copies of information from both external and system memories. If the data is stored in cache memory, the microprocessor will access the data from the cache memory and not the system or external memory. Because of the cache memory's superior speed relative to external or system memory, overall computer performance may be significantly enhanced through the use of a cache memory.
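The performance benefit described above can be illustrated with a simple weighted average of cache and memory access times. The following is a minimal sketch only; the function name and the timing figures (10 ns cache, 70 ns system memory) are hypothetical and are not taken from this description, and the model assumes a hit costs only the cache access time and a miss costs only the memory access time.

```python
def average_access_time(hit_rate, cache_ns, memory_ns):
    """Average access time under a simplified two-level model.

    Assumption: a hit costs cache_ns and a miss costs memory_ns;
    real systems add tag-lookup and line-fill overheads.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return hit_rate * cache_ns + (1.0 - hit_rate) * memory_ns


# Hypothetical figures: 10 ns cache, 70 ns system memory.
for rate in (0.0, 0.5, 0.9):
    print(rate, average_access_time(rate, 10.0, 70.0))
```

Under this model, the higher the fraction of accesses satisfied by the cache, the closer the average access time comes to the cache access time.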
A cache memory typically includes a plurality of memory sections, wherein each memory section stores a block, or "line," of two or more words of data. A line may consist, for example, of four "doublewords" (wherein each doubleword comprises four 8-bit bytes). Each cache line has associated with it an address tag that uniquely associates the cache line with a line of system memory.
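For the example line of four doublewords, each line holds 16 bytes, so a byte address can be split into a line address and a position within the line. The sketch below is illustrative only; the function name is hypothetical, and it assumes the address tag is simply the line address, as would be the case in a fully associative arrangement, which this description does not specify.

```python
BYTES_PER_DOUBLEWORD = 4   # four 8-bit bytes, as in the text
DOUBLEWORDS_PER_LINE = 4   # the example line size from the text
LINE_BYTES = BYTES_PER_DOUBLEWORD * DOUBLEWORDS_PER_LINE  # 16 bytes


def split_address(addr):
    """Return (line_address, doubleword_index, byte_index) for a byte address.

    Assumption: the line address doubles as the address tag.
    """
    line_address = addr // LINE_BYTES
    offset = addr % LINE_BYTES
    doubleword_index = offset // BYTES_PER_DOUBLEWORD
    byte_index = offset % BYTES_PER_DOUBLEWORD
    return line_address, doubleword_index, byte_index


print(split_address(0x1234))
```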
According to normal convention, when the processor initiates a read cycle to obtain data or instructions from the system or external memory, an address tag comparison first is performed to determine whether a copy of the requested information resides in the cache memory. If present, the data is used directly from the cache. This event is referred to as a cache read "hit." If not present in the cache, a line in memory containing the requested word is retrieved from system memory and stored in the cache memory. The requested word is simultaneously supplied to the processor. This event is referred to as a cache read "miss."
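The read hit and read miss cases just described can be sketched as follows. This is a simplified software model, not a description of any particular hardware; the class name, the four-word line size, and the use of a dictionary keyed by tag are all assumptions, and the model has unlimited capacity, so no line is ever replaced.

```python
LINE_WORDS = 4  # words per line; hypothetical value


class ReadCache:
    """Toy cache model that handles reads only (no eviction)."""

    def __init__(self, memory):
        self.memory = memory   # list of words standing in for system memory
        self.lines = {}        # address tag -> list of words in that line
        self.hits = 0
        self.misses = 0

    def read(self, word_addr):
        tag, offset = divmod(word_addr, LINE_WORDS)
        line = self.lines.get(tag)          # address tag comparison
        if line is not None:
            self.hits += 1                  # read hit: use the cached copy
        else:
            self.misses += 1                # read miss: fetch the whole line
            base = tag * LINE_WORDS
            line = self.memory[base:base + LINE_WORDS]
            self.lines[tag] = line
        return line[offset]                 # requested word goes to the processor


memory = list(range(100, 116))
cache = ReadCache(memory)
cache.read(5)   # miss: the line holding words 4 through 7 is loaded
cache.read(6)   # hit: same line
print(cache.hits, cache.misses)
```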
In addition to using a cache memory during data retrieval, the processor may also write data directly to the cache memory instead of to the system or external memory. When the processor desires to write data to memory, an address tag comparison is made to determine whether the line into which data is to be written resides in the cache memory. If the line is present in the cache memory, the data is written directly into the line in cache. This event is referred to as a cache write "hit." A "dirty bit" for the line is then set among the status bit (or bits) associated with the line. The dirty bit indicates that data stored within the line is dirty (i.e., modified), and thus, before the line is deleted from the cache memory or overwritten, the modified data must be written into system or external memory. This procedure for cache memory operation is commonly referred to as "copy back" or "write back" operation. During a write transaction, if the line into which data is to be written does not exist in the cache memory, the data typically is written directly into the system memory. This event is referred to as a cache write "miss."
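The write-back behavior described above can be sketched in the same style. This is a minimal model under stated assumptions: the class and method names are hypothetical, the line size of four words is arbitrary, lines are loaded and removed explicitly through fill() and evict(), and a write miss goes straight to system memory without allocating a line, as the text describes.

```python
LINE_WORDS = 4  # words per line; hypothetical value


class WriteBackCache:
    """Toy write-back cache with one dirty bit per line."""

    def __init__(self, memory):
        self.memory = memory   # list of words standing in for system memory
        self.lines = {}        # address tag -> {"data": [...], "dirty": bool}

    def fill(self, tag):
        """Load a clean copy of a line from system memory."""
        base = tag * LINE_WORDS
        self.lines[tag] = {
            "data": self.memory[base:base + LINE_WORDS],
            "dirty": False,
        }

    def write(self, word_addr, value):
        tag, offset = divmod(word_addr, LINE_WORDS)
        line = self.lines.get(tag)          # address tag comparison
        if line is not None:
            line["data"][offset] = value    # write hit: update the cache only
            line["dirty"] = True            # mark the line as modified
            return "hit"
        self.memory[word_addr] = value      # write miss: go directly to memory
        return "miss"

    def evict(self, tag):
        """Remove a line, copying it back to memory first if it is dirty."""
        line = self.lines.pop(tag)
        if line["dirty"]:
            base = tag * LINE_WORDS
            self.memory[base:base + LINE_WORDS] = line["data"]


memory = [0] * 8
cache = WriteBackCache(memory)
cache.fill(0)
print(cache.write(1, 42), memory[1])   # memory is stale until the line is evicted
cache.evict(0)
print(memory[1])
```

Note that between the write hit and the eviction, system memory holds a stale value for the modified word; the dirty bit is what tells the cache that a copy-back is required.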
While cache memory devices have proven effective in reducing memory access latency for processors, certain memory devices contain data that cannot be cached in a cache memory. Video and graphics cards are examples of devices that contain data that typically is not cacheable. CPU accesses to memory devices that contain non-cacheable data thus tend to be inefficient because the data cannot be stored in cache memory, but instead must be accessed directly from the slower memory devices. Thus, although cache memories improve system efficiency and reduce CPU latency, a number of components in computer systems are still accessed inefficiently because the data stored in these devices is non-cacheable.