A typical digital data processing system comprises a number of basic units including a central processing unit (CPU), a memory unit (main memory), and an input/output (I/O) unit. The main memory stores information, i.e., data and instructions for processing the data, in addressable storage locations. The information is transferred between the main memory and the CPU along a bus consisting of control lines, address lines and data lines. Control signals specify the direction of transfer. For example, the CPU issues a read request signal to transfer data and instructions over the bus from an addressed location in the main memory to the CPU. The CPU then processes the retrieved data in accordance with the instructions obtained from the memory. The CPU thereafter issues a write request signal to store the results in an addressed location in the main memory.
The information transferred between the CPU and main memory must conform to certain timing relationships between the request signals and the information on the bus. Access time is defined as the time interval between the instant at which the main memory receives a request signal from the CPU and the instant at which the information is available for use by the CPU. If the CPU operates at a fast rate and the access time of the main memory is long by comparison, the CPU must enter a wait state until the request to memory is completed, thereby reducing the processing rate of the CPU. This problem is particularly critical when the memory request is a read request, since the CPU is unable to operate, that is, process data, without the requested information.
A high-speed cache memory is used in these situations. The cache memory access speed is closer to the operational speed of the CPU and thus, use of the cache memory increases the speed of data processing by providing information to the CPU at a rapid rate. The cache memory operates in accordance with the "principle of locality"; that is, if a memory location is addressed by the CPU, it will probably be addressed again soon and nearby memory locations also will tend to be addressed soon. When the CPU requires information, the cache memory is examined first. If the information is not located in the cache, the main memory is accessed. A "block mode" read request is then issued by the CPU to transfer a block of information including both the required information and information from nearby memory locations, from the main memory to the cache memory.
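The read path described above can be illustrated with a minimal Python sketch. The block size `BLOCK_WORDS`, the function names, and the use of a dictionary keyed by block base address as the cache are all assumptions made for illustration; the sketch ignores cache capacity and replacement entirely.

```python
BLOCK_WORDS = 4  # hypothetical block size in words; not specified in the text

def block_mode_read(main_memory, addr):
    """Return (base, words) for the aligned block that contains addr."""
    base = addr - (addr % BLOCK_WORDS)
    return base, main_memory[base:base + BLOCK_WORDS]

def read(cache, main_memory, addr):
    """Examine the cache first; on a miss, fill it with a block-mode read."""
    base = addr - (addr % BLOCK_WORDS)
    if base not in cache:
        # Miss: fetch the required word together with its neighbours.
        _, words = block_mode_read(main_memory, addr)
        cache[base] = words
    return cache[base][addr - base]
```

After a miss on one address, a later access to a neighbouring address in the same block is served from the cache, which is the behavior the principle of locality relies on.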
Typically, a random access main memory is logically organized as a matrix of storage locations, the address of each location thus comprising a first set of bits identifying the row of the location and a second set of bits identifying the column. Large memories may further be organized in boards and banks of boards. Thus, an address in such a memory may comprise fields identifying banks, boards and, within each board, rows and columns.
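As an illustration of such an address layout, the following Python sketch splits a flat address into bank, board, row and column fields. The field widths and their ordering are hypothetical; the text does not specify either.

```python
# Hypothetical field widths in bits, most significant field first.
FIELDS = [("bank", 2), ("board", 3), ("row", 10), ("column", 10)]

def split_address(addr):
    """Split a flat memory address into its named fields."""
    out = {}
    shift = sum(width for _, width in FIELDS)
    for name, width in FIELDS:
        shift -= width
        out[name] = (addr >> shift) & ((1 << width) - 1)
    return out
```

With these assumed widths, an address built as `(1 << 23) | (5 << 20) | (3 << 10) | 7` decodes to bank 1, board 5, row 3, column 7.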
A cache memory is faster than main memory and, because of its higher cost, smaller. A cache memory holds a number of blocks of information, with each block containing information from one or more contiguous main memory locations. Each block is identified by a cache address. The cache address includes memory address bits that identify the corresponding memory locations. These bits are collectively called the "index" field. In addition to information from main memory, each block also contains the remainder of the memory address bits identifying the specific location in main memory from which the information in the cache block was obtained. These latter bits are collectively called a "tag" field. When the CPU requires information, the index field and tag field in the cache are examined to determine whether a block contains the requested information.
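The index and tag comparison can be sketched as follows in Python, assuming a direct-mapped organization. The field widths, the `valid` flag, the word `offset` within a block, and all names are assumptions introduced for illustration and are not taken from the text.

```python
from dataclasses import dataclass, field

INDEX_BITS = 4   # hypothetical: 16 cache blocks
OFFSET_BITS = 2  # hypothetical: 4 words per block

@dataclass
class CacheBlock:
    valid: bool = False  # assumed flag marking a block that holds real data
    dirty: bool = False
    tag: int = 0
    data: list = field(default_factory=lambda: [0] * (1 << OFFSET_BITS))

def split(addr):
    """Split a memory address into (tag, index, offset)."""
    offset = addr & ((1 << OFFSET_BITS) - 1)
    index = (addr >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset

def lookup(cache, addr):
    """Return the cached word on a hit, or None on a miss."""
    tag, index, offset = split(addr)
    block = cache[index]          # the index field selects the cache block
    if block.valid and block.tag == tag:
        return block.data[offset]  # stored tag matches: hit
    return None                    # tag mismatch or empty block: miss
```

Two memory addresses with the same index but different tags compete for the same cache block, which is why the stored tag must be compared on every access.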
When the address in the cache does not match the address in the main memory specified by the CPU (i.e., a "cache miss") and the information presently stored in the addressed cache location has previously been changed from the information stored in the corresponding location in main memory (i.e., the cache block is "dirty"), the CPU issues a write request to transfer the cache block to main memory before issuing a read request to acquire a new block of information from the main memory. That is, the dirty cache block is written back to main memory to make room in the cache for the information that the CPU is currently requesting. If the write request is performed first, however, the performance of the CPU suffers, because the CPU remains stalled waiting for the required information until the write has completed.
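The cost of the ordering can be made concrete with a rough Python sketch. It assumes that the two requests are fully serialized on the bus with no overlap and that latencies are given in arbitrary time units; both the function name and this timing model are illustrative assumptions.

```python
def stall_time(read_latency, write_latency, write_first):
    """Time the CPU waits for the requested block on a dirty miss.

    Assumes the write and read requests do not overlap on the bus.
    """
    if write_first:
        # The read cannot start until the dirty block has been written back.
        return write_latency + read_latency
    # The read goes first; the write-back happens after the CPU resumes.
    return read_latency
```

Under this model, with equal read and write latencies, performing the write first doubles the time the CPU is stalled.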
Previously, a method or protocol for exchanging a block of information in a cache memory with another block of information in a main memory involved the CPU issuing a read request to the main memory, transferring the dirty block from cache memory to a temporary holding register, transferring the new block of information from the main memory when available to the cache memory over the system bus, then issuing a write request to main memory to store the contents of the temporary register in main memory. Such a complex protocol increases the time required per unit of information transferred over the system bus, i.e. decreases system bus bandwidth, since two requests are required to exchange the blocks of information.
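The sequence of this earlier protocol can be modelled with the following Python sketch, which records each bus request as it is issued. The dictionary-based cache and memory, the `INDEX_BITS` width, and all function names are hypothetical and serve only to show the ordering of steps and the fact that two requests are needed.

```python
INDEX_BITS = 4  # hypothetical index width

def block_address(tag, index):
    """Rebuild a main memory block address from its tag and index."""
    return (tag << INDEX_BITS) | index

def exchange_with_holding_register(cache, memory, tag, index):
    """Swap the dirty block at `index` for block (tag, index) from memory.

    Returns the bus requests issued, in order.
    """
    requests = []
    victim = cache[index]
    # 1. The read request is issued first so the CPU is not held up by the write.
    new_addr = block_address(tag, index)
    requests.append(("read", new_addr))
    # 2. The dirty block is moved to a temporary holding register.
    holding = (block_address(victim["tag"], index), list(victim["data"]))
    # 3. The new block arrives from main memory and fills the cache location.
    cache[index] = {"tag": tag, "dirty": False, "data": list(memory[new_addr])}
    # 4. A second request writes the holding register back to main memory.
    old_addr, old_data = holding
    requests.append(("write", old_addr))
    memory[old_addr] = old_data
    return requests
```

The returned list always contains two entries, one read and one write, which is the per-exchange request overhead the passage identifies.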
Therefore, an object of the present invention is to provide a simplified protocol that increases system bus bandwidth when exchanging blocks of information between a cache and main memory, thereby increasing overall data processing system throughput.
Additionally, a feature of the present invention is to provide a cache memory exchange protocol that reduces CPU access time or latency for information obtained from main memory during an exchange between a block of information in a cache memory and a block of information in a main memory.
In accordance with another aspect of the invention, a feature is to provide an exchange request command that reduces the time required by the main memory to both retrieve a block of information designated by the CPU and store a dirty cache block of information.