1. Technical Field
The invention relates generally to information processing system organizations (Class 395), and more particularly to system interconnections, I/O processing, and storage accessing and control (subclasses 325, 275, and 425).
In even greater particularity, the invention relates to the burst ordering scheme for burst transfers of data from a memory subsystem to a processor. In an exemplary embodiment, the invention is used in a computer system based on a 32-bit 486-class microprocessor/bus architecture in which a cacheable read results in a burst transfer of an entire cache line of 4 Dwords (32-bit doublewords) in four successive bus cycles.
2. Related Art
Computer systems based on the 486-class microprocessor ("486 computer systems") typically use a burst mode bus architecture. Burst transfers are used to transfer multiple blocks of data (such as multiple Dwords) from a memory subsystem to a processor over an external bus with a width less than the total size of the transfer (such as a single Dword)--the multiple blocks are transferred in a burst sequence using successive bus cycles.
Without limiting the scope of the invention, this background information is provided in the context of a specific problem to which the invention has application: in a 486 computer system, implementing a burst transfer protocol that uses a burst order (sequence) of ascending or descending (modulo 4) addresses only, while maintaining compatibility with the conventional 486 bus architecture and in particular the 486 burst ordering scheme. The 486-class microprocessor processes data in 32-bit Dwords (4 bytes), and includes an internal cache organized in cache lines of 4 Dwords (16 bytes). Reads that miss in the cache result in the transfer of the entire cache line that includes the requested Dword. This cache line fill is accomplished in a burst transfer of all 4 Dwords of the cache line initiated by a single address strobe--the 4 Dwords are transferred in a predetermined order (sequence) in successive bus cycles.
Performance considerations make it desirable to begin a burst transfer with the specific Dword address requested by the microprocessor's CPU core. This "requested" or "critical" Dword can then be supplied directly to the CPU core without continuing to stall the core until the burst transfer is complete (i.e., the transfer of the other three Dwords of the cache line).
The burst ordering scheme commonly used in 486 computer systems (the "486 burst order") is described in two U.S. patents granted to Intel Corporation: U.S. Pat. Nos. 5,131,083 and 5,255,378 both titled "Method of Transferring Burst Data In A Microprocessor". The burst ordering scheme described uses a different Dword sequence--ascending or descending (modulo 4)--depending on the position of the critical Dword in a memory aligned cache line.
In a memory aligned organization, the Dwords of a burst (cache line) are always within the same 16-byte (4 Dword) aligned memory block. Thus, address bits 31-4 are the same for each of the Dwords--address bits 1 and 0 are used for byte addressing within a Dword, while bits 3 and 2 determine Dword addressing within a burst. Address bits 3 and 2 are designated A<3> and A<2>, respectively, or A<3:2> collectively. The four Dwords selected by A<3:2> are designated 0-1-2-3 (or, alternatively, by byte offset within the cache line, 0-4-8-C hex).
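The address decomposition described above can be sketched in a few lines of Python. This is only an illustration: the function name `split_address` and the returned tuple layout are assumptions made for this example and do not appear in the text.

```python
def split_address(addr):
    """Split a 32-bit byte address into its cache-line fields.

    Hypothetical helper for illustration only. Returns a tuple of
    (line_base, dword_index, byte_offset).
    """
    line_base = addr & 0xFFFFFFF0      # bits 31-4: same for all 4 Dwords of the line
    dword_index = (addr >> 2) & 0x3    # bits 3-2, i.e. A<3:2>: Dword within the burst
    byte_offset = addr & 0x3           # bits 1-0: byte within the Dword
    return line_base, dword_index, byte_offset


# Byte offsets 0, 4, 8 and C hex within a line map to Dwords 0, 1, 2 and 3.
for offset in (0x0, 0x4, 0x8, 0xC):
    print(hex(offset), split_address(0x1000 + offset)[1])
```

The Dword designations 0-1-2-3 and 0-4-8-C hex are thus two views of the same two address bits, one as an index and one as a byte offset.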
Table 1 shows the 486 burst ordering for each of the 4 possible requested addresses of a 4 Dword cache line. Note that, if A<2> is 0, then the burst order is ascending (modulo 4), while if A<2> is 1, then the burst order is descending (modulo 4).
TABLE 1
______________________________________
A<3:2>    A<2>    Burst Order    Burst Direction
______________________________________
0         0       0-1-2-3        Ascending
1         1       1-0-3-2        Descending
2         0       2-3-0-1        Ascending
3         1       3-2-1-0        Descending
______________________________________
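The four sequences in Table 1 can be reproduced by XORing the requested Dword index with the bus cycle count (0 to 3). The sketch below uses that observation; the XOR formulation and the function names are assumptions chosen for this example, not wording taken from the text or from the cited patents.

```python
def burst_order_486(requested):
    """Return the 486 burst order for a requested Dword index A<3:2> (0-3).

    Illustrative only: XOR with the cycle count happens to generate
    every row of Table 1.
    """
    return [requested ^ cycle for cycle in range(4)]


def burst_direction_486(requested):
    """Ascending when A<2> (the low bit of the index) is 0, else descending."""
    return "Descending" if requested & 1 else "Ascending"


for dword in range(4):
    order = "-".join(str(d) for d in burst_order_486(dword))
    print(dword, dword & 1, order, burst_direction_486(dword))
```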
A common alternative to the 486 burst ordering scheme is linear wrap. In this ordering scheme, all sequences run in a single direction, either ascending or descending (modulo the burst length), regardless of the requested address. Thus, for a 4 Dword burst transfer, an ascending linear wrap burst order sequence would be: 0-1-2-3, 1-2-3-0, 2-3-0-1, and 3-0-1-2.
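A minimal sketch of linear wrap ordering follows, assuming a burst is described only by its starting Dword index, its length, and a direction. The function name and parameters are hypothetical and chosen for this example.

```python
def linear_wrap_order(start, length=4, ascending=True):
    """Return a linear wrap burst order beginning at Dword index `start`.

    Each successive index steps by +1 (ascending) or -1 (descending),
    wrapping modulo the burst length.
    """
    step = 1 if ascending else -1
    return [(start + step * cycle) % length for cycle in range(length)]


for start in range(4):
    print(start, "-".join(str(d) for d in linear_wrap_order(start)))
```

Comparing the output with the 486 burst order shows that ascending linear wrap produces the same sequence only for starting indices 0 and 2; for indices 1 and 3 it yields 1-2-3-0 and 3-0-1-2 rather than 1-0-3-2 and 3-2-1-0.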
Using a linear wrap burst ordering scheme in a 486 computer system is problematic because many chipset vendors have taken into account the 486 burst ordering scheme. Thus, to increase burst transfer speeds, the memory subsystem logic may, in response to a burst transfer initiated by a requested address (the critical Dword address), calculate the remaining three addresses according to the 486 burst order sequence and return the data in that order, ignoring any subsequent addresses output by the microprocessor during the burst transfer. Thus, attempting to implement a 486-class microprocessor with a linear wrap burst order would result in an incompatibility with some existing chipset logic.
An alternative scheme for providing linear wrap burst ordering in a 486 computer system would be to cause the microprocessor to initiate a burst transfer using only requested addresses A<3:2> = 0 or 2 (ascending) or A<3:2> = 1 or 3 (descending). See Table 1. Thus, for an ascending-only burst ordering, the microprocessor would initiate a burst transfer for requested addresses A<3:2> = 0 or 1 with address A<3:2> = 0, and for requested addresses A<3:2> = 2 or 3 with address A<3:2> = 2 (i.e., burst transfers would not be initiated using addresses A<3:2> = 1 or 3, even if those were the requested addresses).
This scheme would maintain compatibility, but might cause some performance degradation because the critical Dword would not always be returned first. Specifically, for the case of an ascending-only burst ordering, if the requested addresses were A<3:2> = 1 or 3, the burst transfer would be initiated with addresses A<3:2> = 0 or 2, respectively, with the requested Dword being returned in the second bus cycle of the burst transfer (i.e., 0-1-2-3 and 2-3-0-1).
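The ascending-only variant of this alternative scheme, and its performance cost, can be sketched as follows. The sketch assumes the starting address is formed by clearing A<2> of the requested Dword index; the function name and the returned tuple are hypothetical.

```python
def ascending_only_burst(requested):
    """Model the ascending-only alternative for a requested Dword index (0-3).

    Returns (start, order, critical_cycle), where critical_cycle is the
    zero-based bus cycle in which the requested Dword arrives.
    """
    start = requested & 0b10                     # clear A<2>: start is always 0 or 2
    order = [(start + cycle) % 4 for cycle in range(4)]
    critical_cycle = order.index(requested)      # 0 = first cycle, 1 = second cycle
    return start, order, critical_cycle


for dword in range(4):
    print(dword, ascending_only_burst(dword))
```

Because the start index is always even, each sequence produced is also a valid 486 burst order, which is why compatibility is preserved. The cost is visible in the last field: requests for Dwords 1 and 3 receive their critical Dword one bus cycle late.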
An additional problem with this alternative approach arises when referencing memory-mapped I/O locations. For certain applications, such as video applications employing destructive reads (where the read causes the memory location to be modified), these I/O locations must be accessed precisely in the order the core requests them. While these I/O locations are non-cacheable, only the external system logic knows that they are non-cacheable. For each request by the microprocessor for a burst transfer, the system logic uses a cache enable (KEN#) signal to notify the microprocessor whether requested data being returned is cacheable--KEN# is returned by the system logic prior to the completion of the transfer of the requested data (i.e., the requested critical Dword that always begins a conventional 486 burst transfer). If in fact the requested data is non-cacheable, the system logic deasserts KEN#, instructing the microprocessor to convert the burst transfer into a single transfer (non-burst) cycle. This signaling protocol requires that the requested address be the first one presented to the system logic--otherwise an extraneous (destructive) read may corrupt a memory-mapped environment.
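The hazard can be illustrated with a deliberately simplified model. It assumes that a non-cacheable request collapses to a single transfer of whichever Dword address was presented first, and that a cacheable request completes the full burst in 486 order. The function name, its parameters, and this two-case model are assumptions made for this example; actual KEN# timing is more involved.

```python
def dwords_read(requested, start, cacheable):
    """Return (reads, extraneous) for one request under a simplified model.

    reads:      Dword indices the system logic actually reads.
    extraneous: non-cacheable Dwords read although the core never asked
                for them (a problem if the read is destructive).
    """
    if cacheable:
        # KEN# asserted: the full 4-Dword burst proceeds in 486 order.
        return [start ^ cycle for cycle in range(4)], []
    # KEN# deasserted: only the first address presented is transferred.
    reads = [start]
    extraneous = [d for d in reads if d != requested]
    return reads, extraneous


# Conventional 486 behavior: the burst starts at the requested Dword.
print(dwords_read(requested=1, start=1, cacheable=False))
# Ascending-only alternative: the burst starts at Dword 0 instead.
print(dwords_read(requested=1, start=0, cacheable=False))
```

In the second call the system logic reads Dword 0, which the core did not request; if that location is a destructive-read I/O register, its state has been changed as a side effect.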
Thus, implementing a 486-class microprocessor that supports linear wrap burst ordering may result in incompatibility with certain existing chipsets that implement only the 486 burst ordering scheme. Attempting to implement linear wrap by using only the ascending (or descending) sequences of the 486 burst ordering scheme may adversely affect performance, and may cause memory corruption in a memory-mapped environment.