Microprocessor-based computer systems typically include a microprocessor, a memory subsystem, and system logic, intercoupled by a local (system) bus. The microprocessor may include an internal L1 (level one) cache memory that stores frequently accessed data on-board the microprocessor chip. In this manner, if requested data resides in the L1 cache, the microprocessor can access it without running an external bus cycle.
The memory subsystem can include external system Dynamic Random Access Memory (DRAM) along with an external L2 (level two) cache. Together, the external system memory and the L2 cache form a memory hierarchy.
The system logic includes a memory/bus controller that, together with the microprocessor, implements a bus protocol for transferring data between the microprocessor and the memory subsystem. If a central processing unit (CPU) requests access to a piece of data that is absent from a cache, then that CPU access "misses" the cache. If the CPU access (read or write) misses in the internal L1 cache, the microprocessor runs an external bus cycle to access the memory subsystem. The access is then serviced by the L2 cache or, if it misses there as well, by the system DRAM.
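The lookup order described above can be sketched as follows. This is a minimal illustrative model, not part of any bus specification: the function name `service_access` and the representation of each cache as a set of resident addresses are assumptions made for the example.

```python
def service_access(address, l1_contents, l2_contents):
    """Return which level services a CPU access: 'L1', 'L2', or 'DRAM'.

    l1_contents and l2_contents are hypothetical sets of the addresses
    currently held in each cache.
    """
    if address in l1_contents:
        # L1 hit: serviced on-chip, no external bus cycle is run.
        return "L1"
    # L1 miss: the microprocessor runs an external bus cycle.
    if address in l2_contents:
        return "L2"
    # L2 miss: the access falls through to system DRAM.
    return "DRAM"


l1 = {0x1000}
l2 = {0x1000, 0x2000}
for addr in (0x1000, 0x2000, 0x3000):
    print(hex(addr), service_access(addr, l1, l2))
```

Only the L2 and DRAM cases involve an external bus cycle; the L1 case is resolved entirely inside the microprocessor.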
A computer system based upon the Intel® 586 or Pentium™ microprocessor uses 64-bit internal and external data buses able to transfer eight bytes (two doublewords or one quadword) at a time. The internal L1 cache is organized in lines of four quadwords (32 bytes) each, such that cache line fills (reads) and replacements (writes) require four 64-bit (quadword) transfers between the microprocessor and the memory subsystem (L2 cache or system DRAM).
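The transfer count follows from simple arithmetic, sketched below. The 32-byte line size is an assumption inferred from the four quadword transfers stated above, and the constant and function names are illustrative only.

```python
BUS_WIDTH_BITS = 64
BUS_WIDTH_BYTES = BUS_WIDTH_BITS // 8  # 8 bytes = one quadword per transfer
LINE_SIZE_BYTES = 32                   # assumed: four quadwords per cache line


def transfers_per_line(line_bytes=LINE_SIZE_BYTES, bus_bytes=BUS_WIDTH_BYTES):
    """Number of bus transfers needed to move one cache line.

    Uses ceiling division so a partial final transfer still counts as one.
    """
    return -(-line_bytes // bus_bytes)


print(transfers_per_line())  # 32 bytes over an 8-byte bus
```

With these values, a line fill or replacement takes four quadword transfers, matching the count given in the text.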
According to the conventional 586 bus architecture and protocol, external bus cycle transfers between the microprocessor and the memory subsystem occur in either burst or non-burst mode. Burst mode bus cycles transfer, in sequence, the four quadwords of an L1 cache line for line fills, line replacements, or snoop write-backs performed in response to cache inquiries during DMA (direct memory access) operations. In addition, some microprocessors support write gathering, in which writes to contiguous bytes of a cache line are gathered in internal write buffers and then written out to the memory subsystem in burst mode. Non-burst mode bus cycles are used to transfer (read/write) one to eight bytes at a time in a single bus transfer.
The 586 bus architecture supports pipelined bus cycles. The bus cycle control signal NA# (next address) is driven by the system during a current bus cycle, before the last BRDY# (burst ready) signal has been returned, to request that the microprocessor assert address/control signals for the next pending bus cycle request, designated a pipeline bus cycle. NA# is ignored if there is no pending bus cycle request, or if either the current or next bus cycle is a line replacement or snoop write-back cycle.
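The conditions under which NA# is honored can be expressed as a small predicate. This is a hedged sketch that models only the "ignored" cases listed above; the function name `na_honored` and the cycle-kind strings are hypothetical labels, and the sketch assumes NA# was driven before the last BRDY# of the current cycle.

```python
# Cycle kinds for which NA# is ignored, per the description above.
NON_PIPELINABLE = {"line_replacement", "snoop_writeback"}


def na_honored(has_pending_request, current_cycle, next_cycle):
    """Return True if the microprocessor would act on NA#.

    current_cycle / next_cycle are illustrative labels such as
    'read', 'write', 'line_fill', 'line_replacement', 'snoop_writeback'.
    """
    if not has_pending_request:
        # Nothing is waiting to be pipelined, so NA# has no effect.
        return False
    # NA# is ignored if either cycle is a replacement or snoop write-back.
    return (current_cycle not in NON_PIPELINABLE
            and next_cycle not in NON_PIPELINABLE)


print(na_honored(True, "line_fill", "read"))
print(na_honored(True, "line_fill", "line_replacement"))
print(na_honored(False, "read", "read"))
```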
Whether an external bus cycle is a burst or non-burst transfer is determined by the microprocessor CACHE# and W/R# bus cycle definition signals, and the system KEN# (cache enable) signal. If CACHE# is asserted for a read cycle, and the system returns KEN#, then the read is converted to a burst fill cycle. Asserting CACHE# for a write cycle indicates a cache line replacement or snoop write-back (or, possibly, a gathered write).
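The signal combinations above can be summarized in a short decision function. This is a sketch under stated assumptions: the booleans mean "asserted" rather than an electrical level (the signals are active-low), the name `classify_cycle` is hypothetical, and treating every write without CACHE# as non-burst is an inference from the text rather than a statement of the full protocol.

```python
def classify_cycle(cache_asserted, is_write, ken_returned=False):
    """Classify an external bus cycle as 'burst' or 'non-burst'.

    cache_asserted: microprocessor asserts CACHE#
    is_write:       W/R# indicates a write cycle
    ken_returned:   system returns KEN# (only meaningful for reads)
    """
    if is_write:
        # CACHE# on a write marks a line replacement, snoop write-back,
        # or (possibly) a gathered write, all of which are burst cycles.
        return "burst" if cache_asserted else "non-burst"
    # A read becomes a burst line fill only if the system also returns KEN#.
    if cache_asserted and ken_returned:
        return "burst"
    return "non-burst"


print(classify_cycle(cache_asserted=True, is_write=False, ken_returned=True))
print(classify_cycle(cache_asserted=True, is_write=False, ken_returned=False))
print(classify_cycle(cache_asserted=True, is_write=True))
```

Note that in this model a read to an address for which the system withholds KEN# can never become a burst cycle, regardless of CACHE#.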
Under the current Pentium™ or 586 bus protocol, burst mode transfers are limited to cacheable addresses. The protocol does not support burst mode transfers to non-cacheable addresses, such as those assigned to memory-mapped input/output (I/O) devices. When the Pentium™ protocol was defined, this burst mode limitation was insignificant because such memory-mapped I/O devices were too slow to benefit from burst mode transfers. The speed of those devices has since increased, however, to the point that they would benefit from burst mode transfers. Accordingly, there exists a need in the art for a bus protocol that supports burst mode transfers to both cacheable and non-cacheable addresses.