1. Field of the Invention
This invention relates to a computer and, more particularly, to a bus interface unit which allows a central processing unit ("processor") to read a burst of data from a device coupled to a peripheral bus and, still more particularly, to read the data, provided by the device in sequential address order, into the processor in toggle mode order.
2. Description of the Related Art
Modern computers are called upon to execute instructions and transfer data at ever higher rates. Many computers employ CPUs which operate at clocking rates exceeding several hundred MHz, and further have multiple busses connected between the CPUs and numerous input/output devices. The busses may have dissimilar protocols depending on which devices they link. For example, a CPU local bus connected directly to the CPU preferably transfers data at a faster rate than a peripheral bus connected to slower input/output devices. A mezzanine bus may be used to connect devices arranged between the CPU local bus and the peripheral bus. The peripheral bus can be classified as, for example, an industry standard architecture ("ISA") bus, an enhanced ISA ("EISA") bus or a microchannel bus. The mezzanine bus can be classified as, for example, a peripheral component interconnect ("PCI") bus to which higher speed input/output devices can be connected.
Coupled between the various busses are bus interface units. According to commonly used terminology, the bus interface unit coupled between the CPU bus and the PCI bus is often termed the "north bridge". Similarly, the bus interface unit between the PCI bus and the peripheral bus is often termed the "south bridge".
The north bridge, henceforth termed a bus interface unit, serves to link specific busses within the hierarchical bus architecture. Preferably, the bus interface unit couples data, address and control signals forwarded between the CPU local bus, the PCI bus and the memory bus. Accordingly, the bus interface unit may include various buffers and/or controllers situated at the interface of each bus linked by the interface unit. In addition, the bus interface unit may receive data from a dedicated graphics bus, and therefore may include an accelerated graphics port ("AGP"). As a host device, the bus interface unit may be called upon to support both the PCI portion of the AGP (or graphics-dedicated transfers associated with PCI, henceforth referred to as a graphics component interface, or "GCI"), as well as AGP extensions to the PCI protocol.
Mastership of the various busses is preferably orchestrated by an arbiter within the bus interface unit. For example, if the CPU (or processor) coupled to the local CPU bus wishes to read data from a peripheral device coupled to the peripheral bus, it must solicit mastership of the peripheral bus before doing so. Once mastership is granted, the processor can then read the appropriate data from the peripheral device (preferably an input/output device) to temporary storage devices or "queues" within the bus interface unit.
Typically, data is arranged within the peripheral device, system memory and/or cache locations within the processor according to cache lines. A read operation from a peripheral device to the processor assumes that at least a portion of the cache line, if not the entire cache line, is involved in the read transaction. To transfer an entire cache line, several clock cycles may be needed. For example, a cache line may contain four quad words, and each read cycle can transfer one quad word, or eight bytes, across a 64-bit memory bus.
A particular byte within the cache line can therefore be addressed by several bits. The least significant three bits can be used to determine a particular offset within each quad word, and the next two least significant bits are used to determine which quad word is being addressed within the cache line.
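The bit assignment described above can be illustrated with a short sketch. It assumes the example geometry given earlier (four quad words of eight bytes each, hence a 32-byte cache line); the function and constant names are hypothetical and chosen only for illustration.

```python
QUAD_WORD_BYTES = 8        # one quad word is eight bytes (64 bits)
QUAD_WORDS_PER_LINE = 4    # example cache line of four quad words
LINE_BYTES = QUAD_WORD_BYTES * QUAD_WORDS_PER_LINE  # 32 bytes


def split_address(addr):
    """Split a byte address into (line_base, quad_word_index, byte_offset).

    Hypothetical helper: bits [2:0] give the offset within a quad word,
    bits [4:3] select the quad word within the cache line.
    """
    byte_offset = addr & 0x7
    quad_word_index = (addr >> 3) & 0x3
    line_base = addr & ~(LINE_BYTES - 1)
    return line_base, quad_word_index, byte_offset
```

For instance, under these assumptions address 0x0D falls in the second quad word (index 1) of the cache line starting at 0x00, at byte offset 5.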
In many instances in which a processor requests data from a peripheral device, the first address dispatched to the peripheral device designates either the first, second, third or fourth quad word within a particular cache line. Thus, it is said that the initial address is not constrained to a cache line boundary. In fact, most modern processors extract quad words from a cache line based on a particular addressing mode known as the "toggle mode".
Toggle mode addressing defines a specific order in which the quad words of a cache line are read into the processor. The order depends on which quad word is first addressed. The first-addressed quad word is often deemed the "target" quad word. Toggle mode addressing can be thought of as dividing a cache line in half, wherein the next successive quad word is dependent on where in the cache line the target quad word resides. For example, if a target quad word resides at hexadecimal address location 08 (or 01000 binary), then the target quad word will be read first, followed by the quad word at address 00 to complete the ordering of the first half of the cache line being read. The second half of the cache line is read in the same manner as the first half. That is, the quad word at hexadecimal address location 18 will be read before the quad word at address location 10.
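The ordering in the example above can be reproduced with a small sketch. It assumes that toggle mode order is obtained by exclusive-ORing the target quad word index with a count of 0 through 3; that rule matches the example for target address 08, but it is stated here as an assumption, and the function name is hypothetical.

```python
def toggle_order(target, quad_word_bytes=8, quad_words_per_line=4):
    """Return quad word addresses in assumed toggle mode order.

    Assumption: the n-th quad word read is the one whose index equals
    (target index XOR n), within the cache line containing `target`.
    """
    line_bytes = quad_word_bytes * quad_words_per_line
    base = target & ~(line_bytes - 1)  # lowest address of the cache line
    index = (target // quad_word_bytes) % quad_words_per_line
    return [base + (index ^ step) * quad_word_bytes
            for step in range(quad_words_per_line)]
```

Under this assumption, a target at 08 yields the sequence 08, 00, 18, 10, as in the example, and a target at 00 yields plain sequential order.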
The mechanism of toggle mode addressing from an initial target address until the entire cache line is transferred is generally well known as a conventional microprocessor addressing scheme. Unfortunately, a peripheral device connected to the PCI bus or the dedicated graphics bus (e.g., AGP) is designed to send and receive bursts of data in sequential addressing order (i.e., data residing at addresses having numerically increasing values). In particular, a peripheral device contains a cache line of data accessible by an initial address representing the smallest address value of that cache line. The target address may access a quad word somewhere within that cache line, and is not necessarily the same as the initial address. To receive a burst of data from the peripheral device into the processor, it is advantageous to retrieve the data from the peripheral device in sequential addressing order. However, the sequentially increasing addresses are not recognizable to a processor requesting data in toggle mode order. Thus, the target address for the cache line to be read by the processor must somehow be modified so that a sequential order of addresses, beginning with the initial address, can be sent. The peripheral device could then more efficiently burst data at those address locations back toward the processor. The benefit of bursting data in sequential address order becomes apparent when dealing with the peripheral bus protocol.
A typical access to a peripheral device requires arbitration of the peripheral bus. Once mastership is gained, the processor can then address data residing within the peripheral device. Thereafter, the data can be returned to the processor. If the target address is not the initial address (i.e., the lowest address in a numerically increasing sequence of addresses) of the cache line, then only the target data of one quad word can be transferred at a time. Extracting the next quad word within that cache line in toggle mode order requires the same sequence of steps used to transfer the target data. The cycles involving arbitration, address, data, and turn-around must therefore be repeated four times for the four quad words within each cache line. With at least one cycle for each of those four steps, this would involve at least sixteen peripheral bus cycles.
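The sixteen-cycle figure can be checked with a rough count. The sketch below assumes exactly one bus cycle per step (arbitration, address, data, turn-around), which is the minimum implied above. The burst figure is purely an illustrative assumption, not a number taken from this description: it supposes a single arbitration, a single address phase, one data cycle per quad word, and a single turn-around.

```python
PHASES = ("arbitration", "address", "data", "turn-around")


def single_transfer_cycles(quad_words=4, cycles_per_phase=1):
    """Cycles when every quad word repeats all four phases."""
    return quad_words * len(PHASES) * cycles_per_phase


def burst_cycles(quad_words=4, cycles_per_phase=1):
    """Illustrative assumption: arbitration, address and turn-around occur
    once, and only the data phase repeats per quad word."""
    non_data_phases = len(PHASES) - 1
    return (non_data_phases + quad_words) * cycles_per_phase
```

Under these assumptions the non-burst read of one cache line takes 16 cycles, while a sequential burst of the same line would take 7.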
It would therefore be desirable to provide a bus interface unit which can modify the target address to that of an initial address used to access the first (lowermost addressable) quad word within a sequence of quad words forming the cache line. By modifying the addressing seen by the peripheral device, the peripheral device can send data to the processor in a burst fashion suited to the peripheral device, namely in sequential, increasing address order. If the first quad word can be addressed as an initial quad word, then the remaining quad words in the cache line will naturally burst from the peripheral device without having to re-arbitrate for the peripheral bus or consume cycle time to effectuate turn-around.
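The address modification described above can be modelled in a few lines. This is a software sketch only, not the circuitry of the bus interface unit: it assumes a 32-byte cache line of four quad words, and it assumes the toggle mode order is given by exclusive-ORing the target quad word index with a count of 0 through 3. All names are hypothetical.

```python
LINE_BYTES = 32       # assumed cache line size: four quad words
QUAD_WORD_BYTES = 8   # one quad word is eight bytes
QUAD_WORDS = LINE_BYTES // QUAD_WORD_BYTES


def initial_address(target):
    """Round the target address down to the lowest address of its cache line."""
    return target & ~(LINE_BYTES - 1)


def sequential_burst_addresses(target):
    """Addresses the peripheral device would see: increasing from the initial address."""
    base = initial_address(target)
    return [base + i * QUAD_WORD_BYTES for i in range(QUAD_WORDS)]


def reorder_to_toggle(target, burst_data):
    """Reorder sequentially received quad words into the assumed toggle mode order."""
    index = (target // QUAD_WORD_BYTES) % QUAD_WORDS
    return [burst_data[index ^ step] for step in range(QUAD_WORDS)]
```

For a target at 48 (hexadecimal), the device would be addressed starting at 40 and would burst the quad words at 40, 48, 50 and 58; the model then hands them to the processor as the second, first, fourth and third quad words received.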