A bus architecture of a computer system conveys much of the information and signals involved in the computer system's operation. In a typical computer system, one or more buses are used to connect a central processing unit (CPU) to a memory and to input/output devices so that data and control signals can be readily transmitted between these different components. When the computer system executes its programming, it is imperative that data and information flow as fast as possible in order to make the computer system as responsive as possible to the user. With many peripheral devices and subsystems, such as graphics adapters, full motion video adapters, small computer systems interface (SCSI) host bus adapters, and the like, it is imperative that large block data transfers be accomplished expeditiously. These are just some examples of peripheral devices and subsystems that benefit substantially from a very fast bus transfer rate.
Much of the computer system's functionality and usefulness to a user is derived from the functionality of the peripheral devices. For example, the speed and responsiveness of the graphics adapter is a major factor in a computer system's usefulness as an entertainment device. Or, for example, the speed with which video files can be retrieved from a hard drive and played by the graphics adapter determines the computer system's usefulness as a training aid. Hence, the rate at which data can be transferred among the various peripheral devices often determines whether the computer system is suited for a particular purpose.
The electronics industry has, over time, developed several types of bus architectures. The PCI (peripheral component interconnect) bus architecture has become one of the most widely used and widely supported bus architectures in the industry. The PCI bus was developed to provide a high speed, low latency bus architecture from which a large variety of systems could be developed.
A PCI specification is used to establish standards to facilitate uniformity and compatibility of PCI devices operating in a PCI bus architecture. Initially, the PCI specification addressed only the use of 32-bit devices and 32-bit transactions, but the specification has since been extended to 64-bit devices and transactions.
Prior Art FIG. 1 shows a simplified exemplary PCI bus architecture 100 implemented, for example, in a computer system. PCI bus 120 is coupled to PCI initiator 110. PCI bus 120 is also coupled to each of PCI target devices A 112, B 114, C 116 and D 118. PCI initiator 110 can be integrated into bus bridge 130, as shown, and bus bridge 130 in turn is used to couple PCI bus 120 to a host bus (not shown). Bus bridge 130 is typically a bidirectional bridge and is made up of numerous components; for simplicity, bus bridge 130 is shown as comprising only PCI initiator 110.
PCI bus 120 is comprised of functional signal lines, for example, interface control lines, address/data lines, error signal lines, and the like. Each of PCI target devices 112-118 is coupled to the functional signal lines comprising PCI bus 120.
With reference still to Prior Art FIG. 1, PCI targets B 114 and D 118 are 32-bit target devices. That is, PCI targets B 114 and D 118 have addresses that encompass up to 32 bits, resulting in an address range of up to four (4) gigabytes (GB) in a 32-bit memory space. Similarly, PCI targets A 112 and C 116 are 64-bit target devices, having addresses encompassing up to 64 bits, which allow an address range of up to 16 exabytes in a 64-bit memory space. In addition, PCI bus 120 is a 64-bit bus and PCI initiator 110 is a 64-bit device. Such a mix of 32-bit devices and 64-bit devices is common in today's computer systems owing to the extension of the PCI specification to 64-bit devices, and it is required that these devices function together seamlessly as well as conform to the PCI specification. Therefore, to utilize the capabilities of the computer system to their fullest extents, PCI bus architecture 100 must be capable of performing 64-bit transactions between 64-bit devices, and must also support seamless transactions between 64-bit initiator devices and 32-bit target devices.
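By way of illustration only, the address ranges stated above follow directly from the address widths. The short sketch below checks that arithmetic; the function name and the constants are hypothetical and are introduced solely for this example, and binary units (2^30 and 2^60 bytes) are assumed for the gigabyte and exabyte figures.

```python
# Illustrative arithmetic only: size of the memory space reachable
# with a given address width. Names here are hypothetical.
GIB = 1 << 30  # bytes per gigabyte (binary units assumed)
EIB = 1 << 60  # bytes per exabyte (binary units assumed)


def address_space_bytes(address_bits: int) -> int:
    """Number of distinct byte addresses for the given address width."""
    return 1 << address_bits


print(address_space_bytes(32) // GIB)  # 4  (gigabytes, 32-bit memory space)
print(address_space_bytes(64) // EIB)  # 16 (exabytes, 64-bit memory space)
```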
At the time when a 64-bit initiator generates a transaction, it is not aware of the attributes of the target device; that is, it does not know whether the target is a 32-bit device or a 64-bit device. Hence, to ensure compatibility regardless of the respective ranges of the initiator and target devices, in the prior art an assumption is made that the target device is only capable of handling a 32-bit operand. Thus, the prior art technique for transmitting a 64-bit address is to represent the 64-bit address as two 32-bit operands and drive the address over the bus using dual address cycles (also known as dual address commands, DACs), one cycle to transmit each of the 32-bit operands. Because two operands are passed across the PCI bus, two PCI clock cycles are needed to complete a DAC.
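The division of a 64-bit address into two 32-bit operands can be sketched as follows. This is a minimal illustration rather than an implementation of any particular initiator; the helper names are hypothetical, and the sample address is chosen arbitrarily for the example.

```python
# Minimal sketch of representing a 64-bit address as two 32-bit operands.
# Function names are hypothetical and exist only for this illustration.
MASK32 = 0xFFFF_FFFF


def split_into_operands(addr64: int) -> tuple[int, int]:
    """Return the (low, high) 32-bit operands of a 64-bit address."""
    if not 0 <= addr64 < (1 << 64):
        raise ValueError("address does not fit in 64 bits")
    return addr64 & MASK32, addr64 >> 32


def join_operands(low: int, high: int) -> int:
    """Reassemble the 64-bit address from its two 32-bit operands."""
    return (high << 32) | low


low, high = split_into_operands(0x0000_0001_0000_1000)
print(f"{low:08X} {high:08X}")  # 00001000 00000001
```

Each operand occupies one address cycle, which is why the dual address cycle consumes two PCI clock cycles.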
With reference now to Prior Art FIG. 2, timing diagram 200 is provided exemplifying a simplified read transaction using DACs according to the prior art. For simplicity, Prior Art FIG. 2 does not illustrate all of the signals associated with a read transaction, but only shows those signals pertaining to the discussion herein. Timing diagram 200 illustrates a read transaction initiated by a 64-bit initiator device over a PCI bus capable of supporting 64-bit transactions (e.g., PCI initiator 110 and PCI bus 120 of Prior Art FIG. 1).
Continuing with reference to Prior Art FIG. 2, PCI initiator 110 starts the transaction on the rising edge of PCI clock cycle 1 by asserting the FRAME# and REQ64# signals (at points 245 and 250, respectively). Generally, FRAME# is used to indicate the start of a transaction, and REQ64# to indicate that the transaction includes a 64-bit data transfer. These signals are known in the art and are as defined in the PCI specification.
In clock cycle 1, PCI initiator 110 also drives the lower portion of the address (e.g., low address 210) onto AD[31:0] and the upper portion of the address (e.g., high address 220) onto AD[63:32], and it continues to drive high address 220 onto AD[63:32] for the duration of both address phases of the DAC. During clock cycle 2, PCI initiator 110 starts the second address phase of the DAC by driving high address 215 onto AD[31:0]. All devices on the PCI bus latch onto these addresses, and during clock cycle 3 they decode the address. The target named by the address claims the transaction in clock cycle 3 by asserting the DEVSEL# signal (at point 240). On the rising edge of clock cycle 4, for a read transaction, turn-around cycles 225 are inserted in AD[31:0] and AD[63:32]. Data A 230 and data B 232 are then driven onto the bus by the target device or by the initiator device depending on the type of transaction. Thus, in the prior art a 64-bit address is divided into two 32-bit addresses and transmitted via a DAC, even if the target device is a 64-bit device and therefore capable of reading a 64-bit address.
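The two address phases just described can be modeled, in simplified form, as the values present on the address/data lines in clock cycles 1 and 2. The sketch below is an assumption-laden illustration only: the AddressPhase type and both function names are hypothetical, control signals such as FRAME#, REQ64#, and DEVSEL# are not modeled, and the sample address is arbitrary.

```python
# Simplified model of what is driven on the address/data lines during
# the two address phases of a DAC. All names are hypothetical.
from typing import NamedTuple


class AddressPhase(NamedTuple):
    clock: int    # PCI clock cycle number
    ad_low: int   # value on AD[31:0]
    ad_high: int  # value on AD[63:32]


def dac_address_phases(addr64: int) -> list[AddressPhase]:
    """Values on the AD lines in clock cycles 1 and 2 of a DAC."""
    low = addr64 & 0xFFFF_FFFF
    high = addr64 >> 32
    return [
        AddressPhase(1, low, high),   # first phase: low address on AD[31:0]
        AddressPhase(2, high, high),  # second phase: high address on AD[31:0]
    ]


def seen_by_32bit_target(phases: list[AddressPhase]) -> int:
    """A 32-bit target samples only AD[31:0], once per address phase."""
    low, high = (phase.ad_low for phase in phases)
    return (high << 32) | low


phases = dac_address_phases(0x0000_0001_0000_1000)
print(hex(seen_by_32bit_target(phases)))  # 0x100001000
```

In this model a 32-bit target recovers the full address only because the upper portion is repeated on AD[31:0] in the second phase.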
The prior art is problematic because a single address cycle (or single address command, SAC) cannot be used to transmit a 64-bit address as a single 64-bit operand to a 64-bit target device in the presence of a 32-bit target device, even if the 64-bit initiator knows that the target device is a 64-bit device. In accordance with the PCI specification, when an initiator device initiates a transaction, it drives onto the PCI bus the address of the target device with which the initiator device is seeking to perform the transaction. At this stage of the transaction, all target devices on the PCI bus latch onto the address, and then each target device decodes the address to determine whether or not it is the intended target named by the address. Hence, if a 64-bit address is transmitted over the PCI bus in a single address cycle, each 32-bit target device on the bus, as well as each 64-bit target device, latches onto the address. However, the 32-bit targets will only be capable of reading a portion of the address (namely, the lower 32 bits of the address), because these devices do not have access to the upper 32 bits of the address. In the likely case in which the lower half of a 64-bit address matches the 32-bit address of a 32-bit device, that 32-bit device will erroneously assert a claim to the transaction. In the meantime, the 64-bit device that is the intended recipient of the address will also assert a claim to the transaction after it decodes and recognizes its address, so that two devices will have asserted a claim to the same transaction.
Consider as an example a 32-bit target that is mapped into addresses 0000 0000h to 0000 FFFFh in a 32-bit memory space. A 64-bit initiator then specifies an address of 0000 0001 0000 1000h for a 64-bit target mapped into a 64-bit memory space. The 32-bit target latches onto the address but is only capable of reading the lower portion of the address, specifically the portion 0000 1000h, which, from the perspective of the 32-bit target, appears to fall within the range of addresses into which the 32-bit target device is mapped. Hence, the 32-bit target responds, as does the 64-bit target. This type of error is known as address aliasing. Address aliasing causes other types of errors to occur, such as incorrect data being sent, bus contention due to multiple and simultaneous drivers, and the like. Thus, the prior art is problematic because it does not allow a SAC to be used for a 64-bit address intended for a 64-bit target device because of address aliasing. In the prior art, if a SAC is used for a 64-bit address, then address aliasing will cause a 32-bit target to respond in error.
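The aliasing in the foregoing example can be reproduced with a few lines of arithmetic. The sketch below uses the address values given above; the function and constant names are hypothetical, and the decode logic is a simplified stand-in for a 32-bit target that sees only AD[31:0] during a single address cycle.

```python
# Sketch of address aliasing when a 64-bit address is driven in a single
# address cycle. Names are hypothetical; values come from the example.
TARGET32_BASE = 0x0000_0000   # start of the 32-bit target's mapped range
TARGET32_LIMIT = 0x0000_FFFF  # end of the 32-bit target's mapped range


def claims_as_32bit_target(bus_address: int, base: int, limit: int) -> bool:
    """Decode using only the lower 32 bits, as a 32-bit target would."""
    visible = bus_address & 0xFFFF_FFFF  # AD[63:32] is not accessible
    return base <= visible <= limit


address = 0x0000_0001_0000_1000  # intended for the 64-bit target
aliased = claims_as_32bit_target(address, TARGET32_BASE, TARGET32_LIMIT)
print(aliased)  # True: the 32-bit target erroneously claims the transaction
```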
As can be seen from Prior Art FIG. 2, two clock cycles are needed to transmit a 64-bit address to allow a 32-bit target to read both phases of the address on AD[31:0] (a 32-bit target does not have access to AD[63:32]). Thus, another disadvantage to the prior art is that two clock cycles are used to transmit a 64-bit address when, for the case in which the intended recipient is a 64-bit target device, one clock cycle would be satisfactory. Therefore, in the prior art, data transfer subsequent to the address phase is delayed by one clock cycle. In addition, during the transaction, the PCI initiator requires ownership of the PCI bus, and thus the PCI bus is not available for other transactions. Thus, in the prior art, other transactions are also delayed because a portion of the computer system's data transfer bandwidth is consumed by the unnecessary clock cycle. This disadvantage is especially significant when multiplied by the number of transactions that occur on the PCI bus.
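As a rough illustration of how the one-cycle penalty accumulates, the sketch below multiplies the extra address cycle by a number of transactions. The transaction count and the 33 MHz clock rate are assumptions chosen only for this example, and the names are hypothetical; the figures are not measurements of any system.

```python
# Back-of-the-envelope estimate of the clock cycles consumed by using a
# DAC where a SAC would suffice. All inputs are illustrative assumptions.
DAC_ADDRESS_CLOCKS = 2  # two address phases per dual address cycle
SAC_ADDRESS_CLOCKS = 1  # one address phase per single address cycle


def extra_clocks(num_transactions: int) -> int:
    """Additional PCI clock cycles spent on address phases."""
    return num_transactions * (DAC_ADDRESS_CLOCKS - SAC_ADDRESS_CLOCKS)


def extra_seconds(num_transactions: int, clock_hz: float) -> float:
    """Bus time occupied by those additional clock cycles."""
    return extra_clocks(num_transactions) / clock_hz


# Assumed workload: one million transactions on an assumed 33 MHz bus.
print(extra_clocks(1_000_000))
print(extra_seconds(1_000_000, 33_000_000))
```

Under these assumed figures the additional cycles occupy the bus for roughly 30 milliseconds, during which no other transaction can proceed.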
Accordingly, what is needed is a method and/or system which optimally utilizes the data transfer bandwidth of a computer system by eliminating DACs and hence the unnecessary expenditure of clock cycles associated with DACs. What is also needed is a method and/or system that addresses the above need and does not cause address aliasing and errors associated with address aliasing when SACs are used. The present invention provides a novel solution to the above needs.
These and other objects and advantages of the present invention will become obvious to those of ordinary skill in the art after having read the following detailed description of the preferred embodiments which are illustrated in the various drawing figures.