The bus architecture of a computer system conveys much of the information and signals involved in the computer system's operation. In a typical computer system, one or more buses are used to connect a central processing unit (CPU) to a memory and to input/output devices so that data and control signals can be readily transmitted between these different components. When the computer system executes its programming, data and information should flow as fast as possible in order to make the computer system as responsive as possible to the user. With many peripheral devices and subsystems, such as graphics adapters, full motion video adapters, small computer system interface (SCSI) host bus adapters, and the like, it is imperative that large block data transfers be accomplished expeditiously. These are just some examples of peripheral devices and subsystems that benefit substantially from a very fast bus transfer rate.
Much of the computer system's functionality and usefulness to a user is derived from the functionality of the peripheral devices. For example, the speed and responsiveness of the graphics adapter are major factors in a computer system's usefulness as an entertainment device. Similarly, the speed with which video files can be retrieved from a hard drive and played by the graphics adapter determines the computer system's usefulness as a training aid. Hence, the rate at which data can be transferred among the various peripheral devices often determines whether the computer system is suited for a particular purpose.
The electronics industry has, over time, developed several types of bus architectures. The PCI (peripheral component interconnect) bus architecture has become one of the most widely used and widely supported bus architectures in the industry. The PCI bus was developed to provide a high speed, low latency bus architecture from which a large variety of systems could be developed.
A PCI specification is used to establish standards to facilitate uniformity and compatibility of PCI devices operating in a PCI bus architecture. Initially, the PCI specification addressed only the use of 32-bit devices and 32-bit transactions, but the specification has since been extended to 64-bit devices and 64-bit transactions. Hence, a typical PCI bus system can include both 64-bit and 32-bit devices.
Prior Art FIG. 1 shows a simplified exemplary PCI bus architecture 100 implemented, for example, in a computer system. PCI bus 120 is coupled to PCI initiator 110. PCI bus 120 is also coupled to each of PCI target devices 112 and 114. PCI target 112 is a 32-bit target device and PCI target 114 is a 64-bit target device. In addition, PCI bus 120 is a 64-bit bus and PCI initiator 110 is a 64-bit device.
PCI initiator 110 can be integrated into bus bridge 130, as shown, and bus bridge 130 in turn is used to couple PCI bus 120 to a host bus (not shown). Bus bridge 130 is typically a bi-directional bridge and is made up of numerous components; for simplicity, bus bridge 130 is shown as comprising only PCI initiator 110.
PCI bus 120 comprises functional signal lines such as, for example, interface control lines, address/data lines, error signal lines, and the like. Each of PCI target devices 112 and 114 is coupled to the functional signal lines of PCI bus 120. The functional signal lines provide a data path between PCI initiator 110 and the PCI target devices.
With reference now to Prior Art FIG. 2, timing diagram 200 is provided exemplifying a simplified 64-bit transaction according to the prior art. Timing diagram 200 illustrates a 64-bit transaction between a 64-bit initiator device and a 32-bit target device over a PCI bus capable of supporting 64-bit transactions (e.g., PCI initiator 110, PCI target 112 and PCI bus 120 of Prior Art FIG. 1).
Referring still to Prior Art FIG. 2, a host device (not shown) requests a 64-bit write data transfer to PCI target 112; that is, PCI initiator 110 is to forward 64 bits (eight bytes) of data to PCI target 112. The 64 bits of data are formatted as two 32-bit operands, referred to as data-1 and data-2 in timing diagram 200.
Associated with each byte of data is a command/byte-enable (hereinafter, byte-enable, BE). Byte-enables are driven by PCI initiator 110 and read by PCI target 112. BE[3:0] corresponds to the lower 32-bit portion of the data (e.g., data-1) and BE[7:4] corresponds to the upper 32-bit portion of the data (e.g., data-2). Byte-enables are known in the art and are used to indicate the bytes to be transferred and the data path to be used to transfer the data. When the value of a byte-enable bit is equal to one, the byte-enable is said to be disabled and the data byte corresponding to that byte-enable is discarded.
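The byte-enable convention described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that BE[7:0] is packed into a single integer in which bit n holds BE[n]; the function names split_byte_enables and enabled_lanes are hypothetical and do not come from the PCI specification.

```python
# Illustrative model only: BE[7:0] is packed into one integer, bit n = BE[n].
# Per the description above, a bit value of 1 means the byte-enable is
# disabled and the corresponding data byte is discarded.

def split_byte_enables(be):
    """Return (BE[3:0], BE[7:4]) from an 8-bit byte-enable value."""
    lower = be & 0xF          # BE[3:0], associated with data-1 on AD[31:0]
    upper = (be >> 4) & 0xF   # BE[7:4], associated with data-2 on AD[63:32]
    return lower, upper


def enabled_lanes(be):
    """List the byte lanes (0 through 7) whose byte-enable bit is 0 (enabled)."""
    return [lane for lane in range(8) if ((be >> lane) & 1) == 0]
```

For example, with be equal to 0b11110000, split_byte_enables returns (0, 15) and enabled_lanes returns [0, 1, 2, 3]: all four bytes of data-1 are enabled and all four bytes of data-2 are disabled.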
Continuing with reference to Prior Art FIG. 2, PCI initiator 110 starts the 64-bit transaction during PCI clock cycle 1 by asserting the FRAME# and REQ64# signals. Generally, FRAME# is used to indicate the start of a transaction, and REQ64# to indicate that the transaction includes a 64-bit data transfer. These signals are known in the art and are as defined in the PCI specification.
In clock cycle 1, PCI initiator 110 also drives the address onto AD[31:0] (address/data bits 0 through 31). All devices on the PCI bus latch this address, and during clock cycle 2 they decode the address. The target named by the address (e.g., PCI target 112) claims the transaction in clock cycle 2 by asserting the DEVSEL# signal. In this case, the target device is a 32-bit device and so ACK64# is not asserted (assertion of ACK64# is used to indicate that the target device is a 64-bit device). However, at this point in the process, PCI initiator 110 does not know that the target device is a 32-bit device, and so in clock cycle 2 PCI initiator 110 drives data-1 onto the bus in AD[31:0] and data-2 onto the bus in AD[63:32] (address/data bits 32 through 63). PCI initiator 110 samples ACK64# deasserted and so recognizes that PCI target 112 is a 32-bit device. Accordingly, in clock cycle 3, PCI initiator 110 retransmits data-2 over AD[31:0].
In some data transactions, all of the byte-enables associated with the upper portion of the 64 bits of data are disabled; that is, in some transactions, BE[7:4] is equal to 1111. In the prior art, data-2 is still transmitted to PCI target 112 when BE[7:4] are all disabled; however, PCI target 112 disregards data-2 when that is the case. This is referred to as a null data-phase transfer.
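The clock-by-clock sequence just described can be traced with a short sketch. This is a simplified model under stated assumptions, not a bus-accurate simulation: it ignores wait states and any signals not named in the text, and the function simulate_64bit_write and its dictionary keys are hypothetical names chosen for illustration.

```python
def simulate_64bit_write(address, data1, data2, target_is_64bit):
    """Return one dict per PCI clock cycle of the simplified write described above.

    Each dict records the clock number, what is driven on AD[31:0] ("ad_low")
    and AD[63:32] ("ad_high"), and the signals the text names for that clock.
    """
    trace = [
        # Clock 1: address phase; the initiator asserts FRAME# and REQ64#.
        {"clock": 1, "ad_low": address, "ad_high": None,
         "signals": ["FRAME#", "REQ64#"]},
    ]

    # Clock 2: the target claims the transaction with DEVSEL#; only a 64-bit
    # target also asserts ACK64#. The initiator does not yet know the target
    # width, so it drives data-1 and data-2 on both halves of the bus.
    claimed = ["DEVSEL#", "ACK64#"] if target_is_64bit else ["DEVSEL#"]
    trace.append({"clock": 2, "ad_low": data1, "ad_high": data2,
                  "signals": claimed})

    if not target_is_64bit:
        # Clock 3: ACK64# was sampled deasserted, so the initiator
        # retransmits data-2 on AD[31:0] for the 32-bit target.
        trace.append({"clock": 3, "ad_low": data2, "ad_high": None,
                      "signals": []})
    return trace
```

In this model, a write to a 32-bit target occupies three clock cycles, while the same write to a 64-bit target occupies two, because the third cycle exists only to resend data-2 on the lower address/data lines.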
As seen in timing diagram 200, a clock cycle is needed to transmit data-2 even though the data contained therein are discarded by the PCI target. Therefore, the prior art is problematic because a clock cycle is unnecessarily consumed by the null data-phase transfer, and any subsequent activities associated with the present transaction are also delayed by one clock cycle. In addition, during the present transaction, PCI initiator 110 requires ownership of PCI bus 120, and thus the PCI bus is not available for other transactions. Other transactions are therefore also delayed, because a portion of the computer system's data transfer bandwidth is consumed by the unnecessary clock cycle, and the system's available bandwidth is not optimally utilized. These disadvantages are especially significant when multiplied by the number of transactions that occur on the PCI bus.
Another disadvantage of the prior art is that power is expended due to the null data-phase transfer from a 64-bit initiator to a 32-bit target. As shown in timing diagram 200, PCI initiator 110 asserts REQ64#, drives signals for BE[7:4], and drives data onto AD[63:32]; however, these actions prove to be unnecessary because in actuality only 32 bits of data are transferred over AD[31:0] and BE[7:4] are all disabled. Therefore, the prior art is problematic because the unnecessary consumption of power is contrary to the goal of low power consumption called for in the PCI specification. In addition, laptop computer systems and the like are frequently powered by batteries, and thus another disadvantage of the prior art is that the computer system's battery may need to be recharged more frequently, thereby inconveniencing the user and perhaps shortening battery life.
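To make the cumulative cost concrete, the following sketch counts the clock cycles consumed by null data-phase transfers across a series of transactions. It assumes, for illustration only, that every transaction in the list is a 64-bit write to a 32-bit target, that each byte-enable value packs BE[7:0] into one integer with bit n holding BE[n], and that each null data phase costs exactly one clock cycle as in timing diagram 200. The names is_null_upper_phase and wasted_clocks are hypothetical.

```python
def is_null_upper_phase(be):
    """True when BE[7:4] is 1111, i.e. every upper byte-enable is disabled."""
    return ((be >> 4) & 0xF) == 0xF


def wasted_clocks(byte_enables):
    """Count clock cycles spent on null data phases, one per affected transaction.

    Assumes each entry describes a 64-bit write to a 32-bit target, so a
    transaction whose BE[7:4] is all disabled still resends data-2.
    """
    return sum(1 for be in byte_enables if is_null_upper_phase(be))
```

For instance, wasted_clocks([0xF0, 0x00, 0xF3, 0x0F]) returns 2, because only the first and third byte-enable values have BE[7:4] equal to 1111.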
Accordingly, what is needed is a method and/or system that eliminates or reduces the occurrence of the null data-phase transfer as described above, in order to more effectively utilize the computer system's data transfer bandwidth. What is further needed is a system and/or method that addresses the above need while reducing power consumption. The present invention provides a novel solution to the above needs.
These and other objects and advantages of the present invention will become obvious to those of ordinary skill in the art after having read the following detailed description of the preferred embodiments which are illustrated in the various drawing figures.