(1) Field of the Invention
The present invention relates in general to the field of computers, and more particularly to a computing system and a method for fast access to peer-to-peer cycles in Peripheral Component Interconnect Express (PCI-Express).
(2) Description of the Prior Art
A computer, or a computing system, is a type of data processing system. Examples of the computer include a server, a workstation, a desktop computer, a notebook computer, a laptop computer, and a hand-held device. Typically, a computer system includes a microprocessor and memory.
The computing system may also include peripheral devices, such as a keyboard, a mouse and disk drives, that connect to the computer via input/output (I/O) ports. The I/O ports allow the peripheral devices to communicate with the processor through a bus such as a Peripheral Component Interconnect (PCI) bus. In general, the bus can be either a parallel or a serial interface for connecting the peripheral devices to the computer system.
As consumers demand faster data processing speed and performance, some innovative devices have exceeded the capabilities of current bus architectures such as the conventional PCI bus. The innovative devices include high performance graphics cards, high speed memory, high speed microprocessors, high bandwidth networking, and other high speed devices. These innovative devices have created a need for interconnections with higher performance and greater bandwidth. In order to meet this need, a new interconnection architecture, commonly referred to as PCI Express (or PCI-E) architecture, has been developed to provide the high speed interconnection and peer-to-peer access capability.
PCI-Express is a general purpose input/output (I/O) serial interconnection that provides a high performance interconnection for attaching devices such as high performance graphics cards, universal serial bus (USB) ports, networking and other such devices. Because the PCI Express architecture may connect to several different types of devices, the architecture provides a standard for communications in order to consolidate these devices on a single interconnection.
FIG. 1 is a block diagram of a prior computing system 10 employing the PCI-Express architecture. The computing system 10 includes a microprocessor 12, a chipset 14 and a plurality of PCI-E ports 16. The chipset 14 includes a port arbiter 141 and a Downstream Address Range Decoding logic (DARD logic for short) 143. In the prior art of FIG. 1, the upstream requests from the PCI-E ports 16 are sent to the microprocessor 12 regardless of whether the access is an “onboard access” or a “peer-to-peer access”. Here, “onboard access” means an access that is processed by the microprocessor 12, and “peer-to-peer access” means an access between two PCI-E ports 16, which requires no processing by the microprocessor 12.
The peer-to-peer access does not itself require any processing by the microprocessor 12. However, the chipset 14 decodes neither the onboard address range nor the PCI-E root port memory range for upstream requests. The upstream requests of peer-to-peer access are therefore sent to the microprocessor 12 by the port arbiter 141; the microprocessor 12 then redirects the requests and issues the corresponding downstream cycles to the DARD logic 143, and from there to the designated PCI-E port 16. As a result, the long peer-to-peer access path, PCI-E port 16 → chipset 14 → microprocessor 12 → chipset 14 → another PCI-E port 16, induces long access latency and thus makes some isochronous applications, such as a dual-engine graphics card, infeasible.
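For illustration only, the FIG. 1 access path described above can be modeled as a short Python sketch. The port memory ranges, the function names `dard_decode` and `route_upstream_fig1`, and the hop labels are hypothetical and are not taken from the prior-art system; the sketch merely shows that every upstream request reaches the microprocessor before any downstream decoding occurs.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class PortRange:
    port: int   # PCI-E port number (hypothetical numbering)
    base: int   # first address of the port's memory range
    limit: int  # last address of the port's memory range (inclusive)


# Hypothetical memory ranges assigned to two PCI-E ports.
PORT_RANGES = [
    PortRange(port=0, base=0xD000_0000, limit=0xD7FF_FFFF),
    PortRange(port=1, base=0xD800_0000, limit=0xDFFF_FFFF),
]


def dard_decode(addr: int) -> Optional[int]:
    """Downstream address range decoding: return the port owning addr, if any."""
    for r in PORT_RANGES:
        if r.base <= addr <= r.limit:
            return r.port
    return None


def route_upstream_fig1(src_port: int, addr: int) -> List[str]:
    """Return the hops taken by an upstream request in the FIG. 1 scheme."""
    # The chipset does not decode upstream addresses, so every request,
    # onboard or peer-to-peer, is forwarded to the microprocessor first.
    path = [f"PCI-E port {src_port}", "chipset (port arbiter)", "microprocessor"]
    # Only the downstream cycle issued by the microprocessor is decoded.
    target = dard_decode(addr)
    if target is not None:
        path += ["chipset (DARD logic)", f"PCI-E port {target}"]
    return path
```

Under these assumed ranges, a request from port 0 to address `0xD8001000` traverses five hops and passes through the chipset twice, which is the long path that the text identifies as the source of latency.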
FIG. 2 is a block diagram of another prior computing system 20 employing the PCI-Express architecture. The computing system 20 includes a microprocessor 22, a chipset 24 and a plurality of PCI-E ports 26. The chipset 24 includes a port arbiter 241, a DARD logic 243, an Upstream Onboard Range Decoding logic (UORD logic for short) 245 and a downstream arbiter 247. In this design, the computing system 20 uses the UORD logic 245 to distinguish the onboard access from the peer-to-peer access. The peer-to-peer access is arbitrated by the downstream arbiter 247 and then sent to a specified device (on the specified PCI-E port 26) according to the device range decoding result of the DARD logic 243.
The advantage of this design compared to the previous scheme shown in FIG. 1 is that the peer-to-peer access path is shortened. As shown in FIG. 2, the peer-to-peer access path is not routed through the microprocessor 22.
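As a minimal sketch of the FIG. 2 scheme, the following Python model routes an upstream request through an upstream onboard range check and, for peer-to-peer accesses, through the downstream arbiter and the downstream address decode. The onboard range, the port ranges, the function names and the hop labels are all assumptions made for illustration. Raising an error for an address that matches no port is also an assumption, since the text does not describe that case.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical onboard (system memory) range handled by the microprocessor.
ONBOARD_RANGE: Tuple[int, int] = (0x0000_0000, 0x7FFF_FFFF)

# Hypothetical memory ranges of two PCI-E ports: port -> (base, limit inclusive).
PORT_RANGES: Dict[int, Tuple[int, int]] = {
    0: (0xD000_0000, 0xD7FF_FFFF),
    1: (0xD800_0000, 0xDFFF_FFFF),
}


def uord_is_onboard(addr: int) -> bool:
    """Upstream onboard range decoding: True if addr is an onboard access."""
    base, limit = ONBOARD_RANGE
    return base <= addr <= limit


def dard_decode(addr: int) -> Optional[int]:
    """Downstream address range decoding: return the port owning addr, if any."""
    for port, (base, limit) in PORT_RANGES.items():
        if base <= addr <= limit:
            return port
    return None


def route_upstream_fig2(src_port: int, addr: int) -> List[str]:
    """Return the hops taken by an upstream request in the FIG. 2 scheme."""
    path = [f"PCI-E port {src_port}", "port arbiter", "UORD logic"]
    if uord_is_onboard(addr):
        # Onboard accesses still go to the microprocessor.
        return path + ["microprocessor"]
    # Peer-to-peer accesses stay inside the chipset.
    path += ["downstream arbiter", "DARD logic"]
    target = dard_decode(addr)
    if target is None:
        # Assumption: unclaimed addresses are treated as an error here.
        raise ValueError(f"address {addr:#x} matches no PCI-E port range")
    return path + [f"PCI-E port {target}"]
```

With these assumed ranges, a request from port 0 to `0xD8001000` never reaches the microprocessor, but it still passes through two separate decoding stages (UORD and DARD) inside the chipset.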
However, this peer-to-peer access scheme is mainly designed for legacy devices, for example one PCI device (PCI 1) accessing another PCI device (PCI 2). Therefore, the data buffer size and access length are usually small and limited, which increases access latency and may not meet the requirements of some graphics applications, such as a dual-engine graphics card that requires isochronous access.
In addition, the two address decoding logics within the peer-to-peer access path, namely the upstream onboard range decoding logic 245 and the downstream address range decoding logic 243, further worsen the access latency.