1. Field of the Invention
The present invention relates generally to computer input/output subsystems, and more particularly, to arbitration of peripheral component interconnect (PCI) bus based functions of multi-function add-on cards.
2. Description of the Related Art
A peripheral component interconnect (PCI) bus is commonly utilized in conventional computer systems. A PCI bus can couple various devices with other components of the computer system. For instance, the PCI bus may connect devices such as video cards or network cards to the processor memory of the computer system. The PCI bus is a standard in the computer industry and typically allows "plug-and-play" such that the PCI device coupled to the PCI bus configures itself without user intervention.
Typically, when a PCI device function requests data from another component of the computer system, for example from memory, additional data will be cached in anticipation of the next request, based on the assumption that the next request will be for the next sequential block of data. For instance, one cache line (typically 32 bytes for Intel machines), in addition to what was requested, can be retrieved and cached in anticipation of the next request.
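The read-ahead behavior described above can be sketched as follows. This is a minimal illustration, not the bridge's actual logic: the function name `read_with_prefetch` and the flat `bytes` model of memory are assumptions made for the example, and only the 32-byte cache line size comes from the text.

```python
CACHE_LINE_BYTES = 32  # cache line size cited in the text for Intel machines


def read_with_prefetch(memory: bytes, address: int, length: int):
    """Return (requested, prefetched) for a read of `length` bytes at `address`.

    Hypothetical model: the requested bytes are returned to the function,
    and one further cache line is fetched and held for the next request.
    """
    end = address + length
    requested = memory[address:end]
    # Speculatively fetch the cache line that immediately follows the request.
    prefetched = memory[end:end + CACHE_LINE_BYTES]
    return requested, prefetched


# Example: a 64-byte read at address 0 also pulls in bytes 64..95.
memory = bytes(range(256))
requested, prefetched = read_with_prefetch(memory, 0, 64)
```

The prefetched line is only useful if the next request starts at the address where the previous request ended, which is the sequential-access assumption stated above.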
PCI devices can be multi-functional such that a single device can perform several functions. Most multi-function devices based on PCI local bus architecture use an internal arbiter for determining which device function will be allowed to use the PCI bus at a given time. For example, a PCI device can be a multi-function PCI device which includes more than one master device, for example, two Small Computer System Interface (SCSI) cards. Functions within these master devices may substantially simultaneously request use of the PCI bus. The arbiter determines which function will obtain or retain the PCI bus. Arbitration has significant bearing on the system latency and performance since it decides which function will obtain the bus and how long that function will hold the bus.
There are typically two parameters to an arbitration scheme: throughput and latency. Throughput describes the amount of data which is transferred from the PCI device to the memory per unit of time. Latency is typically the average time a requesting function waits to obtain the PCI bus. It is desired to have a high throughput and a low latency.
A common method for determining which function will access the PCI bus is a method typically referred to as "round robin". The round robin method alternates priority of the functions. For example, if function zero obtains use of the PCI bus, then the next time function zero and function one substantially simultaneously request the PCI bus, function one will have priority over function zero, and function one will obtain use of the PCI bus. However, using the round robin method with a PCI local bus architecture wastes the bandwidth of the data retrieval. The additional data which was cached is wasted since the first request was made by a first function while the next request was made by another function. The request made by the other function will most likely be for data at a completely different address from that requested by the first function. Accordingly, the additional data which was cached in anticipation of the next data request typically only applies to the first function which initially requested the data. For example, if function zero has obtained the use of the PCI bus and it has requested data near and up to address 1000, then the data retrieved would be the data up to address 1000 plus the next cache line (typically 32 bytes) after address 1000. The data up to address 1000 will be sent to function zero which requested it, while the extra cache line which was retrieved will be cached in anticipation of the next request from function zero, which will most likely be for the next cache line. However, in a round robin scheme, if both function zero and function one request the PCI bus approximately simultaneously, then function one will obtain the use of the PCI bus, assuming function zero had priority the last time. Most likely, function one will request data from a completely different address, such as address 10,000. In this case, the data which has been cached in anticipation of function zero's next request has been wasted.
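The wasted read-ahead under round robin can be illustrated with a small simulation. This is a sketch under stated assumptions: the class `RoundRobinArbiter`, the function `count_wasted_prefetches`, and the single-entry prefetch buffer are hypothetical constructs for the example; the starting addresses 1000 and 10,000 and the 32-byte cache line are taken from the text.

```python
CACHE_LINE_BYTES = 32  # cache line size cited in the text


class RoundRobinArbiter:
    """Hypothetical arbiter: the most recent winner gets the lowest priority."""

    def __init__(self, num_functions=2):
        self.num_functions = num_functions
        self.last_winner = num_functions - 1  # so function 0 wins first

    def grant(self, requesters):
        # Scan cyclically, starting just after the previous winner.
        for offset in range(1, self.num_functions + 1):
            candidate = (self.last_winner + offset) % self.num_functions
            if candidate in requesters:
                self.last_winner = candidate
                return candidate
        return None


def count_wasted_prefetches(start_addresses, bursts, burst_bytes=CACHE_LINE_BYTES):
    """Count prefetched lines that the next bus owner does not use.

    Assumes functions are numbered 0..n-1, every function requests the bus
    on every burst, and each function reads sequentially from its own address.
    """
    arbiter = RoundRobinArbiter(len(start_addresses))
    next_address = dict(start_addresses)
    prefetched_at = None  # address of the line currently held in the buffer
    wasted = 0
    for _ in range(bursts):
        winner = arbiter.grant(set(next_address))
        address = next_address[winner]
        if prefetched_at is not None and prefetched_at != address:
            wasted += 1  # the buffered line belonged to another function
        prefetched_at = address + burst_bytes
        next_address[winner] = address + burst_bytes
    return wasted


wasted_two_functions = count_wasted_prefetches({0: 1000, 1: 10000}, bursts=6)
wasted_one_function = count_wasted_prefetches({0: 1000}, bursts=6)
```

In this model, when two functions contend on every burst, each grant after the first discards the line prefetched for the previous owner, whereas a single function reading sequentially uses every prefetched line.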
One possible solution to avoid wasting the additional cached data in the PCI Bridge would be to allow function zero to finish the entire transaction which it has requested prior to transferring the use of the PCI bus to function one. However, if function zero is allowed to finish the entire transfer, latency will rise significantly since function one must wait until function zero finishes its entire transfer. The increase in latency could be as much as one millisecond.
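A rough calculation shows how a run-to-completion policy can produce a wait on the order of one millisecond. The bus parameters below (33 MHz clock, 32-bit data path) and the 128 KiB transfer size are assumptions chosen for illustration and are not stated in the text; the result is a best-case figure that ignores protocol overhead.

```python
PCI_CLOCK_HZ = 33_000_000  # assumed: 33 MHz PCI clock
BUS_WIDTH_BYTES = 4        # assumed: 32-bit data path


def blocking_latency_seconds(transfer_bytes: int) -> float:
    """Best-case time another function waits while a transfer runs to completion."""
    peak_bytes_per_second = PCI_CLOCK_HZ * BUS_WIDTH_BYTES
    return transfer_bytes / peak_bytes_per_second


# Hypothetical 128 KiB transfer by function zero; function one waits this long.
wait = blocking_latency_seconds(128 * 1024)
print(f"{wait * 1e3:.3f} ms")
```

Under these assumed parameters the wait comes out just under one millisecond, which is consistent in magnitude with the figure given above.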
Another possible solution is to allow one of the functions to have a fixed priority such that the priority function obtains the PCI bus every time that particular function requests the bus. Fixed priority reduces latency for the higher priority devices. However, fixed priority causes the latency of the non-priority function to be very high and can cause the throughput of the non-priority function to be very low.
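The latency cost of fixed priority for the non-priority function can be sketched as follows. The helper names `fixed_priority_grants` and `cycles_waited`, and the cycle-by-cycle request model, are hypothetical and serve only to illustrate the behavior described above.

```python
def fixed_priority_grants(requests_per_cycle, priority_order=(0, 1)):
    """Grant the bus each cycle to the highest-priority requesting function.

    `priority_order` lists function numbers from highest to lowest priority
    (an assumption for this example: function 0 is the priority function).
    """
    grants = []
    for requesters in requests_per_cycle:
        winner = next((f for f in priority_order if f in requesters), None)
        grants.append(winner)
    return grants


def cycles_waited(grants, function):
    """Number of cycles that pass before `function` is first granted the bus."""
    for cycle, winner in enumerate(grants):
        if winner == function:
            return cycle
    return None  # never granted within the observed window


# Both functions request for four cycles; then only function 1 requests.
contended = [{0, 1}] * 4 + [{1}]
grants = fixed_priority_grants(contended)
```

In this example function one is granted the bus only after function zero stops requesting, so its wait grows with the length of function zero's activity while function zero never waits.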
What is needed is a system and method which balances three goals: a high throughput, a low latency, and efficient use of the additional data which has been cached. The present invention addresses such a need.