The present invention relates generally to bus arbitration and more specifically to a custom PCI arbiter.
Limitations in the standard PCI (Peripheral Component Interconnect) arbitration scheme, as well as in standard microprocessor PCI arbitration implementations, restrict the guaranteed latency with which a specific data packet can be transferred across the PCI bus, for example between host and radio MAC (Media Access Control) devices. This non-optimal data transfer latency limits performance (multi-radio simultaneous burst throughput) as well as features (power-save, multicast) in an enterprise class access point system. Additionally, system cost is increased because radio MAC packet storage space must be large when no mechanism is available to fetch data packets from host DRAM in a timely manner.
While the raw throughput of the PCI bus may be sufficient to support such applications, a system with a standard PCI arbiter employing standard (client class) radio chipsets and a standard microprocessor cannot effectively prioritize high-priority DMA-based data transfers over data transfers that have no specific latency requirements. The PCI specification only provides a mechanism to grant bus ownership based on the device which is requesting the bus; it does not allow bus ownership to be assigned based on the type of data which needs to be transferred. This is a fundamental limitation of the PCI arbitration scheme as defined by the PCI specification.
The PCI specification has fundamental limitations with regard to arbitration for PCI bus mastering ownership. For example, the PCI specification has no mechanism for prioritizing bus control (bus master) to devices based upon classification of the data type which needs to be transferred; prioritization can only be done on a per-device basis. Furthermore, the PCI specification has no mechanism for optimizing bus-master switching overhead by being aware of the latency requirements of individual data types; rather, all bus allocation algorithms are performed only with a general knowledge of the requesting device and without specific knowledge of the data being transferred.
Moreover, standard PCI arbiter designs common to integrated MPU designs have other limitations. For example, the standard PCI arbiter has no ability to prioritize a higher-priority data transfer with specific latency requirements over a data transfer with less stringent timing requirements; rather, the standard PCI arbiter employs a fair or weighted round-robin arbitration scheme providing a static allocation of bus bandwidth to each device. A round-robin time-sliced arbitration scheme does not optimize bus efficiency, because bus cycles are wasted switching between devices unnecessarily and look-ahead pre-fetches of data are wasted when the bus master is switched.
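The behavior described above can be illustrated with a minimal sketch. It is a simplified software model of a fair round-robin arbiter, not an implementation of any actual PCI arbiter or of the PCI signaling protocol. The function names, the one-transaction-per-grant time slice, and the "urgent" flag are all assumptions introduced for illustration only. The point of the model is that the arbiter rotates by requesting device and never inspects the type of data being transferred.

```python
from collections import deque


def round_robin_grants(requests, num_devices):
    """Model a fair round-robin arbiter (hypothetical, simplified).

    requests maps a device id to the ordered list of (label, urgent)
    transactions that device has queued. One transaction is granted per
    turn. The 'urgent' flag is carried along but never consulted, which
    mirrors per-device (not per-data-type) arbitration.
    """
    queues = {dev: deque(txns) for dev, txns in requests.items()}
    grants = []
    dev = 0
    while any(queues.values()):
        queue = queues.get(dev)
        if queue:
            label, urgent = queue.popleft()
            grants.append((dev, label, urgent))
        dev = (dev + 1) % num_devices  # rotate regardless of data type
    return grants


def count_master_switches(grants):
    """Count how often bus ownership changes between consecutive grants."""
    return sum(1 for a, b in zip(grants, grants[1:]) if a[0] != b[0])


def grants_before_first_urgent(grants):
    """Return how many grants precede the first urgent transaction."""
    for position, (_, _, urgent) in enumerate(grants):
        if urgent:
            return position
    return None


# Hypothetical workload: device 1 queues one latency-sensitive transfer
# behind a bulk transfer, while devices 0 and 2 carry only bulk traffic.
example_requests = {
    0: [("bulk0", False), ("bulk1", False)],
    1: [("bulk2", False), ("urgent0", True)],
    2: [("bulk3", False)],
}
example_grants = round_robin_grants(example_requests, num_devices=3)
```

In this hypothetical workload the urgent transfer is granted last, after four bulk transfers, and bus ownership changes on every grant. Both effects follow from rotating by device alone, which is the limitation the preceding paragraph describes.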
These limitations in the PCI arbitration scheme can impact the performance of devices using PCI systems. For example, an 802.11 access point has limited radio MAC memory space; thus, a low-latency transfer of certain data types from host memory is necessary to meet the performance and feature-set requirements of an enterprise class access point system. Standard PCI arbitration, as defined by the PCI specification and implemented in standard universal MPU designs, does not allow the bus-transfer latency requirements to be met. Moreover, this problem is magnified as the number of radios per system is increased.