1. Field of the Invention
The disclosure relates to a method and apparatus for preventing bus livelock due to excessive MMIO commands. More specifically, the disclosure relates to a method and apparatus for identifying aggressive processors issuing excessive snoop commands over a given window of time.
2. Description of Related Art
In today's symmetric multi-processor (“SMP”) computers, multiple device driver threads can be executing on multiple cores concurrently. The threads can be logically contained within a single logical partition or spread across multiple logical partitions within the same physical computer. These types of computers have multiple cores, memory, and IO bridge controllers which are all connected to each other with a high-speed SMP system bus. Various IO devices, such as disks, graphics adapters and network adapters, will be populated behind a single IO controller using various stages of IO hubs and bridges.
The device driver software threads will typically communicate with an IO device by executing cache-inhibited memory mapped IO (“MMIO”) load and store instructions in order to service the adapter. The MMIO load/store instructions will be issued on the system bus by an MMIO bus master, and the command will be serviced by the particular IO Controller that owns that address range. In an SMP system, multiple partitions running multiple device drivers can run concurrently. Since multiple IO devices can be populated behind a single IO controller, each of these device driver threads, running on various partitions, can concurrently utilize a single IO controller to communicate with their respective IO device.
SMP systems typically optimize the direct memory access (“DMA”) data flow for moving large amounts of data, but give less optimization for MMIO operations, which are typically used for administrative services. Device driver developers typically use MMIO commands to perform administrative services for the IO device and to set up DMA operations to move large amounts of data between memory and IO. With that said, some device drivers written for IO devices (such as 2D graphics cards) choose to use primarily MMIO operations to update the display, instead of DMA operations. While this is not a problem in small computers with a single thread, it can become a problem in today's SMP computers if the machine is configured with multiple device driver threads communicating with multiple graphics cards behind a single IO Controller.
The reason for such problems is that the IO Controller is typically not designed to handle a large number of MMIO streams to the same controller at the same time from multiple threads. The problem can be worse if the bus speeds used by the IO graphics cards are considerably slower than the SMP bus speed due to using older legacy PCI devices with newer/faster processors. In addition, some high performance applications such as InfiniBand℠ are moving from a DMA read/write programming model to an MMIO store/DMA write model for IO performance to cut latencies. However, these IO devices are still not as fast as the rate at which the processor could issue MMIO stores and could create a constant IO bottleneck, putting added pressure on the existing IO architectures.
Although the IO Controller is not optimized for multiple MMIO bursts, it will eventually service the requests over time. If the IO Controller gets more MMIO requests on the SMP bus than it can service, it will retry the MMIO commands on the SMP bus until it can make forward progress in forwarding the commands to the IO adapter. The IO controller will attempt to service the multiple MMIO snoop requests on a first-come, first-served basis.
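The retry behavior described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the IO Controller holds pending MMIO commands in a fixed-depth queue; the class name, the method names, the queue depth and the "ACK"/"RETRY" response strings are hypothetical and are not taken from any particular controller design.

```python
from collections import deque


class IOController:
    """Toy model of an IO Controller with a bounded MMIO command queue.

    The queue depth is an assumed parameter, not a documented one.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.queue = deque()  # pending MMIO commands, oldest first

    def snoop(self, master_id, command):
        """Respond to an MMIO command seen on the SMP bus."""
        if len(self.queue) >= self.capacity:
            # No room: the bus master must reissue the command later.
            return "RETRY"
        self.queue.append((master_id, command))
        return "ACK"

    def forward_one(self):
        """Forward the oldest pending command to the IO adapter, if any."""
        if self.queue:
            return self.queue.popleft()
        return None
```

In this model, a bus master that issues commands faster than `forward_one` drains them keeps receiving "RETRY", which corresponds to the condition in which the controller cannot make forward progress for that master.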
In an exemplary embodiment of a multi-processor system, an MMIO bus master could enter into a condition where it is constantly getting retried on the coherency bus. This condition has been called a system bus livelock. Conventional methods for dealing with a system bus livelock are limited to using a bus arbiter which implements a Linear Feedback Shift Register (“LFSR”) to randomize bus grants to bus masters, thereby staggering requests to the target bus. The randomization can be used to break up any livelock that may be created between two or more bus masters trying to obtain ownership of the same resource.
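The conventional LFSR-based randomization can be sketched as follows. This is only an illustrative model, not the arbiter of any specific system: the 16-bit Galois LFSR with tap mask 0xB400, the function names, and the rule of selecting a requester by taking the LFSR state modulo the number of requesters are all assumptions made for the example.

```python
from typing import List, Optional, Tuple


def lfsr16_step(state: int) -> int:
    """Advance a 16-bit Galois LFSR by one step (assumed tap mask 0xB400)."""
    lsb = state & 1
    state >>= 1
    if lsb:
        state ^= 0xB400
    return state


def grant(requests: List[bool], state: int) -> Tuple[Optional[int], int]:
    """Grant the bus to one requesting master, chosen pseudo-randomly.

    requests[i] is True when bus master i is requesting the bus.
    Returns (index of the granted master or None, next LFSR state).
    """
    state = lfsr16_step(state)
    requesters = [i for i, wants_bus in enumerate(requests) if wants_bus]
    if not requesters:
        return None, state
    # Hypothetical selection rule: index into the requesters by LFSR state.
    return requesters[state % len(requesters)], state
```

The seed must be nonzero, since an all-zero LFSR state never changes. Because the grant order varies from cycle to cycle, two masters contending for the same resource are unlikely to stay in lockstep, which is the property relied on to break a livelock.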
The conventional hang recovery mechanism helps the bus masters get out of a livelock. However, the conventional hang recovery method hurts the other threads that are not in a livelock because the bus arbiter is throttling back the bus grant rate globally. Consequently, the entire system suffers even though only one or two aggressive processors may have caused the livelock. Accordingly, there is a need for a method and apparatus for preventing bus livelock due to excessive MMIO commands.