1. Field of the Invention
The invention relates to load balancing. More particularly, the invention relates to balancing loads of a plurality of bus lanes of a snooping-based bus.
2. Related Art
A snooping-based bus or fabric and the processing units coupled to it can form a cache coherent system. Such a cache coherent system can use a cache coherency protocol. An example of such a cache coherency protocol is PowerBus.
In a cache coherent system, the memory accesses of the processing units need to be kept coherent with the caches of the other processing units. One way to achieve this is to use a snooping cache coherency protocol.
Using such a snooping cache coherency protocol means that the address for each access is distributed to all relevant processing units, each of which can check whether it has a copy of the affected data item in its local cache and, if so, which state (e.g. shared, exclusive, modified) that copy is in.
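The snoop lookup described above can be illustrated by the following minimal sketch. It is not taken from any particular protocol: the names LineState, LocalCache and broadcast_snoop are hypothetical, and the state set is limited to the examples named in the text plus an "invalid" state for a line that is not cached.

```python
from enum import Enum


class LineState(Enum):
    """Illustrative cache-line states; INVALID means no local copy."""
    INVALID = "invalid"
    SHARED = "shared"
    EXCLUSIVE = "exclusive"
    MODIFIED = "modified"


class LocalCache:
    """Hypothetical local cache of one processing unit."""

    def __init__(self):
        self._lines = {}  # address -> LineState

    def install(self, address, state):
        self._lines[address] = state

    def snoop(self, address):
        # Report the state of the local copy, or INVALID if none is held.
        return self._lines.get(address, LineState.INVALID)


def broadcast_snoop(address, caches):
    """Distribute one address to every cache and gather each local state."""
    return {name: cache.snoop(address) for name, cache in caches.items()}


caches = {"pu0": LocalCache(), "pu1": LocalCache()}
caches["pu1"].install(0x1000, LineState.MODIFIED)
states = broadcast_snoop(0x1000, caches)
```

In this sketch every unit is snooped for every address; a real system would restrict the broadcast to the relevant units.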
The address can be distributed by a requesting processing unit by means of a snoop request or snoop command. In response to the snoop request, the requesting processing unit receives snoop responses from the other bus-coupled processing units. In this regard, all the responses can be collected by the bus, and a combined answer can be either sent to the requesting processing unit or, as in PowerBus, distributed to all bus-coupled processing units.
While the data of a snoop transaction, which includes a snoop request and snoop responses, is conventionally transferred point to point, the snooping itself consists of one or two one-to-many steps and one many-to-one step. Therefore, a very high rate of snoop requests needs to be delivered to the processing units of a bus coupled to a plurality of processing units.
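The collection of snoop responses into a combined answer can be sketched as follows. The response names and their priority order are assumptions made purely for illustration and do not reflect the actual PowerBus encoding; combine_responses and distribute are hypothetical helpers.

```python
# Assumed response kinds, ordered from weakest to strongest (hypothetical).
PRIORITY = {"null": 0, "shared": 1, "modified": 2, "retry": 3}


def combine_responses(responses):
    """Many-to-one step: reduce all snoop responses to one combined answer."""
    return max(responses, key=PRIORITY.__getitem__, default="null")


def distribute(combined, units):
    """One-to-many step: send the combined answer to all listed units."""
    return {unit: combined for unit in units}


combined = combine_responses(["null", "shared", "null"])
to_requester_only = distribute(combined, ["pu0"])
to_all_units = distribute(combined, ["pu0", "pu1", "pu2", "pu3"])
```

The two calls to distribute correspond to the two alternatives named above: answering only the requesting processing unit, or informing all bus-coupled processing units.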
To achieve this, parallel bus lanes are used, and a hash function determines which bus lane is to be used depending on the address of the snoop request. The bus, in particular a bus access unit of the bus, collects the snoop requests from a plurality of processing units and arbitrates them onto the lanes for snooping.
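A minimal sketch of address-based lane selection and per-lane arbitration is given below. The lane count, the 128-byte cache line size, the XOR-folding hash and the class BusAccessUnit are all assumptions chosen for illustration; the text does not specify any of them.

```python
from collections import deque

NUM_LANES = 4         # assumed number of parallel bus lanes
CACHE_LINE_BITS = 7   # assumes 128-byte cache lines


def select_lane(address, num_lanes=NUM_LANES):
    """Map an address to a lane; all bytes of one cache line share a lane."""
    line = address >> CACHE_LINE_BITS
    folded = line ^ (line >> 8) ^ (line >> 16)  # simple XOR fold (assumed)
    return folded % num_lanes


class BusAccessUnit:
    """Hypothetical unit that queues snoop requests per lane."""

    def __init__(self, num_lanes=NUM_LANES):
        self.queues = [deque() for _ in range(num_lanes)]

    def submit(self, unit, address):
        lane = select_lane(address, len(self.queues))
        self.queues[lane].append((unit, address))
        return lane

    def arbitrate(self):
        # One bus cycle: each lane issues at most one pending request.
        return [q.popleft() if q else None for q in self.queues]


bau = BusAccessUnit()
lane_a = bau.submit("pu0", 0x1000)
lane_b = bau.submit("pu1", 0x1080)
issued = bau.arbitrate()
```

Because the lane depends only on the address, two requests that hash to the same lane must be issued in different cycles even when other lanes are idle.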
Even if, owing to the randomness and quality of the hash function used, the average load on the bus lanes is the same, some bus lanes can be more heavily loaded than others in the short term. These short-term hot spots can limit the utilization of the bus.
One way to reduce these short-term hot spots is to increase the buffer sizes in the bus. This can allow the bus to see a larger set of requests to choose from. However, these buffers disadvantageously duplicate the control structures within the processing units, which already have to keep track of their outstanding transactions.
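The difference between long-term and short-term lane load can be illustrated with a small simulation. It is a sketch under stated assumptions: a uniform random lane choice stands in for an ideal hash over uncorrelated addresses, and the window sizes and lane count are arbitrary. The helper peak_to_mean is hypothetical.

```python
import random
from collections import Counter


def peak_to_mean(num_requests, num_lanes, rng):
    """Ratio of the busiest lane's load to the ideal even share."""
    # rng.randrange models an ideal hash applied to uncorrelated addresses.
    loads = Counter(rng.randrange(num_lanes) for _ in range(num_requests))
    return max(loads.values()) / (num_requests / num_lanes)


rng = random.Random(0)  # fixed seed so repeated runs give the same numbers
short_term = peak_to_mean(16, 4, rng)        # a short window of 16 requests
long_term = peak_to_mean(160_000, 4, rng)    # a long window of requests
```

A ratio of 1.0 means perfectly even lanes. Over the long window the ratio stays very close to 1.0, whereas over a window of 16 requests it is at least 1.25 unless the requests happen to split exactly evenly, which is what makes short-term hot spots likely.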
Document U.S. Pat. No. 6,304,945 B1 describes a method and an apparatus for maintaining cache coherency in a computer system having multiple processor buses. The computer system includes a plurality of processor buses, a plurality of processors, and a memory bank. The plurality of processors is coupled to the processor buses. At least a portion of the processors have associated cache memories arranged in cache lines. The memory bank is coupled to the processor buses. The memory bank includes a main memory and a distributed coherency filter. The main memory is adapted to store data corresponding to at least a portion of the cache lines. The distributed coherency filter is adapted to store coherency information related to the cache lines associated with each of the processor buses. A method for maintaining cache coherency among processors coupled to a plurality of processor buses is also provided. Lines of data are stored in a main memory. A memory request is received for a particular line of data in the main memory from one of the processor buses. Coherency information is stored related to the lines of data associated with each of the processor buses. The coherency information is accessed based on the memory request.