Modern computer systems often have a plurality of processors interconnected by a shared bus. The shared bus is used for processor-to-processor communication and for communication between processors and memory or peripheral devices. Typically, only one processor can have control of the shared bus at any one time, and an arbitration method, called an arbitration protocol, determines which processor gains control of the shared bus when multiple processors simultaneously contend for control.
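The simplest form of such an arbitration protocol is fixed priority: when several processors contend in the same cycle, the one with the best fixed rank wins. The following is a purely illustrative software sketch of that idea (real arbiters are combinational hardware; the function name and data shapes here are assumptions, not part of any actual bus design):

```python
def arbitrate(requests, priorities):
    """Grant the bus to the highest-priority requesting processor.

    `requests` is the set of processor ids contending for the bus this
    cycle; `priorities` maps each id to a fixed rank (lower rank wins).
    Returns the winning id, or None if no processor is requesting.
    """
    if not requests:
        return None
    return min(requests, key=lambda pid: priorities[pid])

# Processors 0, 2, and 3 contend simultaneously; 0 holds the best rank.
winner = arbitrate({0, 2, 3}, {0: 0, 1: 1, 2: 2, 3: 3})
```

Fixed priority is easy to build but, as discussed below, it does not by itself provide bounded waiting for low-priority devices.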
Processors in a multiprocessor system may have local cache memory for fast access to information. The information in each processor's cache must be accurate and current; this property is called cache coherency. A shared-memory cache-coherent multiprocessor system has multiple processors, each with local cache memory, and a mechanism for maintaining coherency of each local cache. In such a system, if any processor changes data in its cache, subsequent accesses of the same data by other processors must reflect the change. Therefore, the change must propagate to all other caches in the system. When a processor gains control of the bus, other processors may not need to do anything, or they may have to flush or purge and later refill their cache memory. The flush or purge must be accomplished for each cache in all processors in the system before the original transaction can be considered complete. Therefore, in general, transactions which affect cache memory require a variable amount of time.
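One common way to achieve the propagation described above is a write-invalidate scheme: a write by one processor purges every other cached copy of the same address, so that the next access by another processor misses and refills from memory. The sketch below is a minimal software model of that behavior, assuming a simple snooping bus with memory updated on every write; class names and the dictionary-based cache are illustrative assumptions only:

```python
class Cache:
    """One processor's local cache: address -> cached data."""
    def __init__(self):
        self.lines = {}

class Bus:
    """Snooping bus model: a write by one processor invalidates
    every other cache's copy of the written address."""
    def __init__(self, caches):
        self.caches = caches
        self.memory = {}

    def write(self, writer, addr, data):
        writer.lines[addr] = data
        self.memory[addr] = data
        for cache in self.caches:
            if cache is not writer:
                cache.lines.pop(addr, None)   # purge stale copy

    def read(self, reader, addr):
        if addr not in reader.lines:          # miss: refill from memory
            reader.lines[addr] = self.memory[addr]
        return reader.lines[addr]
```

Because the purge-and-refill work varies with how many caches hold the data, a transaction under this scheme takes a variable amount of time, as the text above notes.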
In general, computer bus designs and arbitration protocols need to accommodate variable length transactions and responses, efficient bursts of transactions, and bounded waiting. One example of a variable length transaction is cache memory updating as described above. Another example is a request for data from shared memory. If the memory device is slow, the memory device might assert a "busy" signal to hold off arbitration requests by other processors.
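The slow-memory case can be modeled as a device that asserts a busy signal for the duration of its access latency, during which new arbitration requests are held off. The following sketch is illustrative only; the class, cycle-counting scheme, and latency parameter are assumptions for the example:

```python
class SlowMemory:
    """A slow memory device that asserts `busy` while an access is in
    progress, holding off arbitration requests by other processors."""
    def __init__(self, latency):
        self.latency = latency      # cycles needed to produce a response
        self.remaining = 0

    def start_read(self):
        self.remaining = self.latency

    @property
    def busy(self):
        return self.remaining > 0

    def tick(self):                 # advance one bus cycle
        if self.remaining:
            self.remaining -= 1

def arbitration_allowed(device):
    # New arbitration is held off while the device signals busy.
    return not device.busy
```

This is the "hold the bus" alternative; the split-transaction alternative discussed next releases the bus instead.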
Alternatively, the system might provide for split transactions in which a first device arbitrates for control of the bus, sends a request to a second device and then releases control immediately. Then at some later time, the second device arbitrates for control of the bus to provide the response. However, a second arbitration is inefficient. The bus design needs to automatically accommodate access by a responding device without requiring a second arbitration. In particular, in a shared-memory cache-coherent multiprocessor system, there will typically be at most one processor that needs to respond to another processor's transaction. That response needs to be automatic without requiring a second arbitration.
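The cost difference can be made concrete by tracing the two schemes and counting arbitrations. The two functions below are hypothetical traces, not any particular bus protocol: the first models a split transaction (two arbitrations per transfer), the second models a design that reserves a response slot for the single responding device (one arbitration):

```python
def split_transaction(log):
    """Trace a split transaction: the requester arbitrates, sends its
    request, and releases the bus; the responder must later arbitrate
    again to deliver the response.  Returns the arbitration count."""
    log.append("requester: arbitrate for bus")
    log.append("requester: send request, release bus")
    log.append("responder: arbitrate for bus")   # the extra arbitration
    log.append("responder: send response, release bus")
    return sum(1 for entry in log if "arbitrate" in entry)

def automatic_response(log):
    """Trace a transaction in which the bus design reserves a response
    slot for the at-most-one responding processor, so no second
    arbitration is required."""
    log.append("requester: arbitrate for bus")
    log.append("requester: send request")
    log.append("responder: send response in reserved slot")
    log.append("bus released")
    return sum(1 for entry in log if "arbitrate" in entry)
```

The split scheme pays for two arbitrations per transfer; the automatic-response scheme pays for one, which is the efficiency argument made above.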
Some transactions naturally occur in bursts. For example, in a database application, a request for a record might require retrieval of multiple sequential fields of data. A computer bus design and arbitration protocol needs to accommodate burst transactions by minimizing the time and overhead involved in allowing a device which has control of the bus to temporarily retain that control. However, this goal must be balanced against a need to provide bounded wait time (eventual access) to all devices, regardless of priority.
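One way to balance these two goals is round-robin arbitration with a burst allowance: the current owner may keep the bus for a limited number of consecutive grants, after which the grant rotates, so every requesting device is served within a bounded number of grants. The sketch below is illustrative; the class name and the `max_burst` parameter are assumptions for the example:

```python
class RoundRobinArbiter:
    """Round-robin arbitration with a burst allowance: the current
    owner may keep the bus for up to `max_burst` consecutive grants,
    after which the grant rotates, bounding every device's wait."""
    def __init__(self, n_devices, max_burst=4):
        self.n = n_devices
        self.max_burst = max_burst
        self.owner = None
        self.burst = 0

    def grant(self, requests):
        # Let the current owner keep the bus within its allowance.
        if self.owner in requests and self.burst < self.max_burst:
            self.burst += 1
            return self.owner
        # Otherwise rotate: scan from the device after the last owner.
        start = 0 if self.owner is None else (self.owner + 1) % self.n
        for i in range(self.n):
            candidate = (start + i) % self.n
            if candidate in requests:
                self.owner, self.burst = candidate, 1
                return candidate
        return None
```

With all devices requesting continuously, no device waits longer than `max_burst * (n_devices - 1)` grants, which is the bounded-wait property the text requires.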
In the computer industry, there is an ongoing need to make computer systems smaller, faster, and lower cost. In general, computer systems are being consolidated into relatively few large scale integrated circuits. A significant factor in the size, speed and cost of an integrated circuit is the number of external connections to the integrated circuit. A computer bus typically comprises many signal conductors, most of which need to attach directly to devices on the bus. Prior art shared-memory cache-coherent multiprocessor systems are typically large and expensive, with separate bus electronics, so minimizing signal lines has not been a high design priority in prior art systems. However, as more of the bus electronics are integrated within processors, there is a need to minimize the number of bus signal conductors to minimize the size and cost of attached integrated processors.
Multiple signals can be encoded on a single conductor. However, in general, this reduces speed because of encoding/decoding time. In general, for speed considerations, conductor minimization needs to be done by methods other than encoding.
Conductors can also be made bidirectional, with multiple drivers and receivers. This reduces speed due to impedance and transmission line considerations. In general, a unidirectional signal conductor with a single signal source can be made faster than an equivalent cost bidirectional system or it can be made at lower cost than a bidirectional system with equivalent speed.