1. Field of the Invention
The present invention relates to an improved bus architecture which decreases the bus access time in a tightly coupled multi-processor system.
2. Description of the Related Art
In a tightly coupled multi-processor system, all of the processors in the system share a common address and data bus, as well as a common system memory. When a multi-processor system employs a single common bus for address and data transfers, the bus must be restricted to one transfer at a time. Therefore, when one processor is communicating with the memory, all other processors are either busy with their own internal operations or must be idle, waiting for the bus to become free. The time that the processor spends waiting idle for the bus to become available is referred to as bus latency.
In a conventional multi-processor, common-bus architecture, the address bus is needed only for a short time while the addressed memory unit decodes the memory request. The correct memory board will then latch the address from the address bus. The address bus remains idle for the remainder of the data transfer. The data transfer time may be quite long depending upon the type of memory storage unit. Only after the memory delivers the data to the data bus and the requesting device releases the system bus do the address and data busses become available to the other processors.
During the period that one processor is using the bus, the other processors must wait for the data bus to become available in order to initiate a data transfer. As the number of processors increases, the number of bus accesses increases, and, therefore, the bus latency increases. Inherent in typical bus access cycles are periods during which one processor holds the bus while it waits for a reply signal. During this time, the processor is not using the address bus. Rather, it holds the bus to prevent other processors from accessing the bus until it receives a reply. The time that the bus is held but not active while waiting for an acknowledge signal is a principal cause of bus latency in multi-processor systems.
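As a rough illustration of the held-bus behavior described above, the following Python sketch models the simplest case, in which every processor requests the bus in the same cycle and the bus is held for the whole transfer. The function names and the cycle counts are hypothetical and are not taken from any particular system.

```python
def held_bus_wait_times(num_processors, address_cycles, memory_cycles):
    """Cycles each processor waits when all request the bus at cycle 0.

    Assumption: the bus is granted in index order and is held for the
    address phase plus the memory access time of every transfer.
    """
    hold_cycles = address_cycles + memory_cycles
    return [index * hold_cycles for index in range(num_processors)]


def address_bus_utilization(address_cycles, memory_cycles):
    """Fraction of the hold time during which the address bus is in use."""
    return address_cycles / (address_cycles + memory_cycles)


if __name__ == "__main__":
    # Hypothetical figures: 1 address cycle, 7 cycles of memory access.
    print(held_bus_wait_times(4, 1, 7))
    print(address_bus_utilization(1, 7))
```

Under these assumptions the wait of the last processor grows linearly with the number of processors, while the address bus carries useful information for only a small fraction of the time it is held.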
Some multi-processor systems use a split transaction bus in order to cut down on the time that the bus is being held. In the split transaction bus, the address and data busses operate independently, thus allowing multiple requests to be outstanding. The requestor of the bus activates an address request to the address bus. Once the addressed device (e.g., memory module) latches the address and provides an acknowledgement, the requestor releases the address bus for other address requests. When the data is available from the addressed device, the device acquires the data bus and delivers the data to the requestor. The address bus is therefore available for other memory requests while the memory system delivers the requested data to the requestor. The split transaction bus method reduces bus latency; however, the complexity of the system is increased dramatically. The memory boards for a split transaction system require the ability to queue and possibly sort requests, and must be provided with bus controller capabilities. The queue capability requires additional memory space to store and queue the outstanding requests, and the bus controller capability requires additional control logic.
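The split transaction sequence described above can be sketched, under simplifying assumptions, as the following Python timing model. The function name, the in-order per-module queue, and the rule that the data bus is granted in order of data readiness are all assumptions made for illustration; a real split transaction protocol may arbitrate differently.

```python
def split_transaction_completion(requests, address_cycles, memory_cycles,
                                 data_cycles):
    """Completion cycle of each request on a hypothetical split bus.

    requests: memory module ids, in the order the addresses are issued.
    Assumptions: address phases are serialized on the address bus, each
    module services its queued requests one at a time, and the data bus
    is granted in the order in which data becomes ready.
    """
    address_bus_free = 0
    module_free = {}  # module id -> cycle at which the module is idle
    ready = []        # (cycle data is ready, request index)
    for index, module in enumerate(requests):
        # Address phase: the address bus is released once it is latched.
        address_done = address_bus_free + address_cycles
        address_bus_free = address_done
        # The module queues the request if it is still busy.
        start = max(address_done, module_free.get(module, 0))
        memory_done = start + memory_cycles
        module_free[module] = memory_done
        ready.append((memory_done, index))

    data_bus_free = 0
    completion = [0] * len(requests)
    for memory_done, index in sorted(ready):
        # Data phase: the module acquires the data bus and returns data.
        start = max(memory_done, data_bus_free)
        data_bus_free = start + data_cycles
        completion[index] = data_bus_free
    return completion


if __name__ == "__main__":
    # Four requests to four different modules, hypothetical cycle counts.
    print(split_transaction_completion([0, 1, 2, 3], 1, 6, 1))
```

In this model, requests to different modules overlap their memory access times, whereas two requests to the same module are queued behind one another. That queue is the additional storage and control logic that the memory boards must provide.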
In addition, depending on the system protocol, the amount of time that is saved between bus requests may decrease with increased bus transaction time. The memory access cycle time in a split transaction bus is typically longer than in a single bus system because each cycle includes steps to perform the queuing and bus control functions. If the queuing and bus control steps take longer than the time saved between transactions, the benefits of the "time saving" split transaction bus can quickly diminish. Without the return of a substantial decrease in the overall system memory access time, the increase in the complexity of the system that is required to implement a split transaction bus is often not justified. Maintaining cache coherency further complicates the implementation of a split transaction bus architecture.
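The trade-off described above can be made concrete with a small Python calculation. This is only a sketch: the function name and all cycle counts are hypothetical, and it assumes that the queuing and bus control steps occupy the bus for a fixed number of cycles per transaction.

```python
def split_bus_tradeoff(address_cycles, memory_cycles, data_cycles,
                       overhead_cycles):
    """Compare per-transaction bus occupancy for held and split busses.

    Assumptions: a held bus is occupied for the whole transfer; a split
    bus is occupied only for the address phase, the data phase, and the
    queuing and bus control overhead.
    """
    held_occupancy = address_cycles + memory_cycles + data_cycles
    split_occupancy = address_cycles + data_cycles + overhead_cycles
    return {
        "held_occupancy": held_occupancy,
        "split_occupancy": split_occupancy,
        # Positive only while the overhead is shorter than the memory wait.
        "net_cycles_saved": held_occupancy - split_occupancy,
        # A single access takes longer on the split bus.
        "split_access_time": held_occupancy + overhead_cycles,
    }


if __name__ == "__main__":
    print(split_bus_tradeoff(1, 6, 1, 2))  # overhead below the memory wait
    print(split_bus_tradeoff(1, 6, 1, 7))  # overhead above the memory wait
```

With these figures the saving per transaction is the memory wait minus the overhead, so it turns negative as soon as the queuing and bus control steps take longer than the idle time they were meant to recover.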
A seemingly simple approach to reduce bus latency would be to increase the clock speed of the bus controller. Increasing the clock speed necessarily decreases the time for memory access. However, this is an expensive approach that may require use of emitter-coupled logic ("ECL") or other expensive materials in order to achieve the required increase in clock speeds.
Another attempt at reducing bus latency is the implementation of loosely-coupled processors. This approach has limited benefits in applications which may share common data structures. The level of bus arbitration will increase in order to resolve the multiple contention problems associated with a shared resource. The time spent on bus arbitration will reduce the overall time saved with loosely-coupled processors. Therefore, for shared resources, the system complexity increases, with little or no bus bandwidth increase.