1. Field of the Invention
The present invention relates to a method and apparatus for implementation of a dual hold protocol in a multiprocessor system. More particularly, the present invention relates to a method and apparatus for implementation of a quick hold protocol which enables slave processors to gain access to the bus more rapidly.
2. Art Background
In a microprocessor system, such as the one illustrated in block diagram form in FIG. 1, a plurality of devices are coupled via a bus 10. For example, there will be a CPU 15, which functions as the master device of the bus, as well as some slave devices 20, which can be co-processors, DMA devices or the like, memory 25, which is accessible by the devices 15, 20 coupled to the bus, and an arbiter 30, which arbitrates device access to the bus. The master device 15 will by default have control of the bus 10 as well as highest priority of access to the bus 10. Therefore, the master 15 will maintain control of the bus 10 until asked to release the bus 10 to a device which requires access to the bus 10.
For example, in a system based upon an 80486 microprocessor, manufactured by Intel Corporation, Santa Clara, Calif., a hold protocol has been developed to enable slave devices to access the bus. Using the hold protocol, a slave device 20 will issue a bus request to the arbiter 30 to gain access to the bus. The arbiter will decide whether the device 20 has priority to access the bus, and if so, the arbiter 30 will assert a HOLD signal to the master 15. When the CPU is not utilizing the bus, it will monitor the HOLD signal and, when the arbiter 30 asserts HOLD, the CPU will release the bus and assert a hold acknowledge (HLDA) signal to the arbiter 30. The arbiter then will issue a bus grant signal to the slave device 20 and the slave device will then access the bus. The master 15 will subsequently regain access when the slave device 20 releases the bus or when the master 15 issues a bus request to access the bus and the arbiter returns the bus to it.
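The signal sequence described above can be sketched as a small state model. This is an illustrative sketch only, not the 80486 implementation: the class name, method names and the log strings are hypothetical, and the model assumes a single slave device and an arbiter that simply grants the bus once HLDA is seen.

```python
from dataclasses import dataclass, field


@dataclass
class HoldHandshake:
    """Toy model of the HOLD/HLDA sequence; all names are hypothetical."""
    owner: str = "master"   # the master holds the bus by default
    hold: bool = False      # HOLD: arbiter -> master
    hlda: bool = False      # HLDA: master -> arbiter
    log: list = field(default_factory=list)

    def slave_requests_bus(self, slave_has_priority: bool = True) -> None:
        # Slave -> arbiter: bus request. The arbiter asserts HOLD only
        # if it decides that the slave has priority.
        self.log.append("bus_request")
        if slave_has_priority:
            self.hold = True
            self.log.append("HOLD")

    def master_samples_hold(self, bus_busy: bool) -> None:
        # The master responds to HOLD only when it is not using the bus.
        if self.hold and not bus_busy and self.owner == "master":
            self.hlda = True
            self.owner = "none"   # the master has released the bus
            self.log.append("HLDA")

    def arbiter_grants(self) -> None:
        # Arbiter -> slave: bus grant, issued once HLDA is seen.
        if self.hlda and self.owner == "none":
            self.owner = "slave"
            self.log.append("bus_grant")

    def slave_releases(self) -> None:
        # The slave finishes; HOLD and HLDA drop and the master regains the bus.
        if self.owner == "slave":
            self.hold = False
            self.hlda = False
            self.owner = "master"
            self.log.append("release")


def run_handshake() -> list:
    bus = HoldHandshake()
    bus.slave_requests_bus()
    bus.master_samples_hold(bus_busy=True)    # bus in use: no response yet
    bus.master_samples_hold(bus_busy=False)   # bus idle: HLDA is asserted
    bus.arbiter_grants()
    bus.slave_releases()
    return bus.log
```

Calling `run_handshake()` returns the events in the order bus request, HOLD, HLDA, bus grant, release. The first `master_samples_hold` call adds nothing to the log, which is the source of the latency discussed in the remainder of this section.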
The HOLD, HLDA and bus request signals form the hold protocol utilized. Typically, the master device 15 will poll the HOLD signal line during the boundary between bus cycles. A bus cycle is defined to be the number of clock cycles required for both the address and the corresponding data transaction to occur. For example, if the CPU 15 issues a read request to memory 25, the address of the operation is first communicated across the bus 10 to memory 25. Memory 25 then responds by providing the data across the bus 10 to CPU 15. After the bus cycle in which the address is issued and the data transmitted is complete, the CPU checks the state of the HOLD signal to determine if there is an outstanding request to gain access to the bus. If a HOLD signal has been issued by the arbiter 30, the CPU 15 will issue an HLDA signal and release the bus for access by another device.
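Because HOLD is polled only at bus cycle boundaries, a request that arrives in mid-cycle waits for the next boundary. The following minimal sketch illustrates that alignment. It assumes back-to-back bus cycles of equal length starting at clock 0, which is a simplification, and the function name and parameters are hypothetical.

```python
def hlda_clock(hold_asserted_at: int, bus_cycle_clocks: int) -> int:
    """Return the first bus cycle boundary at or after HOLD is asserted.

    Assumes equal-length, back-to-back bus cycles beginning at clock 0.
    """
    if bus_cycle_clocks <= 0 or hold_asserted_at < 0:
        raise ValueError("clock counts must be positive / non-negative")
    # Ceiling division: number of whole bus cycles needed to reach or
    # pass the clock at which HOLD was asserted.
    cycles = -(-hold_asserted_at // bus_cycle_clocks)
    return cycles * bus_cycle_clocks
```

For instance, with a four-clock bus cycle, a HOLD asserted at clock 5 is not seen until clock 8, whereas a HOLD asserted exactly on the boundary at clock 4 is seen at once.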
In order to increase the throughput of the microprocessor system, address pipelining has been developed. Examples of address pipelining are found in the i386™, i486™ SL, and Pentium™ processors manufactured by Intel Corporation, Santa Clara, Calif. Address pipelining permits multiple addresses to be issued by the CPU without waiting for the corresponding data transaction to complete. Although this increases the efficiency of the microprocessor 15, the HOLD-HLDA protocol latency is extended because the CPU cannot relinquish the bus (and assert HLDA) until the data transactions for all outstanding addresses are completed (upon reception of a HOLD signal, the CPU ceases the issuance of new addresses).
Many microprocessors also provide for burst mode data transfer. Burst mode data transfers enable the transfer of multiple blocks of data in response to a single address issued. Thus, the bus cycle is complete when all blocks of data, responsive to the burst mode address issued, have been transmitted across the bus. As a transfer of multiple blocks of data requires more clock cycles than the transfer of a single block of data, higher latency in responding to hold requests is incurred during burst mode data transfers.
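The combined effect of address pipelining and burst mode on hold latency can be approximated with simple arithmetic. The sketch below is a rough model under stated assumptions: every outstanding address is assumed to transfer the same number of data blocks, every block is assumed to take the same number of clocks, and the function name, parameters and sample values are hypothetical rather than taken from any processor specification.

```python
def hold_drain_clocks(outstanding_addresses: int,
                      blocks_per_address: int = 1,
                      clocks_per_block: int = 1) -> int:
    """Clocks needed to finish all outstanding data transactions.

    The master cannot assert HLDA until this many clocks have elapsed,
    since it must complete the data transfer for every address already
    issued before it releases the bus.
    """
    if min(outstanding_addresses, blocks_per_address, clocks_per_block) < 0:
        raise ValueError("counts must be non-negative")
    return outstanding_addresses * blocks_per_address * clocks_per_block


# Hypothetical figures: two clocks per data block.
single = hold_drain_clocks(1, 1, 2)      # no pipelining, no burst
pipelined = hold_drain_clocks(2, 1, 2)   # two addresses outstanding
burst = hold_drain_clocks(2, 4, 2)       # two outstanding four-block bursts
```

Under these assumed figures the wait grows from 2 clocks to 4 with pipelining and to 16 when both outstanding addresses are burst transfers.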
Latency is further introduced in systems which, for reasons such as power savings, divide down the clock frequency of operation. For example, if the clock frequency is divided in half, the latency incurred for determining whether an outstanding hold request exists is doubled. This latency is especially problematic when devices have time constraints which cannot be met due to the delay introduced by slower clock cycles and address pipelining. An example is a DMA device coupled to DRAM. The DRAM requires periodic refresh in order to maintain accurate states of data. The DMA device must gain access to the bus within the maximum allotted time for refreshing the DRAM. However, with slower clock cycles and address pipelining, it is quite possible that the device will not gain access to the bus within the needed amount of time, because the CPU does not process the hold request fast enough and the bus cycle boundaries where hold requests are processed are further apart.
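The effect of dividing the clock can be illustrated by converting a latency in clocks into elapsed time and comparing it with a deadline. This is a sketch under assumptions: the 25 MHz base clock, the 16-clock latency and the 1000 ns deadline are hypothetical values chosen for the arithmetic, not figures from the 80486 or from any DRAM data sheet, and the function names are likewise hypothetical.

```python
def hold_latency_ns(latency_clocks: int,
                    base_clock_mhz: float,
                    divider: int = 1) -> float:
    """Elapsed time in nanoseconds for a latency given in clock cycles."""
    if base_clock_mhz <= 0 or divider < 1:
        raise ValueError("clock must be positive and divider at least 1")
    effective_mhz = base_clock_mhz / divider
    # One clock period in ns is 1000 / frequency in MHz.
    return latency_clocks * 1000.0 / effective_mhz


def meets_deadline(latency_clocks: int, base_clock_mhz: float,
                   divider: int, deadline_ns: float) -> bool:
    """True if the bus is released within the device's time constraint."""
    return hold_latency_ns(latency_clocks, base_clock_mhz, divider) <= deadline_ns


# Hypothetical figures: 16-clock hold latency, 25 MHz base clock.
full_speed_ns = hold_latency_ns(16, 25.0, divider=1)
half_speed_ns = hold_latency_ns(16, 25.0, divider=2)
```

With these assumed values the same 16-clock latency takes 640 ns at full speed and 1280 ns with the clock divided in half, so a device with a 1000 ns deadline would be served in time in the first case but not in the second.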