1. Field of the Invention
This invention relates generally to computer systems and more particularly to a method and apparatus for increasing bus transfer performance through the elimination of unnecessary wait states.
2. Description of the Related Art
Typically, a central processing unit (CPU) is configured to operate significantly faster than the processors associated with the peripheral devices with which the CPU is in communication. Even within the same device, a CPU may operate faster than a graphics processing unit (GPU) that communicates with the CPU. Accordingly, when the CPU communicates with another device, wait states may be used to allow the peripheral enough time to provide the data requested by the CPU.
FIG. 1 is a simplified schematic diagram of a central processing unit in communication with a graphics processing unit. CPU 100 communicates through bus 106 with GPU 102 within device 104. One scheme enabling CPU 100 and GPU 102 to communicate is to provide a predetermined number of wait states that cause CPU 100 to wait for a number of clock cycles prior to reading data from GPU 102 or an external device. For example, where CPU 100 issues a read command to GPU 102, the CPU will wait for the programmed number of wait states, i.e., clock cycles, before reading the data placed on the data line by the GPU. In order to cover all of the possible components that may interface with CPU 100, the programmed wait states must allow for a time period sufficient for the slowest component to obtain the data. Therefore, this one-size-fits-all solution causes the CPU to wait unnecessarily for a relatively fast peripheral component, e.g., the GPU, thereby wasting time. Furthermore, programming the wait states for the slowest case is not feasible when a two-dimensional function is being executed, since the two-dimensional function must complete prior to performing a memory read.
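The cost of the programmed wait state scheme can be illustrated with a minimal sketch. The device names and latency values below are hypothetical and are chosen only to show how a wait-state count sized for the slowest component penalizes a faster one; they are not taken from any particular system.

```python
# Illustrative model only: device names and latencies are assumed values.

def programmed_wait_states(device_latencies):
    """Return the wait-state count needed to cover the slowest device."""
    return max(device_latencies.values())


def wasted_cycles(device_latencies):
    """Return, per device, the cycles the CPU idles beyond what the device needs."""
    programmed = programmed_wait_states(device_latencies)
    return {name: programmed - latency
            for name, latency in device_latencies.items()}


# Hypothetical latencies, in clock cycles, for two devices sharing the bus.
latencies = {"gpu": 2, "slow_peripheral": 8}

print(programmed_wait_states(latencies))
print(wasted_cycles(latencies))
```

Under these assumed values, every read from the faster device idles for the difference between the programmed count and the cycles that device actually requires.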
A second scheme for enabling communication between the CPU and a peripheral device or another processor is to use a wait line. Here, CPU 100 issues a read command and GPU 102 asserts the wait line, i.e., sends a signal to the CPU to signify that the GPU needs more time to complete the cycle. CPU 100 monitors the wait line and, once the wait line is de-asserted, the CPU reads the data. Here again, for faster controllers that do not assert the wait line, the CPU will still wait for a certain number of clock cycles before it looks at the wait line. Typically, the CPU will wait until three or four clocks into the cycle before looking at the wait line. Thus, with respect to a fast processor or peripheral device, the CPU wastes the three or four clock cycles.
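The wait line scheme can be sketched in the same manner. The function names, the latency values, and the default sampling point of three clocks are assumptions made for illustration; the model simply treats the wait line as asserted for as long as the device still needs time.

```python
# Illustrative model only: names, latencies, and sampling point are assumed.

def read_cycle_with_wait_line(device_latency, sample_cycle=3):
    """Return the clock cycle on which the CPU reads the data.

    device_latency: cycles the device needs to place data on the bus.
    sample_cycle:   cycle at which the CPU first looks at the wait line.
    """
    cycle = sample_cycle
    # The device keeps the wait line asserted while it still needs time,
    # so the CPU keeps monitoring until the line is de-asserted.
    while cycle < device_latency:
        cycle += 1
    return cycle


def wasted_cycles_with_wait_line(device_latency, sample_cycle=3):
    """Return the cycles spent beyond what the device actually required."""
    return read_cycle_with_wait_line(device_latency, sample_cycle) - device_latency


print(wasted_cycles_with_wait_line(1))  # fast device, never asserts the line
print(wasted_cycles_with_wait_line(8))  # slow device, holds the line asserted
```

In this sketch a slow device incurs no penalty, while a device that is ready before the sampling point still costs the CPU the cycles remaining until the wait line is first examined.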
As a result, there is a need to solve the problems of the prior art to provide a CPU that is capable of being programmed to use either an external wait line or software wait states depending on the device or microprocessor that is in communication with the CPU.