The invention relates generally to computer systems and more particularly to a digital computer system or processor having a split transaction memory access structure.
As the demand for faster microprocessors increases, there is a need for decreasing the latency of data processing. With reference to FIG. 4, the data transfer time 211, 212 is the time needed to transfer data between a memory and a processing engine. The processing time 201, 202 is the time needed to process all of the transferred data. The processing latency 225 is the time from the beginning of the transfer to the completion of the data processing. In the prior art, the data processing 201, 202 usually starts after the data transfer 211, 212 is completed. As shown in FIG. 4, this makes the processing latency 225 equal to the entire data transfer time 211 plus the entire data processing time 201.
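The prior-art latency relationship described above can be illustrated with a short Python sketch. The function name and the time values are hypothetical and chosen only for illustration; they are not taken from FIG. 4.

```python
def sequential_latency(transfer_time: float, processing_time: float) -> float:
    """Latency when processing starts only after the entire transfer completes."""
    return transfer_time + processing_time


# Hypothetical durations in arbitrary time units (not values from the figures).
latency = sequential_latency(transfer_time=8.0, processing_time=5.0)
print(latency)
```

Under this model, the only ways to reduce latency are to shorten one of the two phases or to let them overlap.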
FIG. 6 shows a typical direct memory access (DMA) computer system 300 as is known in the prior art. The computer system 300 includes a system bus 310, a plurality of processors 320a . . . 320n coupled to the system bus 310, a main memory 330 coupled to the system bus 310, and a bus arbiter 340 coupled to the system bus 310. In the DMA computer system 300, a memory read by the processor 320a to obtain a data unit (byte or word) can be carried out as follows. The processor 320a requests use of the system bus 310. If there are multiple bus requests pending, the bus arbiter 340 grants the system bus 310 to one of the requesters according to some bus allocation policy. When the system bus 310 is granted to the processor 320a, the processor 320a sends a read command including a read address to the main memory 330 via the system bus 310. The processor 320a then releases the system bus 310 so that the system bus 310 can be granted to another bus requester, if any. In response to the read command from the processor 320a, the main memory 330 retrieves the requested data unit and requests use of the system bus 310. When the system bus 310 is granted to the main memory 330, the main memory 330 sends the requested data unit to the processor 320a via the system bus 310. When the transfer is complete, the main memory 330 releases the system bus 310 so that the system bus 310 can be granted to another bus requester, if any. After obtaining the requested data unit, the processor 320a can process it. Then, the processor 320a can repeat the process described above to obtain another data unit from the main memory 330 for processing. If the processor 320a needs to obtain a large number of data units from the main memory 330, the processor 320a must issue a separate read command to the main memory 330 for each data unit, which takes considerable time. This prevents the processor 320a from performing other tasks.
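The read sequence described above can be sketched as a toy Python simulation. All class and method names here are hypothetical, and the first-in, first-out arbitration is an assumption, since the text only refers to "some bus allocation policy". The sketch models the order of bus grants, not timing or hardware behavior.

```python
from collections import deque


class Bus:
    """Toy system bus with a FIFO arbiter (the policy is an assumption)."""

    def __init__(self):
        self.pending = deque()  # outstanding bus requests
        self.log = []           # order in which the bus was granted

    def request(self, requester, action):
        self.pending.append((requester, action))

    def run(self):
        while self.pending:
            requester, action = self.pending.popleft()  # arbiter grants the bus
            self.log.append(requester)
            action()  # requester uses the bus, then releases it


class Memory:
    def __init__(self, bus, contents):
        self.bus = bus
        self.contents = contents

    def read_command(self, address, deliver):
        data = self.contents[address]
        # The memory must request the bus itself to send the response.
        self.bus.request("memory", lambda: deliver(data))


class Processor:
    def __init__(self, bus, memory):
        self.bus = bus
        self.memory = memory
        self.received = []
        self.commands_issued = 0

    def read(self, address):
        def send_command():
            self.commands_issued += 1
            self.memory.read_command(address, self.received.append)

        self.bus.request("processor", send_command)


bus = Bus()
memory = Memory(bus, {0: 10, 1: 20, 2: 30})
cpu = Processor(bus, memory)
for addr in range(3):
    cpu.read(addr)  # one read command per data unit
    bus.run()       # command tenure, then response tenure
```

In this sketch each data unit costs one read command from the processor and two bus tenures, which illustrates why fetching many data units this way keeps the processor busy issuing commands.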
In the prior art, there have been attempts to decrease processing latency. Many of these attempts involve using a cache memory in the microprocessor to carry out the read operation. For example, U.S. Pat. No. 5,940,856 describes a method to decrease the latency of a read operation. When a first processor writes data in the cache memory, the data entry is marked as notified. If a second processor wants to read that data, the method provides a fast way to bring the data to the second processor by reading data from the cache memory of the first processor, rather than writing to and reading from external memory.
U.S. Pat. No. 5,761,724 discloses a system in which a first processor has modified data in its cache memory. In the case where a second processor tries to read that data, and the modification has not yet been reported to the external memory, the cache memory will provide the correct data to the second processor. This saves time by not accessing the external memory because the correct data is maintained in the cache memory.
It is the object of the present invention to provide a digital system and method that decrease the processing latency so that the processor is available to perform additional tasks.
The above object has been achieved by a digital system having a split transaction read buffer coupled between a processor and a system bus. The action of reading data is split into two transactions: a first transaction that requests a read, and a second, read response transaction that subsequently provides the requested data. The read buffer is implemented with two buffers: an incoming data buffer for receiving the read data, and an outgoing address buffer for sending read requests. The digital system can read the data from the data buffer while the data transfer is in progress. Because the data can be processed as soon as it is present in the read buffer, rather than only after the data transfer is complete, the processing latency is reduced and the processor is free to perform additional tasks.
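The effect of processing data while the transfer is still in progress can be illustrated with a minimal Python sketch. The class, its method names, and the timing model are assumptions made for illustration: data units are assumed to arrive one after another at a fixed interval, and each unit is assumed to take a fixed time to process. The sketch does not describe how the read buffer is implemented in hardware.

```python
from collections import deque


class SplitReadBuffer:
    """Hypothetical model: an outgoing address buffer and an incoming data buffer."""

    def __init__(self):
        self.address_out = deque()  # read requests waiting to go out
        self.data_in = deque()      # read responses that have arrived

    def request_read(self, address):
        self.address_out.append(address)

    def next_request(self):
        return self.address_out.popleft() if self.address_out else None

    def deliver(self, data):
        self.data_in.append(data)

    def pop_data(self):
        return self.data_in.popleft() if self.data_in else None


def sequential_latency(n_units, transfer_per_unit, process_per_unit):
    """Processing starts only after every unit has been transferred."""
    return n_units * transfer_per_unit + n_units * process_per_unit


def overlapped_latency(n_units, transfer_per_unit, process_per_unit):
    """Each unit is processed as soon as it has arrived and the engine is free."""
    done = 0.0
    for i in range(n_units):
        arrival = (i + 1) * transfer_per_unit
        done = max(done, arrival) + process_per_unit
    return done


buf = SplitReadBuffer()
memory = {0: "a", 1: "b", 2: "c"}  # stand-in for external memory
for addr in memory:
    buf.request_read(addr)  # first transaction: read requests

processed = []
while (addr := buf.next_request()) is not None:
    buf.deliver(memory[addr])                 # second transaction: one response
    processed.append(buf.pop_data().upper())  # process before the rest arrives
```

With four units, a transfer time of 2 per unit, and a processing time of 1 per unit, this model gives a latency of 12 when processing waits for the full transfer and 9 when the two overlap.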