The invention relates to a device and method to minimize data latency and maximize data throughput to and from memory using multiple data valid signals that also enable translation between linear and critical chunk addressing.
Microprocessor performance has increased dramatically over the short history of computers. This increase in processor performance, seen in the increased number of processor cycles per second, has created the need for a comparable increase in the speed of access to data and instructions. A very fast processor provides little benefit if it spends most of its time waiting for data and instructions to be retrieved from memory. One method used to improve access speed to data and instructions is the use of cache memory, which cycles at the same speed as the processor. However, cache memory is expensive, and the amount available to a processor is thus limited. Therefore, a need exists to facilitate memory access to data and instructions.
In order to overcome this problem, computer manufacturers have employed separate devices or chips to handle memory addressing, access, transfer, and retrieval when requested by a processor or other device. The use of these devices has improved performance since they are specifically designed to handle only memory access, but all too often they have proven to be complex, difficult to implement, and still slow. Therefore, in some cases these devices actually form a bottleneck to maximum processor utilization. For example, when a read operation immediately follows a write operation to a given data location in memory, it is often necessary, in some designs, to wait until all data involved in the write has been transferred before the read is executed. This causes the processor or input/output (I/O) device requesting the read to wait needlessly for the completion of the write. Further, these devices are frequently required to provide multiple ports in order to interface to the processors, I/O devices, and memory. Where such a device takes the form of a chip, it is often necessary to create a separate data path for each port, which uses more space on the chip and thereby requires a larger chip that uses more space on the board, consumes more power, and produces more heat.
Further, processors and other I/O devices may have specific requirements as to how data is to be ordered for presentation. Any device that accesses memory at the request of a processor or other I/O device must be able to translate from one form of desired presentation to another while still being able to keep latency and space used on the chip to a minimum and throughput to a maximum without unduly increasing the complexity of the logic required.
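The kind of reordering described above can be illustrated with a minimal sketch. It assumes a simple wrap-around convention in which the requested (critical) chunk is presented first and the remaining chunks follow in ascending order, wrapping back to chunk zero. The text does not specify the ordering actually used, and the function names and the four-chunk example are hypothetical.

```python
def linear_to_critical_first(chunks, critical_index):
    """Reorder linearly ordered chunks so the critical chunk comes first.

    Assumed convention (not taken from the text): wrap-around order,
    i.e. critical, critical+1, ..., wrapping back to chunk 0.
    """
    n = len(chunks)
    if not 0 <= critical_index < n:
        raise ValueError("critical_index out of range")
    return [chunks[(critical_index + i) % n] for i in range(n)]


def critical_first_to_linear(chunks, critical_index):
    """Inverse translation: restore linear order from critical-first order."""
    n = len(chunks)
    if not 0 <= critical_index < n:
        raise ValueError("critical_index out of range")
    linear = [None] * n
    for i, chunk in enumerate(chunks):
        # The i-th chunk received belongs at linear position critical+i (mod n).
        linear[(critical_index + i) % n] = chunk
    return linear


line = ["c0", "c1", "c2", "c3"]            # one line of data, linear order
reordered = linear_to_critical_first(line, 2)
restored = critical_first_to_linear(reordered, 2)
```

Because the translation is a pure index mapping, it can in principle be applied as each chunk arrives rather than after the whole line has been received, which is the property that keeps latency low.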
Therefore, what is needed is a device and method of accessing memory through multiple ports that minimizes data latency and maximizes data throughput without requiring a large number of data lines or complex logic. This device and method must also be able to translate from one data format to another without sacrificing latency or throughput.
An example embodiment of the present invention is directed to a device for servicing data read and write requests from a plurality of processors and an I/O interface connected to a plurality of I/O devices. This device uses a system data chip to receive a read request for data from one of the processors or the I/O interface. This system data chip also has a data buffer to store data in a first data format and a second data format received by the system data chip as a result of the read request. The system data chip also has a control/status unit to control when writing the data to the data buffer occurs and when reading from the data buffer occurs based on a first valid bit or a second valid bit.
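The gating of buffer reads and writes on valid bits can be sketched in software as follows. This is only an illustrative model under stated assumptions, not the claimed hardware: it assumes each buffer slot carries one valid bit per data format, that a write is refused while either bit is set, and that a read clears the bit it consumed. The names `DataBuffer`, `Slot`, `FIRST`, and `SECOND` are hypothetical.

```python
from dataclasses import dataclass

FIRST = "first"    # hypothetical tag for the first data format
SECOND = "second"  # hypothetical tag for the second data format


@dataclass
class Slot:
    data: bytes = b""
    valid_first: bool = False   # assumed: set when data is held in the first format
    valid_second: bool = False  # assumed: set when data is held in the second format


class DataBuffer:
    """Toy model of a data buffer whose accesses are gated by two valid bits."""

    def __init__(self, depth):
        self._slots = [Slot() for _ in range(depth)]

    def write(self, index, data, fmt):
        """Store data if the slot is free; return True on success."""
        if fmt not in (FIRST, SECOND):
            raise ValueError("unknown format")
        slot = self._slots[index]
        if slot.valid_first or slot.valid_second:
            return False  # slot still holds unread data, so the write must wait
        slot.data = data
        if fmt == FIRST:
            slot.valid_first = True
        else:
            slot.valid_second = True
        return True

    def read(self, index, fmt):
        """Return the data if the matching valid bit is set, else None."""
        slot = self._slots[index]
        if fmt == FIRST and slot.valid_first:
            slot.valid_first = False  # consuming the data frees the slot
            return slot.data
        if fmt == SECOND and slot.valid_second:
            slot.valid_second = False
            return slot.data
        return None


buf = DataBuffer(depth=2)
wrote = buf.write(0, b"abcd", FIRST)   # accepted: slot 0 was empty
blocked = buf.write(0, b"zzzz", SECOND)  # refused: slot 0 not yet read
value = buf.read(0, FIRST)             # returns the stored data and frees the slot
```

In this model a reader never has to wait for an entire transfer to complete; it only checks the valid bit of the slot it needs, which is the intuition behind using valid signals to reduce latency.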