The present invention relates to data management, and, in particular, to methods and apparatus for compressing and decompressing data for transfer and storage in a multiprocessing environment.
Computing systems are becoming increasingly advanced, often tying multiple processors (coprocessors) together in order to boost processing speed and enhance overall performance. Often, computing systems integrate the coprocessors in parallel (or at least in concert) to increase processing efficiency. Such advances are critical to the success of many applications, for example, real-time multimedia gaming and other computation-intensive applications.
A multiprocessing system may include numerous coprocessors interconnected by a shared data bus. The coprocessors may have access to a shared memory such as a dynamic random access memory (DRAM). The DRAM may be located locally or remotely from the coprocessors. For example, the DRAM may be on a different part of the computer chip or on a separate chip. Each coprocessor may frequently access the shared memory in order to store or process data for a particular task. Access by one coprocessor may be independent of access by the other coprocessors.
Data is transferred to and from the shared memory by means of a direct memory access controller (DMAC). The DMAC allows high-speed data transfer without tying up the resources of a processor. This is because the direct memory access (DMA) transfer rate is limited only by the memory read/write cycle time and the DMAC's speed.
Conventional operation of a DMAC is well known. A typical DMA data storage process is as follows. A processor requests a data transfer from a DMAC by supplying a source address, a destination address, and the amount of data to be transferred. The DMAC requests the data transfer from a target device that is associated with the source address. When the target is ready for the transfer, the DMAC transfers the data to or from the target device. Some systems are designed to be able to send an interrupt to the processor indicating completion of the DMA transfer. A bus protocol used with a DMA transfer may be as follows. Initially, a processor loads the DMAC with a starting address and the amount of data to be transferred. When the data is ready for transfer, the DMAC sends a DMA request to the processor. The processor then acknowledges the DMA request, floats the data bus, address bus and control lines, and suspends processing that uses the data and address buses. The DMAC then begins transferring the data to the storage device. Once the data transfer is complete, the DMAC terminates the DMA request and sends an interrupt to the processor indicating completion of the DMA transfer.
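The bus protocol described above can be sketched in software. The following is a minimal, hypothetical model (the class and method names are assumptions, not part of any real hardware interface) showing the sequence of events: the processor programs the DMAC with addresses and a count, the processor floats its buses in response to the DMA request, the DMAC copies the data, and an interrupt signals completion.

```python
class Processor:
    """Toy processor that yields its buses during a DMA transfer."""

    def __init__(self):
        self.buses_floated = False
        self.dma_done = False

    def float_buses(self):
        # Acknowledge the DMA request: float data/address/control lines
        # and suspend processing that uses those buses.
        self.buses_floated = True

    def resume_buses(self):
        self.buses_floated = False

    def on_dma_interrupt(self):
        # Interrupt handler invoked when the DMAC signals completion.
        self.dma_done = True


class DMAC:
    """Toy direct memory access controller over a word-addressed memory."""

    def __init__(self, memory):
        self.memory = memory  # shared memory, modeled as a list of words

    def transfer(self, source, dest, count, processor):
        # The processor has loaded source, dest and count (the arguments).
        # The DMAC raises a DMA request; the processor acknowledges it
        # by floating its buses.
        processor.float_buses()
        # The DMAC moves the data word by word, limited only by the
        # memory cycle time (here, list indexing).
        for i in range(count):
            self.memory[dest + i] = self.memory[source + i]
        # Transfer complete: drop the DMA request and interrupt the processor.
        processor.resume_buses()
        processor.on_dma_interrupt()


memory = list(range(16))          # 16-word shared memory
cpu = Processor()
dmac = DMAC(memory)
dmac.transfer(source=0, dest=8, count=4, processor=cpu)
```

After the call, words 0 through 3 have been copied to words 8 through 11, the buses have been returned to the processor, and the completion interrupt has fired.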
In the past, connectivity to the shared memory has presented a bottleneck in data flow, notwithstanding the use of a DMAC. For example, the coprocessors may be able to transfer data among each other along the shared data bus at a rate of 256 Gbits/sec, whereas the data transfer rate with the shared memory may be only 204.8 Gbits/sec. Alternatively, even when the data transfer rates are the same, the DMAC may not be able to transfer data between the shared memory and multiple coprocessors at the same time. Thus, it can be seen that the shared memory bottleneck can slow down processing and impede system performance.
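The rates in the example above quantify the bottleneck. A short worked calculation (the 1 Gbit payload size is an assumption chosen for illustration) shows that, at 256 Gbits/sec on the inter-processor bus versus 204.8 Gbits/sec at the shared memory, every transfer through the shared memory takes 25% longer than the same transfer between coprocessors:

```python
BUS_RATE = 256e9       # bits/sec, coprocessor-to-coprocessor over the shared bus
MEMORY_RATE = 204.8e9  # bits/sec, to or from the shared memory

payload_bits = 1e9     # hypothetical 1 Gbit block to transfer

bus_time = payload_bits / BUS_RATE        # time over the inter-processor bus
memory_time = payload_bits / MEMORY_RATE  # time through the shared memory

# Ratio of the two: 256 / 204.8 = 1.25, i.e. the memory path is 25% slower.
slowdown = memory_time / bus_time
```

The gap compounds when several coprocessors contend for the memory at once, since the DMAC serializes their transfers.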
Therefore, there is a need in the art for new methods and apparatus for achieving high data transfer rates between multiple processors and a shared memory.