Microprocessors, digital signal processors, digital imaging devices, and many other types of digital data processing devices rely on a memory system to store data and/or instructions needed by the processing device. FIG. 1 depicts a typical memory system configuration 100. Memory system 100 comprises a memory 140 to store digital data and a memory controller 110 to control access to memory 140. An address/command bus and a data bus transmit memory signals, e.g., on a set of signal lines, between memory controller 110 and memory 140. Memory signals fall generally into one of several categories including data signals, address signals, command signals, and the like. Data signals carry the actual data that will be stored in, or retrieved from, memory 140, and pass across the data bus. Address signals specify the location within memory 140 where data is to be read from or written to. Command signals instruct memory 140 as to what type of operation is to be performed, e.g., read or write.
A processing device 50 issues data store and retrieve requests to memory controller 110. The processing device 50 may be a processor or any device capable of processing or manipulating electronic signals. The memory controller 110 acts as an intermediary for the exchange of data between processing device 50 and memory 140. For instance, when the processing device 50 issues a retrieve request, the memory controller 110 retrieves data from memory 140 and provides the retrieved data to processing device 50. The memory controller 110 retrieves data from memory 140 over the data bus by providing appropriate address and control signals to the memory 140 over the address/command bus.
As processing devices become faster and more powerful, the increased demands placed on them generally translate to a need for larger and faster memory systems. FIG. 2 shows one memory system implementation 200 that addresses the increased demands. A distributed memory 220 contains multiple memories 140, 150, and 160, each to store digital data within a predetermined address space. Each address space may be a predetermined address range or multiple interleaved ranges within a monolithic memory space. Alternatively, each address space may be an address space separately located in physically distinct memories. For ease of programmability and to maintain backward compatibility of memory system 200, processing device 50 typically perceives distributed memory 220 as one monolithic memory space regardless of the actual configuration.
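The two address-space arrangements described above can be sketched as follows. This is a minimal illustration only: the channel size, the interleave granule, and the function names are assumptions made for the example and do not come from the described system.

```python
# Hypothetical parameters, chosen only for illustration.
NUM_CHANNELS = 3        # stands in for memories 140, 150, and 160
CHANNEL_SIZE = 0x1000   # assumed capacity of each memory, in bytes
INTERLEAVE = 8          # assumed interleave granule, in bytes
TOTAL_SIZE = NUM_CHANNELS * CHANNEL_SIZE


def _check(addr):
    """Reject addresses outside the monolithic space seen by the processor."""
    if not 0 <= addr < TOTAL_SIZE:
        raise ValueError("address outside the monolithic memory space")


def channel_by_range(addr):
    """Each memory owns one contiguous range of the monolithic space."""
    _check(addr)
    return addr // CHANNEL_SIZE, addr % CHANNEL_SIZE


def channel_by_interleave(addr):
    """Fixed-size granules rotate across the memories in turn."""
    _check(addr)
    granule = addr // INTERLEAVE
    channel = granule % NUM_CHANNELS
    local = (granule // NUM_CHANNELS) * INTERLEAVE + addr % INTERLEAVE
    return channel, local
```

In both cases the processing device supplies a single address in the monolithic space, and the mapping decides which channel, and which local offset within that channel, services it.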
Memory interface 210 comprises memory controllers 110, 120, and 130 that control access to memories 140, 150, and 160, respectively, where each memory access path is known as a memory channel. When processing device 50 stores data to, or retrieves data from, the address space corresponding to memory 140, processing device 50 issues a data store or a retrieve request to memory controller 110, where memory controller 110 acts as an intermediary for the exchange of data between processing device 50 and memory 140. Memory controllers 120 and 130 perform similarly to memory controller 110 with respect to data exchanges between processing device 50 and memories 150 and 160, respectively, when processing device 50 issues a data store or a retrieve request. By increasing the number of memory channels and by distributing memory 220, memory system 200 allows processing device 50 to perform multiple independent memory accesses to memories 140, 150, and 160, thus increasing the throughput (speed) and size of the memory system 200.
Memory system 200, however, can have disadvantages. One such disadvantage arises when processing device 50 requests a data retrieval with a shift that spans multiple channels: because the data retrieved from each channel is shifted independently, erroneous data is provided to processing device 50. This problem commonly occurs when storing and retrieving large blocks of contiguous data, particularly in networking applications such as packet header processing.
FIG. 3 shows an illustration of this problem. Distributed memory 220 contains data B0-BF stored across two memory channels, with data B0-B7 within memory 140 and data B8-BF within memory 150. Data retrieval spanning multiple channels is typically performed by processing device 50 issuing a first request to memory controller 110 to retrieve data B0-B7 and a second request to memory controller 120 to retrieve data B8-BF. Upon receipt of the requests, memory controllers 110 and 120 independently retrieve the corresponding data and provide it to processing device 50 in the form shown in diagram 310.
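A retrieval that spans two channels, as described above, might be modeled as follows. This is a sketch under stated assumptions: the eight-byte block per channel matches the B0-B7 and B8-BF split, but the contiguous address layout, the `memories` table, and the function names are hypothetical.

```python
BLOCK = 8  # bytes held by each channel in this example

# Assumed contents: channel 0 stands in for memory 140, channel 1 for memory 150.
memories = {
    0: [f"B{i:X}" for i in range(0, 8)],    # B0..B7
    1: [f"B{i:X}" for i in range(8, 16)],   # B8..BF
}


def split_request(start, length):
    """Break one read of the monolithic space into per-channel requests.

    Returns a list of (channel, offset, count) tuples, one per channel touched.
    """
    requests = []
    addr, end = start, start + length
    while addr < end:
        channel, offset = divmod(addr, BLOCK)
        count = min(BLOCK - offset, end - addr)
        requests.append((channel, offset, count))
        addr += count
    return requests


def read(start, length):
    """Service each per-channel request independently and concatenate the results."""
    data = []
    for channel, offset, count in split_request(start, length):
        data.extend(memories[channel][offset:offset + count])
    return data
```

With no shift involved, concatenating the independently retrieved blocks yields B0 through BF in order, which is the well-behaved case.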
Diagram 320 shows the form in which data B0-BF should be provided to processing device 50 when performing data retrieval with a corresponding shift by 2, where each X denotes a filler byte of data. X bytes can typically be skipped by processing device 50 when they are inserted at the beginning or end of the retrieved data; when they are added elsewhere, however, detrimental results may occur during processing of the retrieved data. When data retrieval spanning multiple channels with a corresponding shift by 2 is performed in memory system 200, data B0-BF is provided in the form shown in diagram 330, where X bytes are inserted in the middle of the retrieved data. Accordingly, a need remains for a method and apparatus to merge and align data retrieved by distributed memory controllers prior to providing the data to a processing device.
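The difference between shifting each channel's data independently and shifting the merged data can be sketched as follows. The exact byte layouts of diagrams 320 and 330 are not reproduced here; this example assumes that a shift by 2 prepends two filler bytes and discards nothing, and all function names are hypothetical.

```python
FILL = "X"  # filler byte, as in the description


def shift_block(block, shift):
    """Assumed shift model: prepend `shift` filler bytes to the block."""
    return [FILL] * shift + list(block)


def independent_shift(blocks, shift):
    """Each channel shifts its own block; the results are then concatenated."""
    out = []
    for block in blocks:
        out.extend(shift_block(block, shift))
    return out


def merged_shift(blocks, shift):
    """Blocks are merged first, then shifted once as a single unit."""
    merged = [byte for block in blocks for byte in block]
    return shift_block(merged, shift)


def has_interior_filler(data):
    """True if a filler byte sits between the first and last payload bytes."""
    payload = [i for i, byte in enumerate(data) if byte != FILL]
    if not payload:
        return False
    return FILL in data[payload[0]:payload[-1] + 1]


ch0 = [f"B{i:X}" for i in range(0, 8)]    # data held in memory 140
ch1 = [f"B{i:X}" for i in range(8, 16)]   # data held in memory 150

bad = independent_shift([ch0, ch1], 2)    # filler lands between B7 and B8
good = merged_shift([ch0, ch1], 2)        # filler only at the start
```

Under this model, `has_interior_filler(bad)` is true and `has_interior_filler(good)` is false, which mirrors the contrast between the erroneous and the desired forms.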