Although many modern processors can operate on relatively small units of data, transfers of data to and from a memory are often limited to relatively coarse data units of eight bytes, sixteen bytes or more. The smallest unit of data that can be transferred to a memory in a single transfer operation defines the word-size of the memory. Generally, it is not a problem to read a unit of data smaller than a memory word (i.e., sub-word data). A complete memory word is read from memory and the sub-word data of interest is extracted by the memory controller or the requesting agent. Difficulty arises, however, when attempting to write sub-word data, because the sub-word data cannot be written to memory without writing an entire memory word. Unless the portion of the memory word excluding the sub-word data to be written matches the contents of memory at the target address, valid data may be overwritten and lost.
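The asymmetry between sub-word reads and sub-word writes can be illustrated with a minimal sketch. It assumes an eight-byte memory word and models the memory as a plain byte array; the names `read_sub_word` and `naive_sub_word_write`, and the zero filler value, are hypothetical and chosen only for illustration.

```python
WORD_SIZE = 8  # assumed word size in bytes, for illustration only


def read_sub_word(memory, addr, length):
    """Read a full memory word, then extract the sub-word data of interest."""
    word_start = (addr // WORD_SIZE) * WORD_SIZE
    word = memory[word_start:word_start + WORD_SIZE]  # whole-word read
    offset = addr - word_start
    assert offset + length <= WORD_SIZE, "sub-word must fit in one word"
    return bytes(word[offset:offset + length])


def naive_sub_word_write(memory, addr, sub_word, filler=0x00):
    """Write a full word whose other bytes are filler (hypothetical, unsafe)."""
    word_start = (addr // WORD_SIZE) * WORD_SIZE
    offset = addr - word_start
    assert offset + len(sub_word) <= WORD_SIZE, "sub-word must fit in one word"
    word = bytearray([filler] * WORD_SIZE)
    word[offset:offset + len(sub_word)] = sub_word
    memory[word_start:word_start + WORD_SIZE] = word  # whole-word write


memory = bytearray(range(16))              # two words holding bytes 0..15
extracted = read_sub_word(memory, 10, 2)   # harmless: nothing is modified
naive_sub_word_write(memory, 10, b"\xAA\xBB")
clobbered = bytes(memory[8:16])            # neighbors of the sub-word are lost
```

In this sketch the read leaves memory untouched, while the naive write replaces the six neighboring bytes of the target word with filler, which is the loss of valid data described above.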
Two different types of write operations can be used to write sub-word data without undesirable side effects: merged write operations and masked write operations. In a merged write operation, a memory word is read from the target memory region, then the sub-word data to be written is substituted for a portion of the memory word. This substitution is called a merge operation and produces a merged memory word. Because the portion of the merged memory word exclusive of the substituted sub-word data contains the same data as the target memory region, only the sub-word data is written, in effect, when the merged memory word is stored in the target memory region.
One disadvantage of merged write operations is that it is usually necessary to read the contents of the memory region to be written in order to produce the merged memory word, thereby requiring both a read memory access and a write memory access to complete a single merged write operation. This tends to significantly reduce the rate at which data can be written to memory.
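A merged write operation and its cost can be sketched as follows. The `Memory` class, its access counters, and the eight-byte word size are assumptions made for illustration and do not describe any particular device.

```python
WORD_SIZE = 8  # assumed word size in bytes, for illustration only


class Memory:
    """Toy word-granular memory that counts read and write accesses."""

    def __init__(self, num_words):
        self.data = bytearray(num_words * WORD_SIZE)
        self.reads = 0
        self.writes = 0

    def read_word(self, word_addr):
        self.reads += 1
        start = word_addr * WORD_SIZE
        return bytes(self.data[start:start + WORD_SIZE])

    def write_word(self, word_addr, word):
        assert len(word) == WORD_SIZE, "only whole words can be written"
        self.writes += 1
        start = word_addr * WORD_SIZE
        self.data[start:start + WORD_SIZE] = word


def merged_write(mem, word_addr, offset, sub_word):
    """Read the target word, merge in the sub-word data, write the result."""
    assert offset + len(sub_word) <= WORD_SIZE, "sub-word must fit in one word"
    word = bytearray(mem.read_word(word_addr))      # read memory access
    word[offset:offset + len(sub_word)] = sub_word  # merge operation
    mem.write_word(word_addr, bytes(word))          # write memory access


mem = Memory(num_words=4)
mem.write_word(1, bytes(range(0x10, 0x18)))  # existing valid data
mem.reads = mem.writes = 0                   # count only the merged write

merged_write(mem, 1, 2, b"\xAA\xBB")
accesses = (mem.reads, mem.writes)           # one read plus one write
result = bytes(mem.data[8:16])               # inspect storage directly
```

Under these assumptions the neighboring bytes of the target word survive, but a single sub-word write costs two memory accesses instead of one.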
In a masked write operation, mask information is transferred to the memory to indicate portions of the memory word that should not be written (i.e., masked portions of the memory word). Logic within the memory interprets the mask information and stores only the unmasked portion of the memory word.
FIG. 1 is a diagram of a prior art memory system 10 that can be used to perform a masked write operation. The memory system 10 includes a memory controller 12 and a memory 14 coupled to one another by a communication channel 13. To initiate a masked write operation (e.g., in response to a request for a sub-word write from a processor or other agent), the memory controller 12 first issues a write command (CMD) 15 to the memory 14 indicating that a word of data is to be written in the memory 14 at a write address specified in the command 15. Next, the memory controller 12 transfers the memory word to the memory 14. In this case, the memory word is a data packet 16 containing eight data values, Data0-Data7, that are transferred in sequence across the communication channel 13. Lastly, the memory controller 12 transfers mask data 17 to the memory 14 to indicate which bytes within the data packet 16 are masked and which bytes are unmasked. In this case, the mask data 17 is an eight-bit value, with each of bits M0-M7 indicating whether a respective one of the eight data values, Data0-Data7, is masked or unmasked. Logic within the memory 14 stores the unmasked data values at addresses in memory 14 that are offset from the write address according to the respective positions of the data values within the data packet 16. For example, an unmasked data value in position zero of the data packet 16 is stored at the write address, an unmasked data value in position one of the data packet 16 is stored at the write address plus one, and so forth so that, generally, an unmasked data value in position N of the data packet 16 is stored at the write address plus N.
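The storage rule applied by the logic within the memory can be sketched in a few lines. The sketch assumes that mask bit N corresponds to bit N of an integer and that a set bit means "masked"; the description above does not specify either the bit ordering or the polarity, so both are illustrative assumptions, as is the function name `masked_write`.

```python
PACKET_LEN = 8  # eight data values, Data0-Data7


def masked_write(storage, write_addr, packet, mask):
    """Store only the unmasked data values of a packet.

    Assumption: bit N of `mask` is M_N, and a 1 bit marks Data N as masked.
    """
    assert len(packet) == PACKET_LEN
    for n in range(PACKET_LEN):
        masked = (mask >> n) & 1
        if not masked:
            # An unmasked value in position N goes to write address plus N.
            storage[write_addr + n] = packet[n]


storage = bytearray(16)
storage[4:12] = bytes(range(0x10, 0x18))       # existing valid data
packet = bytes(range(0xA0, 0xA8))              # Data0-Data7
mask = 0b11110011                              # only Data2 and Data3 unmasked

masked_write(storage, 4, packet, mask)
stored = bytes(storage[4:12])
```

Unlike a merged write, no read of the target memory region is needed here; the cost is instead the transfer of the mask alongside the data packet.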
One drawback to the masked write scheme illustrated in FIG. 1 is that the mask data 17 consumes bandwidth on the communication channel 13. This is true regardless of the order in which the command 15, data packet 16 and mask data 17 are transferred, and regardless of whether the communication channel 13 is wide enough to support concurrent transfer of the command 15 and the mask data 17. The mask data 17 still consumes bandwidth that could otherwise be used to transfer other commands. This is significant, because memory 14 typically includes multiple memory devices coupled to the communication channel 13, so that availability of the communication channel 13 becomes a performance-limiting characteristic. Consequently, reducing the communication channel bandwidth consumed by masked write operations can substantially improve the performance of the memory system.
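A rough sense of the bandwidth at stake can be obtained from the FIG. 1 example. The calculation below assumes that each of the eight data values is one byte, consistent with the eight-bit mask covering bytes of the data packet, and it leaves out the command 15 because its size is not stated; the figures are therefore illustrative only.

```python
from fractions import Fraction

DATA_VALUES = 8      # Data0-Data7
BITS_PER_VALUE = 8   # assumption: each data value is one byte
MASK_BITS = 8        # M0-M7, one mask bit per data value

data_bits = DATA_VALUES * BITS_PER_VALUE

# Share of the data-plus-mask traffic that is mask data.
mask_share = Fraction(MASK_BITS, data_bits + MASK_BITS)

# Extra traffic relative to sending the data packet alone.
overhead = Fraction(MASK_BITS, data_bits)

print(f"mask share of data+mask traffic: {float(mask_share):.1%}")
print(f"overhead relative to data alone: {float(overhead):.1%}")
```

Under these assumptions the mask accounts for one ninth of the combined data and mask traffic, or one eighth more than the data packet alone, for every masked write operation.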