Under a typical computer system architecture, a data storage controller (DSC) controls access, during read and write cycles, to data storage units (DSUs) that comprise system memory with addressable memory locations, which are generally within a contiguous range of predefined logical addresses. To access the system memory, the DSC processes read and write requests generated by an instruction processing unit (IPU) executing a program that requests data to be read from or written into a particular memory location. Upon receipt of the requests, the DSC initiates corresponding read or write cycles over a memory bus to access the addressed memory locations. The rate at which data is transferred during each memory cycle, i.e., the data throughput, depends on the bus speed, the width of the system's data bus, and the length of a memory location, which is defined in terms of data bits, for example, 8, 16, or 32 bits.
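The dependence of data throughput on bus speed, bus width, and memory location length can be illustrated with a rough estimate. The following is a minimal sketch under simplifying assumptions, not a model of any particular system: the function name and parameters are hypothetical, every memory cycle is assumed to take the same fixed number of bus clocks, and the bits moved per memory cycle are assumed to be limited by the narrower of the data bus and the memory location.

```python
def peak_throughput_bytes_per_sec(bus_clock_hz, bus_width_bits,
                                  location_width_bits, clocks_per_memory_cycle):
    """Rough upper bound on bytes transferred per second over the memory bus.

    Assumption: each memory cycle moves one memory location, and no more
    bits than the data bus can carry in a single transfer.
    """
    # The narrower of the two widths limits the bits moved per memory cycle.
    bits_per_memory_cycle = min(bus_width_bits, location_width_bits)
    # A memory cycle occupies several bus clocks, which lowers the cycle rate.
    memory_cycles_per_sec = bus_clock_hz / clocks_per_memory_cycle
    return memory_cycles_per_sec * bits_per_memory_cycle / 8


# Example: 100 MHz bus, 32-bit data bus, 32-bit locations, 4 clocks per cycle.
wide = peak_throughput_bytes_per_sec(100_000_000, 32, 32, 4)
# Same bus, but 16-bit memory locations.
narrow = peak_throughput_bytes_per_sec(100_000_000, 32, 16, 4)
```

Under these assumptions, halving the memory location length or doubling the number of clocks per memory cycle halves the estimated throughput.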
Each memory cycle, whether read or write, expends a certain number of clock cycles. Because the performance of a computer system is highly dependent on the data throughput, it is necessary to maximize the data transfer rate over the memory bus, ideally so that it reaches the full system clock speed. Various techniques have been devised to increase the data throughput by minimizing the time required to access the system memory. For example, under a scheme known as interleaved memory access, each DSU is addressable over a corresponding internal bus that has an assigned range of physical addresses. In this way, a memory access over one internal bus can start before a prior access over another internal bus completes, provided that the memory bus bandwidth supports the execution of parallel memory accesses over the internal buses. Usually, under the interleaved memory access arrangement, a separate memory management unit (MMU) maps the logical memory addresses into the physical addresses.
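The interleaved arrangement can be sketched as follows. This is an illustrative model only: the low-order interleaving rule, the function names, and the timing assumptions (one new access may be issued per clock, and each DSU services one access at a time) are hypothetical choices made for the example and are not taken from any specific MMU or DSC design.

```python
def map_logical_to_physical(logical_addr, num_dsus, dsu_size):
    """Map a logical address to (DSU index, physical address).

    Assumption: consecutive logical addresses rotate across the DSUs, and
    each DSU owns one contiguous range of physical addresses.
    """
    dsu = logical_addr % num_dsus
    offset = logical_addr // num_dsus
    if offset >= dsu_size:
        raise ValueError("logical address outside system memory")
    return dsu, dsu * dsu_size + offset


def total_clocks(logical_addrs, num_dsus, dsu_size, access_clocks):
    """Clocks needed to finish all accesses when different DSUs may overlap."""
    free_at = [0] * num_dsus   # clock at which each DSU becomes idle again
    issue_clock = 0            # earliest clock at which the next access may issue
    finish = 0
    for addr in logical_addrs:
        dsu, _ = map_logical_to_physical(addr, num_dsus, dsu_size)
        # An access waits until its own DSU is idle, not until all DSUs are.
        start = max(issue_clock, free_at[dsu])
        free_at[dsu] = start + access_clocks
        finish = max(finish, free_at[dsu])
        issue_clock = start + 1
    return finish


addrs = list(range(8))
interleaved = total_clocks(addrs, num_dsus=4, dsu_size=1024, access_clocks=4)
single_dsu = total_clocks(addrs, num_dsus=1, dsu_size=1024, access_clocks=4)
```

In this model, eight sequential accesses spread over four DSUs finish in 11 clocks, whereas the same accesses directed at a single DSU take 32 clocks, because each access must wait for the previous one to complete.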
Although the interleaved memory access scheme improves data throughput for writing into separate physical addresses, the execution of a program sometimes requires writing data to the same address, with each write request modifying a different portion of the addressed memory location. For example, two back-to-back writes to a two-word (32-bit) memory location may modify a first nibble (4 bits) in bit positions 0-3, and a third nibble in bit positions 16-19. Under a conventional arrangement, two write cycles must be initiated to service the write requests to the same address. It is, however, desirable to reduce the number of write cycles needed to modify different portions of the same memory location, in order to increase data throughput for accessing the system memory.
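The example of two partial writes to one 32-bit location can be made concrete with bit masks. The sketch below is only an illustration of the problem and of one conceivable way to combine such writes; it is not presented as the approach of any particular controller. The (data, mask) representation and the function names are assumptions introduced for the example.

```python
WORD_MASK = 0xFFFFFFFF  # 32-bit memory location


def apply_write(old_value, data, mask):
    """One write cycle: replace only the bits selected by mask."""
    return (old_value & ~mask & WORD_MASK) | (data & mask)


def merge_writes(writes):
    """Combine (data, mask) writes to the same address into one (data, mask).

    Where masks overlap, the later write wins, matching the result of
    performing the writes one after another.
    """
    merged_data, merged_mask = 0, 0
    for data, mask in writes:
        merged_data = (merged_data & ~mask) | (data & mask)
        merged_mask |= mask
    return merged_data, merged_mask


old = 0xAAAAAAAA
write_a = (0x00000005, 0x0000000F)  # nibble in bit positions 0-3
write_b = (0x00030000, 0x000F0000)  # nibble in bit positions 16-19

# Conventional arrangement: two separate write cycles.
two_cycles = apply_write(apply_write(old, *write_a), *write_b)

# Combined: a single write cycle carrying both modifications.
one_cycle = apply_write(old, *merge_writes([write_a, write_b]))
```

Both paths leave the location holding 0xAAA3AAA5, but the combined path requires only one write cycle over the memory bus.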