1. Technical Field
The present application relates generally to an improved data processing system and method. More specifically, the present application is directed to a method for supporting partial cache line read and write operations to a memory module to reduce read and write data traffic on a memory channel.
2. Description of Related Art
Contemporary high performance computing main memory systems are generally composed of one or more dynamic random access memory (DRAM) devices, which are connected to one or more processors via one or more memory control elements. Overall computer system performance is affected by each of the key elements of the computer structure, including the performance/structure of the processor(s), any memory cache(s), the input/output (I/O) subsystem(s), the efficiency of the memory control function(s), the main memory device(s), and the type and structure of the memory interconnect interface(s).
Extensive research and development efforts are invested by the industry, on an ongoing basis, to create improved and/or innovative solutions for maximizing overall system performance and density by improving the memory system/subsystem design and/or structure. High-availability systems, i.e., systems that must be available to users without failure for long periods of time, present further challenges related to overall system reliability due to customer expectations that new computer systems will markedly surpass existing systems with regard to mean-time-between-failure (MTBF), in addition to offering additional functions, increased performance, increased storage, lower operating costs, etc. Other frequent customer requirements further exacerbate the memory system design challenges, and include such items as ease of upgrade and reduced system environmental impact, such as space, power, and cooling.
Furthermore, with the movement to multi-core and multi-threaded processor designs, new demands are being placed on the memory subsystem to supply very large data bandwidths and memory capacity into a single processor memory module socket. At a system level, the bandwidth and memory capacity available from the memory subsystem are directly proportional to the number of memory modules that can be supported by the processor pin counts, the frequency at which the processor pins operate, and how efficiently the processor pins are used to transfer data. That is, the memory modules connect to the processor through a memory interface bus and memory module sockets, which may also be called a memory channel. The memory module sockets comprise pins that connect to the pins located on a common edge of a memory module. Thus, the pin count of the memory modules and the pin count of the memory module sockets, which are connected to the processor, define the bandwidth and memory capacity of the memory system.
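As a rough illustration of the proportionality described above, the following sketch estimates usable channel bandwidth from the number of data pins, the per-pin signaling rate, and the fraction of pin time spent transferring useful data. The function name, its parameters, and the sample values are hypothetical and are chosen only for illustration; they are not taken from the present application.

```python
def channel_bandwidth_gbytes(data_pins: int,
                             gbits_per_pin: float,
                             efficiency: float) -> float:
    """Estimate usable channel bandwidth in GB/s.

    Hypothetical model: bandwidth scales linearly with the number of
    data pins, the signaling rate of each pin (Gbit/s), and the
    fraction of pin time that carries useful data (0.0 to 1.0).
    """
    if data_pins < 0 or gbits_per_pin < 0:
        raise ValueError("pin count and pin rate must be non-negative")
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0.0 and 1.0")
    # Divide by 8 to convert gigabits per second to gigabytes per second.
    return data_pins * gbits_per_pin * efficiency / 8.0


# Assumed example values: 64 data pins at 4 Gbit/s each, 75% efficient.
example_bandwidth = channel_bandwidth_gbytes(64, 4.0, 0.75)
```

Under this simplified model, doubling the pin count, the pin frequency, or the transfer efficiency doubles the estimated bandwidth, which is why a fixed pin count makes efficient use of the pins important.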
For high-bandwidth memory systems, multiple hub-based memory modules and/or multi-ported hub-based memory modules may be used in the memory system to generate enough bandwidth to fill the high-bandwidth memory channel. With a memory system that uses multiple hub-based memory modules and/or multi-ported hub-based memory modules, the total amount of bandwidth that is available on the memory modules may be significantly higher than the bandwidth available on the memory channel. Thus, the memory channel presents a limiting factor, or bottleneck, for the flow of data to/from the memory modules of a memory system.
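The bottleneck described above may be illustrated with a minimal sketch, assuming that the bandwidth deliverable to the processor is the smaller of the aggregate module bandwidth and the channel bandwidth. The function names and the sample figures are hypothetical and serve only as an example.

```python
from typing import Sequence


def deliverable_bandwidth(module_bandwidths: Sequence[float],
                          channel_bandwidth: float) -> float:
    """Return the bandwidth (GB/s) that can actually reach the processor.

    Hypothetical model: the modules can collectively source the sum of
    their individual bandwidths, but the shared memory channel caps
    the total that can be transferred.
    """
    aggregate = sum(module_bandwidths)
    return min(aggregate, channel_bandwidth)


def channel_is_bottleneck(module_bandwidths: Sequence[float],
                          channel_bandwidth: float) -> bool:
    """True when the modules can supply more than the channel can carry."""
    return sum(module_bandwidths) > channel_bandwidth


# Assumed example: four modules of 12.8 GB/s each behind a 25.6 GB/s channel.
modules = [12.8, 12.8, 12.8, 12.8]
limited = deliverable_bandwidth(modules, 25.6)
bottleneck = channel_is_bottleneck(modules, 25.6)
```

In this assumed configuration the modules can supply roughly twice what the channel can carry, so any reduction in the data sent across the channel translates directly into usable headroom.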