As processor operating speeds have increased and multi-core processors have been introduced, the data throughput of processors has increased. However, the data throughput of system memory devices, such as dynamic random access memory (“DRAM”), has not increased as fast as that of processors, so the performance of computer systems is now limited by the data throughput of system memory.
Various attempts have been made to increase the data throughput of system memory devices. For example, multi-channel system memory buses have been used to double or triple the bandwidth. However, multi-channel system memory buses require increasingly complex printed circuit board (PCB) designs and can increase interference between buses.
It has been proposed to stack several memory device dice and a logic die in the same package, as in FIG. 1. The processor is connected directly to the logic die via a relatively narrow, high-speed, two-way bus. The logic die in turn is connected to the memory devices, here DRAM, through wide, low-speed buses.
FIG. 2 is an illustration of the typical architecture of the memory devices used in FIG. 1. Each memory device is divided into 16 partitions, and each partition includes several banks. Corresponding partitions of the memory devices are stacked on top of each other and connected through wide buses. One proposal is to implement the wide buses with Through Silicon Vias (TSVs). Each set of stacked partitions may be referred to as a vault. The vaults may be independently accessed for read and write operations.
A problem that may arise with the FIG. 2 architecture is timing skew between the signals transmitted from each of the memory devices. Because the distance to the logic die is different for each memory device die, the time required for signals to travel from each memory device die to the logic die will also be different. Additionally, because of process, supply voltage and temperature variations, the timing performance of the memory devices may vary.
FIG. 3 illustrates the signal skews resulting from four stacked DRAM devices, DRAM 0-3. The logic die can capture valid data only in the hatched area, where the data from all four DRAMs overlap. The data valid period of each memory device is large enough for the logic die to capture the read data from each individual die. However, the composite data valid period for all of the memory device dice is significantly reduced. The result is a greatly reduced throughput of data. Accordingly, there is a need in the industry for a stacked memory device with increased throughput.
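The narrowing of the composite data valid period described for FIG. 3 can be illustrated with a small numerical sketch. The function name, the 500 ps window width and the per-die skew values below are hypothetical and chosen only for illustration; they are not taken from the figures. The sketch assumes the composite window is simply the intersection of the individual data valid windows.

```python
from typing import List, Optional, Tuple

# (start, end) of a data valid period, in picoseconds (illustrative units)
Window = Tuple[int, int]


def composite_valid_window(windows: List[Window]) -> Optional[Window]:
    """Return the interval in which every die's data is valid, or None if there is none."""
    if not windows:
        return None
    start = max(w[0] for w in windows)  # latest moment any die's data becomes valid
    end = min(w[1] for w in windows)    # earliest moment any die's data stops being valid
    return (start, end) if start < end else None


# Hypothetical values: each of four dice (DRAM 0-3) holds its data valid for 500 ps,
# but each die's window arrives at the logic die with a different skew.
VALID_PS = 500
skews_ps = [0, 100, 200, 300]
windows = [(s, s + VALID_PS) for s in skews_ps]

composite = composite_valid_window(windows)
composite_width = composite[1] - composite[0] if composite else 0
```

With these assumed numbers, each die individually offers a 500 ps window, while the window common to all four dice shrinks to 200 ps, which is the effect the hatched area in FIG. 3 represents.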