Basically, a DRAM is an integrated circuit that stores data in binary form (e.g., “1” or “0”) in a large number of cells. The data is stored in a cell as a charge on a capacitor located within the cell. Typically, a high logic level is approximately equal to the power supply voltage and a low logic level is approximately equal to ground.
The cells of a conventional DRAM are arranged in an array so that individual cells can be addressed and accessed. The array can be thought of as rows and columns of cells. Each row includes a word line that interconnects cells on the row with a common control signal. Similarly, each column includes a bit line that is coupled to at most one cell in each row. Thus, the word and bit lines can be controlled so as to individually access each cell of the array.
To read data out of a cell, the capacitor of the cell is accessed by selecting the word line associated with the cell. A complementary bit line that is paired with the bit line for the selected cell is equilibrated to an equilibration voltage. This equilibration voltage (Veq) is typically midway between the high (Vdd) and low (Vss, typically ground) logic levels. Thus, conventionally, the bit lines are equilibrated to one-half of the power supply voltage, Vdd/2. When the word line is activated for the selected cell, the capacitor of the selected cell shares its stored charge with the bit line, thus changing the voltage on the bit line. A differential amplifier, conventionally referred to as a sense amplifier, is then used to detect and amplify the difference in voltage on the pair of bit lines.
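The read operation described above can be illustrated with the standard charge-sharing relation: once the word line is activated, the bit line moves away from Veq by (Vcell - Veq) x Ccell / (Ccell + Cbl). The following is a minimal sketch only; the supply voltage (1.0 V), the cell capacitance (30 fF) and the bit line capacitance (150 fF) are assumed values chosen for illustration and are not taken from the present description.

```python
# Illustrative charge-sharing model of a DRAM read.
# All numeric values below are assumptions, not values from the text.

def bitline_swing(v_cell, v_eq, c_cell, c_bl):
    """Voltage change on the bit line after it shares charge with the cell."""
    return (v_cell - v_eq) * c_cell / (c_cell + c_bl)


def sense(v_bl, v_bl_ref):
    """Idealized sense amplifier: compares a bit line with its complement."""
    return 1 if v_bl > v_bl_ref else 0


VDD = 1.0                 # assumed power supply voltage (high logic level)
VSS = 0.0                 # ground (low logic level)
V_EQ = (VDD + VSS) / 2    # equilibration voltage, Vdd/2
C_CELL = 30e-15           # assumed cell capacitance
C_BL = 150e-15            # assumed bit line capacitance

swing_one = bitline_swing(VDD, V_EQ, C_CELL, C_BL)    # cell stores "1"
swing_zero = bitline_swing(VSS, V_EQ, C_CELL, C_BL)   # cell stores "0"

# The complementary bit line stays at V_EQ and serves as the reference.
read_one = sense(V_EQ + swing_one, V_EQ)
read_zero = sense(V_EQ + swing_zero, V_EQ)
```

With these assumed values the bit line moves by only a fraction of Vdd in either direction, which is why a differential sense amplifier is needed to restore full logic levels.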
In order to comply with the area constraints of a memory, a stacking technique, the so-called “staggering” technique, is conventionally used to accommodate the pitch difference between the sense amplifiers and the cells. Several sense amplifiers are therefore staggered one behind another in the longitudinal direction of the bit lines. However, this architecture has the drawback that a bit line and its complementary bit line run over all the staggered sense amplifiers. This congests the available space, as metal-0 (the metal used for the bit lines) covers 100% of the sense amplifiers. Moreover, addressing a specific cell of the memory requires row and column address buses built from metal tracks, generally metal-1 tracks. When 64 column address buses are used to decode the sense amplifiers of the sense amplifier array, around 100 metal-1 tracks need to be present for power supplies, control commands, I/Os and decoding (64 tracks for this last group). In the near future, however, considerable attention will have to be paid to the core circuits of a DRAM, especially the sense amplifier. Indeed, with the introduction of FDSOI (Fully Depleted Silicon On Insulator) technology or of high-k/metal gates, devices will get smaller and the metal lines, rather than the size of the devices, could become the limiting factor. It is therefore understood that 100 metal-1 tracks are far too many.
FIG. 1 shows a memory architecture that helps limit the congestion of the available space by dividing the memory cell array into sub-arrays MC0, MC1, MC2, MC3, by splitting the sense amplifiers into pairs of staggered sense amplifier banks, and by arranging the bit lines in an interleaved manner so that they alternate in the lateral direction of the word lines WL between a bit line BL0, BL2 coupled to a sense amplifier SA0, SA2 of the first bank of the pair and a bit line BL1, BL3 coupled to a sense amplifier SA1, SA3 of the second bank of the pair. This alternating arrangement of the bit lines results in interconnect spaces being available, parallel to the bit lines, in each sense amplifier bank of the pair. With this alternating arrangement, metal-0 now covers only 50% of the sense amplifiers. With relaxed constraints on the sense amplifiers, the layout is easier.
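The interleaved arrangement of FIG. 1 can be sketched as follows. This is a simplified model, not a layout tool: it assumes that bit lines are numbered consecutively in the word line direction and that a bit line track passes over a sense amplifier bank only if the bit line is coupled to that bank. The function names are hypothetical.

```python
# Simplified model of interleaved bit lines over a pair of sense amplifier
# banks. Function names and the coverage model are illustrative assumptions.

def assign_bank(bit_line_index):
    """Even bit lines go to the first bank (0), odd ones to the second (1)."""
    return bit_line_index % 2


def bank_layout(num_bit_lines):
    """Map each bank of the pair to the bit lines coupled to it."""
    banks = {0: [], 1: []}
    for i in range(num_bit_lines):
        banks[assign_bank(i)].append(f"BL{i}")
    return banks


def metal0_coverage(num_bit_lines, bank):
    """Fraction of the bit line tracks that run over the given bank."""
    return len(bank_layout(num_bit_lines)[bank]) / num_bit_lines


layout = bank_layout(4)
coverage_first_bank = metal0_coverage(4, 0)
```

For four bit lines, BL0 and BL2 land in the first bank and BL1 and BL3 in the second, so each bank sees half of the metal-0 tracks, consistent with the 50% figure given above.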
In FIG. 1, only the relevant signals are represented for clarity:
Row decode signals φPCH, running in the X direction and using metal-1, are used to address a line of sense amplifiers;
Column decode signals, running on CoLumn Select lines (CSLi, CSLj) in the Y direction (column decode) and using metal-2, are used to address a column of sense amplifiers;
Local Input/Output lines (LIO and its complementary line), using metal-1, are used to transfer the data sensed and amplified from the sense amplifiers to Global Input/Output lines (GIOm, GIOn and their complementary lines) running perpendicularly to the Local Input/Output lines and using metal-2. The length of the Local Input/Output lines (i.e. the number of sense amplifiers tied to them) depends on layout constraints, staggering, metal-2 pitch rules, circuit specification, etc.
Each CoLumn Select line (CSLi, CSLj) decodes a column of sense amplifiers in the banks that lie on its path. The selected sense amplifiers SA0, SA1, SA2, SA3 behave validly (read or write), while the half-selected ones SA4, SA5 remain in the HZ (high impedance) state and do not disturb the Global Input/Output lines, apart from adding parasitics that have to be charged and discharged.
The data present on the Global Input/Output lines propagates into all the Local Input/Output lines, and a precharge therefore has to be performed at the beginning of the following access to ensure proper sensing and refresh. This precharge cannot be performed in advance. Considering the number of sense amplifiers and the total metal length (Global and Local Input/Output lines), significant power can be dissipated at that point.
In addition, a conventional sense amplifier fabricated in bulk silicon CMOS technology is made of eleven transistors and thus increases the surface area of the entire circuit.
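The distinction between selected and half-selected sense amplifiers can be sketched as a small truth table. The sketch assumes that "half-selected" means that exactly one of the two decode signals (column select or row decode) is active for a given sense amplifier; the present description does not spell out this definition, and the names used below are hypothetical.

```python
# Illustrative classification of a sense amplifier from its two decode
# signals. The definition of "half-selected" used here is an assumption.
from enum import Enum


class SAState(Enum):
    SELECTED = "selected (valid read or write)"
    HALF_SELECTED = "half-selected (HZ, high impedance)"
    UNSELECTED = "unselected"


def classify(csl_active, row_active):
    """Classify a sense amplifier from its column and row decode signals."""
    if csl_active and row_active:
        return SAState.SELECTED
    if csl_active or row_active:
        return SAState.HALF_SELECTED
    return SAState.UNSELECTED


# Enumerate all four combinations of the two decode signals.
truth_table = {
    (csl, row): classify(csl, row)
    for csl in (False, True)
    for row in (False, True)
}
```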
Several solutions are possible to overcome the parasitic issues and possible power peaks.
According to a first solution, a local decoder (referred to as switch S in FIG. 1) can be added between the Local I/O lines and the Global I/O lines. In that case, the unselected Local I/O lines remain undisturbed by the Global I/O lines and can be precharged in advance, allowing very fast cycle times.
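The effect of the local switch can be sketched by counting which Local I/O lines are disturbed by an access. This is a behavioral illustration only, assuming that a single Local I/O line is selected per access and that the Local I/O lines are identified by hypothetical integer indices.

```python
# Behavioral sketch: which Local I/O lines are disturbed by an access,
# with and without a local switch S. Indices and names are assumptions.

def disturbed_lio_lines(num_lio, selected, has_local_switch):
    """Return the indices of the Local I/O lines that see the Global I/O data."""
    if has_local_switch:
        # Only the selected line is connected to the Global I/O lines.
        return [selected]
    # Without a switch, the data enters all the Local I/O lines.
    return list(range(num_lio))


def precharge_in_advance(num_lio, selected, has_local_switch):
    """Lines left undisturbed, which can therefore be precharged in advance."""
    disturbed = set(disturbed_lio_lines(num_lio, selected, has_local_switch))
    return [i for i in range(num_lio) if i not in disturbed]


without_switch = precharge_in_advance(8, selected=2, has_local_switch=False)
with_switch = precharge_in_advance(8, selected=2, has_local_switch=True)
```

Without the switch no line can be precharged ahead of the next access, whereas with the switch every line except the selected one can.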
According to a second solution, a decoder, which can be as simple as, for instance, a NOR or a NAND gate, can be added between a CoLumn Select line and the row decode signal φPCH. With this second solution, the content of the half-selected sense amplifiers remains unaffected by the Local I/O lines. The load along the CoLumn Select lines can also be reduced (the decoder being used as a local signal booster) while the cycle time may be improved. This second solution is described in particular in French patent application No. 1152256, filed by the Applicant on Mar. 18, 2011 and not yet published.
The first and second solutions can be applied simultaneously, which affords very good performance but may not be optimal from the layout point of view. Indeed, the only possible location for these decoders is immediately next to the sense amplifiers (or even within the sense amplifier layout), which introduces an “irregular” layout in a very sensitive region.
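The gating performed by such a local decoder can be sketched at the logic level. The signal polarities are assumptions made for illustration: the NAND variant is taken to receive active-high inputs and deliver an active-low enable, while the NOR variant is taken to receive active-low inputs and deliver an active-high enable. The description above does not specify the polarities actually used.

```python
# Logic-level sketch of a local decoder combining a column select signal
# with the row decode signal. Signal polarities are assumptions.

def nand(a, b):
    return not (a and b)


def nor(a, b):
    return not (a or b)


def enable_n_from_nand(csl, row):
    """Active-high inputs; the NAND output is an active-low enable."""
    return nand(csl, row)


def enable_from_nor(csl_n, row_n):
    """Active-low inputs; the NOR output is an active-high enable."""
    return nor(csl_n, row_n)


# Both variants enable the sense amplifier only when the column select
# and the row decode signal are asserted at the same time.
results = []
for csl in (False, True):
    for row in (False, True):
        via_nand = not enable_n_from_nand(csl, row)
        via_nor = enable_from_nor(not csl, not row)
        results.append((csl, row, via_nand, via_nor))
```

In both variants a sense amplifier that is reached by only one of the two signals stays disconnected from the Local I/O lines, which is the property the second solution relies on.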