1. Field of the Invention
The present invention relates to semiconductor memory devices, and more particularly, to an arrangement of a memory array readily allowing expansion of a data bit width and enabling fast data access.
2. Description of the Background Art
FIG. 52 is a diagram schematically showing an arrangement of an array mat of a conventional semiconductor memory device (a DRAM: a Dynamic Random Access Memory) having 64-M bit storage capacity. Referring to FIG. 52, a semiconductor memory device CH includes four memory mats MTa to MTd each having 16-M bit storage capacity. Peripheral circuits PHa to PHc are each arranged in a central region among memory mats MTa to MTd. Peripheral circuit PHa includes a circuit for controlling an access operation for memory mats MTa to MTd. Peripheral circuits PHb and PHc include a data input/output circuit and an external signal input circuit for producing an internal signal for peripheral circuit PHa.
As shown in FIG. 52, the memory array is so divided into four memory mats MTa to MTd as to reduce the lengths of word and bit lines respectively arranged corresponding to a row of and a column of memory cells, thereby enabling memory cell selection operation at high speed.
FIG. 53 is a diagram showing an arrangement of memory mats MTa to MTd shown in FIG. 52. One memory mat MT is shown in FIG. 53. As shown in FIG. 53, memory mat MT is divided into sixteen memory sub blocks MSB by word line shunt regions WS in a row direction, and also divided into thirty-two memory sub blocks MSB by sense amplifier bands SAB in a column direction. A column selection line CSL is provided extending from a column decoder CD arranged on one side of memory mat MT in the column direction, and shared by memory sub blocks MSB aligned in the column direction. A column selection signal from column decoder CD is transmitted onto column selection line CSL. Here, the word line shunt region corresponds to a region where a gate electrode layer formed of a material having a relatively high resistance such as polysilicon and having the memory cells connected thereto, and a low resistance conductive layer provided above the gate electrode layer and formed, for example, of aluminum are electrically connected. By connecting the upper low resistance conductive layer and the lower gate electrode layer in word line shunt region WS, a resistance of the word line is equivalently reduced. Sense amplifier band SAB includes sense amplifier circuits arranged corresponding to the columns of memory sub blocks MSB for sensing, amplifying and latching memory cell data of corresponding columns when activated.
Memory mat MT is divided into 16 × 32 = 512 memory sub blocks MSB. Each memory sub block MSB includes 32K memory cells arranged in 256 rows × 128 columns. By dividing memory mat MT into a plurality of memory sub blocks MSB, only the memory sub block that includes a selected memory cell is driven, so that current consumption is reduced. In addition, the number of memory cells connected to a bit line (the memory cell column) in the selected memory sub block is reduced, so that a proportionate reduction in bit line capacitance is achieved, thereby increasing the read voltage appearing on the bit line at the time of memory cell selection.
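As an illustrative check of the figures stated above, the sub block count and the capacity of one memory mat can be recomputed as follows. This is only a sketch; the constant and variable names are hypothetical and do not appear in the disclosure.

```python
# Geometry of one conventional memory mat, using the figures stated above.
SUB_BLOCKS_ROW_DIR = 16    # divisions made by word line shunt regions WS
SUB_BLOCKS_COL_DIR = 32    # divisions made by sense amplifier bands SAB
ROWS_PER_SUB_BLOCK = 256
COLS_PER_SUB_BLOCK = 128

sub_blocks_per_mat = SUB_BLOCKS_ROW_DIR * SUB_BLOCKS_COL_DIR
cells_per_sub_block = ROWS_PER_SUB_BLOCK * COLS_PER_SUB_BLOCK
cells_per_mat = sub_blocks_per_mat * cells_per_sub_block

print(sub_blocks_per_mat)               # 512 sub blocks
print(cells_per_sub_block // 1024)      # 32 (K cells per sub block)
print(cells_per_mat // (1024 * 1024))   # 16 (M bits per mat)
```

The product of the two figures reproduces the 16-M bit capacity of one memory mat given for FIG. 52.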
FIG. 54 is a diagram showing in further detail the arrangement of memory mat MT shown in FIG. 53. In FIG. 54, two memory sub blocks MSBa and MSBb included in memory mat MT are shown. Local I/O line pairs LIOa to LIOd are provided which are shared by two memory sub blocks MSBa and MSBb aligned in the row direction. Local I/O line pairs LIOa to LIOd extend along memory sub blocks MSBa and MSBb in the column direction, and another local I/O line pair is provided for an adjacent memory sub block, which is not shown in the drawing.
Word line shunt region WS is provided between memory sub blocks MSBa and MSBb, in which main I/O line pairs MIOa to MIOd are arranged. These main I/O line pairs MIOa to MIOd are shared by memory sub blocks aligned in the column direction. The local I/O line pairs provided for the memory sub block selected by the row decoder, not shown, are connected to the main I/O line pairs.
A column selection signal from a column decoding circuit CDK included in column decoder CD is transmitted onto column selection line CSL arranged over the memory sub blocks in the column direction. Column selection line CSL is shared by the memory sub blocks aligned in the column direction, for transmitting the column selection signal to the memory sub blocks. Main amplifiers MAPa to MAPd are respectively provided for main I/O line pairs MIOa to MIOd, for performing amplification of data on the main I/O line pairs when activated.
In the arrangement shown in FIG. 54, when column selection line CSL is selected, the memory cells corresponding to four columns are selected in memory sub block MSBb. Thereafter, the columns are respectively connected to local I/O line pairs LIOa to LIOd, and then to main I/O line pairs MIOa to MIOd.
Eight of the arrangements shown in FIG. 54 are provided in the row direction. In the case of a four-way method, in which four sense amplifiers (columns) are selected by a single column selection line CSL and connected to the local I/O line pairs, eight of the sixteen memory sub blocks are connected to the main I/O line pairs through the local I/O line pairs, and 4 × 8 = 32 bits of data can be simultaneously selected in total. In this case, 2K (= 128 × 16) sense amplifiers are provided in the row direction and memory cell data is latched at each sense amplifier. Therefore, 32-bit data designated by a column address is selected out of 2-Kbit data.
As shown in FIG. 52, four memory mats MT are provided in the semiconductor memory device, so that 128-bit word data can be transferred in a chip. When the mat is used as a bank, data of 32-bit word per bank is input/output.
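The bit counts above can be reproduced with a short sketch. It assumes, as described for FIG. 54, that two sub blocks adjacent in the row direction share one group of local I/O line pairs; the function and parameter names are hypothetical.

```python
def selected_bits_per_mat(sub_blocks_in_row_dir, columns_per_csl,
                          sub_blocks_per_lio_group=2):
    """Bits selected at once in one mat.

    Assumes each group of local I/O line pairs is shared by
    `sub_blocks_per_lio_group` sub blocks adjacent in the row direction,
    and that one column selection line selects `columns_per_csl` columns.
    """
    lio_groups = sub_blocks_in_row_dir // sub_blocks_per_lio_group
    return lio_groups * columns_per_csl


bits_per_mat = selected_bits_per_mat(16, 4)   # four-way method, 16 sub blocks
sense_amps_per_row = 128 * 16                 # columns per sub block x sub blocks
bits_per_chip = 4 * bits_per_mat              # four memory mats MTa to MTd

print(bits_per_mat, sense_amps_per_row, bits_per_chip)  # 32 2048 128
```

Only 32 of the 2048 latched bits leave the mat per access, which is the limitation the remainder of this section is concerned with.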
FIG. 55 is a diagram schematically showing an arrangement of a mat of a conventional 256-M bit DRAM. Referring to FIG. 55, the DRAM includes memory mats MT0 to MT7 aligned in the row direction and memory mats MT8 to MT15 arranged facing to memory mats MT0 to MT7. Peripheral circuitry PH is provided in the row direction between memory mats MT0 to MT7 and MT8 to MT15. Peripheral circuitry PH includes a data input/output circuit and an address signal and external control signal input circuit.
Each of memory mats MT0 to MT15 has 16-M bit storage capacity. Main word driver groups MWD are arranged between the adjacent memory mats. In the 256-M bit DRAM, in order to drive word lines into a selected state at high speed, a hierarchical structure of main and sub word lines is employed. A memory cell is connected to the sub word line, but not to the main word line. A sub word line driver is arranged between the main and sub word lines. The main word line, having a small load capacitance, is driven into the selected state at high speed and, responsively, the sub word line provided at the remote end thereof is also driven into the selected state at high speed.
Each of memory mats MT0 to MT15 shown in FIG. 55 is divided into a plurality of memory sub blocks MSB in the row and column directions, as in the case of the above-mentioned memory mats shown in FIGS. 53 and 54. The memory sub block is increased in size with the provision of the sub word line drive circuit, so that it has 128-Kbit (512 rows × 256 columns) storage capacity, that is, double the number of rows and double the number of columns of the arrangement shown in FIGS. 53 and 54. Thus, memory mat MT (MT0 to MT15) is divided into eight memory sub blocks in the row direction. The arrangements of the local I/O line pairs and main I/O line pairs are the same as those shown in FIG. 54. Accordingly, 4 × 4 = 16-bit memory cells are selected in a single mat for data input/output. When the DRAM shown in FIG. 55 has a four-bank structure, data of 64-bit word per bank can be input/output, as a single bank is formed of four memory mats.
In the conventional semiconductor memory device (DRAM), data input/output is performed through the main I/O line pairs arranged in the word line shunt regions or the sub word line driver regions. To perform data transfer at high speed between the DRAM and a processor provided outside the DRAM, the data bit width is desirably as wide as possible, so that a number of data words can be transferred in a single data transfer cycle. In the conventional DRAM, however, the bit width of the data word cannot be made sufficiently wide.
Generally, an SDRAM (a Synchronous DRAM) has four banks, an RDRAM (a Rambus DRAM) has sixteen banks, and an SLDRAM (a Sync Link DRAM) has eight banks. These multiple-bank structures prevent the increase in access time that is otherwise caused, at the time of page boundary access, by driving a selected word line into a non-selected state and then driving another word line (row) into a selected state. More specifically, successive access is accomplished without any increase in access time at the time of page switching by sequentially driving a plurality of banks into an active state in an interleaved manner and accessing another bank at the time of page switching. If a large number of banks are provided, even when a plurality of clock cycles are required for row selection operation, bank access is performed for each clock cycle by sequentially driving the banks into the selected state, so that the DRAM can be operated in accordance with a clock signal at high speed.
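The benefit of bank interleaving can be illustrated with a deliberately simplified timing model. It is a sketch under stated assumptions, not a model of any particular device: every access opens a new row, a bank stays busy for `row_cycles` clock cycles once accessed, at most one access is issued per clock cycle, and banks are used in round-robin order. The function name and parameters are hypothetical.

```python
def cycles_for_accesses(num_accesses, num_banks, row_cycles):
    """Clock cycle at which the last of `num_accesses` row accesses completes.

    Simplified model (an assumption, not taken from the source):
    accesses go to banks in round-robin order, a bank is busy for
    `row_cycles` cycles per access, and one access is issued per cycle.
    """
    bank_free_at = [0] * num_banks
    next_issue = 0
    finish = 0
    for i in range(num_accesses):
        bank = i % num_banks
        start = max(next_issue, bank_free_at[bank])  # wait until the bank is free
        finish = start + row_cycles
        bank_free_at[bank] = finish
        next_issue = start + 1                       # one issue per clock cycle
    return finish


# Eight accesses, each needing four cycles of row selection.
print(cycles_for_accesses(8, num_banks=1, row_cycles=4))  # 32
print(cycles_for_accesses(8, num_banks=2, row_cycles=4))  # 17
print(cycles_for_accesses(8, num_banks=4, row_cycles=4))  # 11
```

In this model, once the number of banks reaches the number of cycles needed for row selection, a new access can start every clock cycle, which corresponds to the behavior described above.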
Therefore, the number of banks provided in semiconductor memory devices has recently been increasing. When the number of banks increases, the number of memory cells allotted to a single bank is reduced, whereby allotting a sufficient bit width to a single data word becomes difficult (when the bit width increases, the number of words stored in a single bank is reduced). When the DRAM shown in FIG. 55 has a four-bank structure, 64-bit word data can be accessed for a single bank. In the case of an eight-bank structure, however, the memory cells are only selected from two memory mats, so that no more than 32-bit word data can be accessed for one bank.
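The trade-off between bank count and word width for the arrangement of FIG. 55 can be tabulated with a small sketch. It assumes that the sixteen mats are divided evenly among the banks and that each mat contributes 16 bits, as stated above. The four-bank and eight-bank results match the text; the sixteen-bank row is an extrapolation under the same assumptions, and all names are hypothetical.

```python
TOTAL_MATS = 16      # memory mats MT0 to MT15
BITS_PER_MAT = 16    # bits selected per mat in the conventional arrangement


def bits_per_bank(num_banks):
    """Accessible word width per bank when the mats are split evenly among banks."""
    if TOTAL_MATS % num_banks != 0:
        raise ValueError("mats must divide evenly among banks")
    return (TOTAL_MATS // num_banks) * BITS_PER_MAT


for banks in (4, 8, 16):
    print(banks, bits_per_bank(banks))
# 4 64
# 8 32
# 16 16   (extrapolated; not stated in the source)
```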
To achieve high speed data transfer between the processor and the DRAM, generally, a plurality of data are internally prefetched. The prefetched data are sequentially transmitted in synchronization with a clock signal.
FIG. 56 is a diagram schematically showing an arrangement of a data reading portion of a conventional clock synchronous semiconductor memory device (a synchronous DRAM). Referring to FIG. 56, a plurality of data simultaneously read from a memory bank BK are latched at a latch circuit LK. Latch circuit LK sequentially applies the latched data to a transfer circuit TK in accordance with a selection signal φsel. Transfer circuit TK transfers data applied from latch circuit LK in synchronization with a clock signal CLK for output through an output circuit, which is not shown in the drawing.
Generally, the longest time is required for the path through which data is transferred from a sense amplifier in memory bank BK (one or a plurality of memory mats) to latch circuit LK, that is, the path for transmitting data latched at the sense amplifier to a main amplifier through an internal main I/O line pair. By simultaneously transferring a plurality of data to latch circuit LK and sequentially selecting and outputting them in synchronization with clock signal CLK, the longest delay, due to data transfer from memory bank BK to latch circuit LK, is effectively hidden. A number of data are desirably latched at latch circuit LK and sequentially transferred in order to increase the speed of data transfer. Especially in recent clock synchronous DRAMs, data input/output is performed in a DDR (Double Data Rate) method, in which external data is transferred in synchronization with both rising and falling edges of a clock signal. When data transfer is performed at double the speed of such a clock signal, two latch circuits LK are provided, for example, so that data are alternately transferred to and latched at the latch circuits and then sequentially transferred. In this way, data transfer at high speed can be achieved without being affected by the path associated with the longest delay. In such high speed data transfer, however, prefetching a number of data limits the number of bits per data word, because the number of main I/O line pairs remains constant. Therefore, the number of bits of a data word that can be externally transferred in one clock cycle is limited, and high speed transfer cannot be achieved.
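The limitation described above can be sketched as follows. The model assumes that one internal access delivers a fixed number of bits, set by the number of main I/O line pairs, and that these bits are split evenly into `prefetch_depth` words that are output one after another. The function names and the example widths are hypothetical.

```python
def external_word_width(bits_per_internal_access, prefetch_depth):
    """Width of each externally transferred word when one internal access
    is split into `prefetch_depth` sequentially output words."""
    if bits_per_internal_access % prefetch_depth != 0:
        raise ValueError("prefetched bits must divide evenly into words")
    return bits_per_internal_access // prefetch_depth


def serialize(prefetched_bits, prefetch_depth):
    """Split one latched group of bits into the words output in sequence,
    as the latch circuit does under control of a selection signal."""
    width = external_word_width(len(prefetched_bits), prefetch_depth)
    return [prefetched_bits[i * width:(i + 1) * width]
            for i in range(prefetch_depth)]


# With 64 bits per internal access, deeper prefetch narrows the external word.
print(external_word_width(64, 1))   # 64
print(external_word_width(64, 2))   # 32
print(external_word_width(64, 4))   # 16

print(serialize(list(range(8)), 2))  # [[0, 1, 2, 3], [4, 5, 6, 7]]
```

Because the number of bits per internal access is fixed, raising the prefetch depth raises the transfer rate only at the cost of a narrower external data word.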
In addition, although advances in process technology have allowed finer design rules, chip size is steadily increasing due to the increase in storage capacity. In order to make the chip size as small as possible to achieve reduction in cost, the division units of bit and word lines are increased to give a memory sub block larger storage capacity, and the number of circuits other than the memory cells is reduced to achieve reduction in the area occupied by peripheral circuitry. The increase in division units of bit and word lines amounts to a reduction in the number of regions between the blocks, that is, the regions in which the main I/O line pairs are arranged. Therefore, even when the size of the memory mat, the number of sense amplifiers activated at a time and the number of data bits latched at a time are increased, the number of data bits that can be transferred outside the memory array is limited by the number of main I/O line pairs. To solve this problem, the number of interconnection lines can be increased by increasing the area for arranging the local and main I/O line pairs, that is, the layout area for the sense amplifier band, the sub word line driver, and the like. In this case, however, an increase in chip size is inevitable due to the increase in layout area.
Therefore, the problem associated with the array arrangement of the conventional semiconductor memory device is that the data word with a sufficient bit width cannot be generated and high speed data transfer cannot be achieved.
Further, when storage capacity is increased, the size of the memory mat also increases in the column direction. Referring now to FIG. 57, a memory sub block MSB0 arranged farthest away from a column decoder and a memory sub block MSBn closest to the column decoder will be considered. These memory sub blocks MSB0 and MSBn are aligned in the column direction and share a column selection line CSL transmitting a column selection signal from the column decoder, which is not shown in the drawing. Local I/O line pairs LIO0 and LIOn are respectively provided for memory sub blocks MSB0 and MSBn, and a main I/O line pair MIO is so arranged as to be shared by memory sub blocks MSB0 and MSBn. A main amplifier MAP and a write driver WRD are provided for main I/O line pair MIO.
In the synchronous semiconductor memory device, data reading/writing is performed by simultaneously applying a read/write command and a column address. In writing, data write driver WRD is activated for transmitting write data to main I/O line pair MIO. In writing data to memory sub block MSB0, data from write driver WRD is transmitted to a sense amplifier provided in memory sub block MSB0. Similarly, in writing data to memory sub block MSBn, write data from write driver WRD is transferred to a sense amplifier provided in memory sub block MSBn. A signal propagation delay Td is caused by the interconnection line capacitance and resistance of main I/O line pair MIO. When the memory mat is increased in size in the column direction due to an increase in storage capacity, there is a proportionate increase in the length of the main I/O line pair and an increase in signal propagation delay Td. As the time required for data writing is determined in consideration of the worst case, write operation cannot be completed until at least delay time Td has elapsed after write driver WRD is activated (in practice, the signal propagation delay between the sense amplifier and the main I/O line pair is also considered). A command designating the next operation mode must be applied after the data writing is completed. The same applies to data reading, where read data is transmitted to main amplifier MAP: the time required for data reading is determined by delay time Td, so that the so-called CAS latency cannot be reduced. This means that the signal propagation time for the transfer path from memory bank BK to latch circuit LK is increased, which prevents high speed data reading.
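How delay Td depends on line length can be illustrated with a first-order estimate. The sketch below assumes that the main I/O line behaves as a uniformly distributed RC line and uses the Elmore approximation, Td ≈ (1/2)·r·c·L², where r and c are the resistance and capacitance per unit length. This model and the numerical values are assumptions chosen for illustration; they are not taken from the disclosure.

```python
def elmore_delay(r_per_mm, c_per_mm, length_mm):
    """Elmore delay (seconds) of a uniformly distributed RC line.

    r_per_mm: resistance per unit length (ohm/mm), hypothetical value
    c_per_mm: capacitance per unit length (F/mm), hypothetical value
    """
    total_r = r_per_mm * length_mm
    total_c = c_per_mm * length_mm
    return 0.5 * total_r * total_c


def worst_case_delay(r_per_mm, c_per_mm, sub_block_distances_mm):
    """Timing is set by the sub block farthest from the driver (the worst case)."""
    return max(elmore_delay(r_per_mm, c_per_mm, d)
               for d in sub_block_distances_mm)


R = 100.0      # ohm/mm, hypothetical
C = 0.2e-12    # F/mm, hypothetical

td_short = elmore_delay(R, C, 4.0)
td_long = elmore_delay(R, C, 8.0)
print(td_long / td_short)                      # ratio for a doubled line length
print(worst_case_delay(R, C, [1.0, 4.0, 8.0]) == td_long)
```

Under this assumed model the delay grows with the square of the line length, so a mat that doubles in the column direction would see Td increase roughly fourfold, and the write and read timing must be set for the farthest sub block.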