In a typical computer system, a microprocessor is coupled to a system memory and executes an application program, such as a word processor or a communications program, stored in the memory to perform the desired function of the computer system. To execute the program, the microprocessor accesses instructions and data stored in the system memory. The speed at which the computer system executes the program is determined by the speed of the microprocessor and by the rate at which information is transferred to and from the system memory, which is known as the bandwidth of the system memory. Advances in design and fabrication have enabled the processor to operate at increasingly higher speeds, while the speed of the system memory has increased at a slower rate. More specifically, the system memory typically includes a static random access memory (“SRAM”) operating at a high bandwidth and a dynamic random access memory (“DRAM”) operating at a substantially lower bandwidth. A memory controller is typically interposed between the processor and the DRAM to enable the processor to provide data requests to the controller and then perform other tasks while the controller accesses the requested data at the lower bandwidth of the DRAM. The DRAM typically has a large storage capacity and is utilized extensively by the processor during execution of a program. Thus, the bandwidth of the system memory is limited by the lower bandwidth of the DRAM, thereby limiting the speed of operation of the computer system.
A variety of approaches have been utilized to increase the bandwidth of the DRAM in the system memory. One approach is known as packetized DRAM, such as SLDRAM, in which command packets are applied to the SLDRAM to transfer data to and from the SLDRAM over a very high-speed synchronous interface. Each SLDRAM includes multiple internal banks of memory cells coupled to a wide internal data path. As understood by one skilled in the art, increasing the width of the internal data bus increases the bandwidth by transferring more data during each access of a bank. In an SLDRAM the wide internal data path enables large blocks of data in one bank to be accessed and then sequentially transferred out of the SLDRAM over the high-speed synchronous interface while a block of data in another bank is being accessed.
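The benefit of accessing one bank while another bank's block is being transferred can be illustrated with a minimal sketch. It assumes an idealized two-stage pipeline (bank access, then transfer) with fixed stage times. The function name, its parameters, and the time units are hypothetical and are not taken from any SLDRAM specification.

```python
def total_transfer_time(blocks, t_access, t_transfer, interleaved):
    """Idealized time to read `blocks` blocks of data.

    Assumption (not from the source): every bank access takes t_access and
    every transfer over the external interface takes t_transfer, in
    arbitrary time units.
    """
    if blocks <= 0:
        return 0
    if not interleaved:
        # Each block is accessed and then transferred before the next begins.
        return blocks * (t_access + t_transfer)
    # The access of the next block (in another bank) overlaps the transfer
    # of the current block, so each later block costs only the slower stage.
    return t_access + (blocks - 1) * max(t_access, t_transfer) + t_transfer


# Hypothetical figures: 4 blocks, access = 3 units, transfer = 2 units.
sequential = total_transfer_time(4, 3, 2, interleaved=False)  # 20 units
overlapped = total_transfer_time(4, 3, 2, interleaved=True)   # 14 units
```

Under these assumed timings the overlapped case is bounded by the slower of the two stages, which is why a wide internal data path and multiple banks are used together.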
A second approach to increasing the bandwidth of DRAMs is known as Embedded DRAM, in which logic circuitry, such as a microprocessor, and the DRAM are formed in the same integrated circuit. In other words, the logic circuitry is “embedded” in the DRAM. By forming the DRAM and logic circuitry in the same integrated circuit, the width of an internal data path coupled between the logic circuitry and the DRAM is not limited by the number of pins that may be formed on the DRAM package. Furthermore, the length of conductive lines comprising the internal data path is significantly reduced, which in turn reduces the capacitive delays and propagation delays of such data lines. As a result, the logic circuitry may be coupled directly to the DRAM and operate at the bandwidth of the logic circuitry. Embedded DRAMs are currently being developed for many applications requiring high bandwidth, such as networking, multimedia, and high-resolution graphics systems.
In both the SLDRAM and Embedded DRAM approaches, the internal data path in each device is much wider than the data path in a conventional DRAM. Widening the internal data path, however, creates problems in forming various components in the device. FIG. 1 is a block diagram of a portion of a conventional DRAM 10 including a memory-cell array 12 formed in an array region 14 of a semiconductor substrate. The array 12 includes a plurality of pairs of complementary digit lines DL1, {overscore (DL1)}-DLN, {overscore (DLN)} formed in a first metal layer formed in the array region 14. A plurality of word lines WL1-WLN are formed in a polysilicon layer formed in the array region 14 and disposed substantially perpendicular to the digit lines DL1, {overscore (DL1)}-DLN, {overscore (DLN)}. A plurality of metal straps 15 are formed in a second metal layer in the array region 14, and are disposed adjacent associated word lines WL1-WLN. Each metal strap 15 is coupled to the associated one of the word lines WL1-WLN at both ends of the word line as shown. The metal straps 15 lower the effective resistance of the polysilicon word lines WL1-WLN, as understood by one skilled in the art. The array 12 further includes a plurality of memory cells 16, each memory cell 16 in a respective row having an access terminal coupled to the word line WL1-WLN associated with that row, and each memory cell in a respective column having a data terminal coupled to one of the pair of complementary digit lines DL1, {overscore (DL1)}-DLN, {overscore (DLN)} associated with that column.
The DRAM 10 further includes a plurality of sense amplifiers SA1-SAN formed in a sense amplifier region 18 of the substrate positioned adjacent the array region 14. The sense amplifiers SA1-SAN are coupled to the digit lines DL1, {overscore (DL1)}-DLN, {overscore (DLN)}, respectively. Each of the sense amplifiers SA1-SAN senses and stores the data contained in an accessed memory cell 16 coupled to the associated pair of digit lines DL1, {overscore (DL1)}-DLN, {overscore (DLN)}, as understood by one skilled in the art. The sensed data stored in each of the sense amplifiers SA1-SAN is placed on an output and transferred through an associated input/output transistor 20 onto one of four input/output lines I/O1-I/O4 forming a portion of an internal data path 21 of the DRAM 10. Each of the input/output transistors 20 has its gate coupled to a corresponding column select line CSEL1-CSELN coupled to column decode circuitry (not shown in FIG. 1) in the DRAM 10. Both the input/output lines I/O1-I/O4 and the column select lines CSEL1-CSELN are formed in a third metal layer. The lines I/O1-I/O4 are formed in a portion of the third metal layer above the sense amplifier region 18, and the column select lines are formed in a portion of the third metal layer above the array region 14. The DRAM 10 further includes row decoders 22 and 24 formed in row decode regions 26 and 28, respectively, positioned adjacent ends of the array region 14. Each of the row decoders 22 and 24 decodes a row address applied to the DRAM 10 and activates one of the word lines WL1-WLN corresponding to the decoded row address. The row decoder 22 activates the odd word lines WL1-WLN−1 and the row decoder 24 activates the even word lines WL2-WLN.
During a data transfer operation, the row decoders 22 and 24 decode a row address applied to the DRAM 10 and activate the corresponding one of the word lines WL1-WLN. The memory cells 16 coupled to the activated one of the word lines WL1-WLN place their data on the corresponding pairs of digit lines DL1, {overscore (DL1)}-DLN, {overscore (DLN)}, and the sense amplifiers SA1-SAN sense and store that data, as understood by one skilled in the art. After the sense amplifiers SA1-SAN store the accessed data, the column decode circuitry decodes a column address applied to the DRAM 10 and activates corresponding ones of the column select lines CSEL1-CSELN. In the DRAM 10, four column select lines CSEL1-CSELN are typically activated, coupling four of the sense amplifiers SA1-SAN respectively to the four input/output lines I/O1-I/O4. For example, the column decode circuitry may activate the column select signals CSEL1-CSEL4, turning on the input/output transistors 20 coupled to the sense amplifiers SA1-SA4, respectively, which, in turn, couple the sense amplifiers SA1-SA4 to the input/output lines I/O1-I/O4, respectively. At this point, during a read operation, the data stored in the sense amplifiers SA1-SA4 is transferred over the input/output lines I/O1-I/O4, respectively, and through respective data output buffers onto a data bus of the DRAM 10 where it is available to be read by external circuitry. During a write operation, data to be stored in the addressed memory cells is transferred from the external data bus through data input buffers (not shown in FIG. 1) and onto the input/output lines I/O1-I/O4. The data is transferred over the lines I/O1-I/O4 and through the activated transistors 20 to the sense amplifiers SA1-SA4, which, in turn, transfer the data to the addressed memory cells 16, as understood by one skilled in the art.
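The read and write sequences described above can be sketched as a small behavioral model. This is only an illustration under stated assumptions: the class name DramArray and its methods are hypothetical, each cell holds a single bit, and a column address is assumed to select four adjacent column select lines (group 0 selecting CSEL1-CSEL4, group 1 selecting CSEL5-CSEL8, and so on), a mapping the description does not specify beyond the CSEL1-CSEL4 example.

```python
class DramArray:
    """Toy behavioral model of the array 12 with four I/O lines (hypothetical)."""

    IO_LINES = 4  # I/O1-I/O4 in FIG. 1

    def __init__(self, rows, cols):
        if cols % self.IO_LINES:
            raise ValueError("cols must be a multiple of the number of I/O lines")
        self.cells = [[0] * cols for _ in range(rows)]  # memory cells 16
        self.sense_amps = [0] * cols                    # SA1-SAN, one per column

    def activate_row(self, row):
        # The word line is activated and every sense amplifier senses and
        # stores the data from the cell in its column.
        self.sense_amps = list(self.cells[row])

    def read(self, row, col_group):
        # Assumed mapping: col_group g turns on four adjacent CSEL lines.
        self.activate_row(row)
        start = col_group * self.IO_LINES
        return self.sense_amps[start:start + self.IO_LINES]

    def write(self, row, col_group, data):
        if len(data) != self.IO_LINES:
            raise ValueError("one data bit is required per I/O line")
        self.activate_row(row)
        start = col_group * self.IO_LINES
        # Data arrives over the I/O lines and overrides four sense amplifiers,
        # which then transfer it to the cells of the activated row.
        self.sense_amps[start:start + self.IO_LINES] = data
        self.cells[row] = list(self.sense_amps)


array = DramArray(rows=8, cols=16)
array.write(row=3, col_group=1, data=[1, 0, 1, 1])
bits = array.read(row=3, col_group=1)  # [1, 0, 1, 1]
```

The model makes one point visible: a row activation fills every sense amplifier, yet only four of those stored values reach the I/O lines per column access.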
In the DRAM 10, there are many more column select lines CSEL1-CSELN than there are input/output lines I/O1-I/O4. For example, the array 12 may include 1024 rows and 1024 columns, in which case there are 1024 column select lines CSEL1-CSELN, but only four input/output lines I/O1-I/O4. The number of input/output lines I/O1-I/O4 is typically much smaller because data placed on the lines I/O1-I/O4 is typically transferred to or received from corresponding external terminals comprising the external data bus of the DRAM 10. The number of external data terminals that may be formed on the package containing the DRAM 10 is limited by the physical sizes of the terminals and the package, and is typically much less than the number of columns in the array 12. Thus, the column select lines CSEL1-CSELN and input/output lines I/O1-I/O4 are typically disposed as shown due to the respective numbers of such lines. In other words, there are many column select lines CSEL1-CSELN, so such lines are disposed above the relatively large array region 14. There is physically enough space to form the column select lines CSEL1-CSELN above the array region 14 since the maximum number of such lines, which is illustrated in the embodiment of FIG. 1, is one column select line for each column of memory cells 16 in the array 12. In this situation, the column select lines CSEL1-CSELN may be formed spaced adjacent the digit lines DL1, {overscore (DL1)}-DLN, {overscore (DLN)}, respectively, as shown. In contrast, the smaller number of input/output lines I/O1-I/O4 enables these lines to be formed above the sense amplifier region 18, which is typically much smaller than the array region 14.
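The arithmetic behind the 1024-column example can be checked with a short sketch. The function name column_accesses_per_row is hypothetical, and the 128-line case is an assumed figure chosen only to illustrate a wider internal data path. It is not a value given in this description.

```python
def column_accesses_per_row(columns: int, io_lines: int) -> int:
    """Number of column accesses needed to move one full row over the I/O lines."""
    if columns <= 0 or io_lines <= 0 or columns % io_lines:
        raise ValueError("columns must be a positive multiple of io_lines")
    return columns // io_lines


# Example from the text: 1024 columns and four input/output lines.
narrow = column_accesses_per_row(1024, 4)    # 256 accesses per row
# Assumed wider internal data path of 128 lines, for comparison only.
wide = column_accesses_per_row(1024, 128)    # 8 accesses per row
```

With four input/output lines, each column access carries 4 of the 1024 sensed bits, so reading a full row takes 256 accesses. A wider path reduces that count but requires correspondingly more input/output lines to be routed.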
In the conventional architecture of the DRAM 10, there is limited space above the sense amplifier region 18 in which to form the input/output lines I/O1-I/O4. The input/output lines I/O1-I/O4 form part of the internal data path of the DRAM, and as that internal data path is made wider, it becomes increasingly difficult to form the input/output lines above the sense amplifier region 18. The size of the sense amplifier region 18 could be increased, but this would waste valuable space on the substrate in which the DRAM 10 is formed. Alternatively, additional conductive layers could be added to form the additional input/output lines I/O1-I/O4, but this solution complicates the process and increases the cost of forming the DRAM 10.
There is a need for a new data path architecture for DRAMs having wide internal data paths.