The present invention relates to semiconductor memories, and more particularly to a memory architecture and busing scheme which improve memory access time and reduce die size.
FIG. 1 shows a block diagram of a conventional semiconductor memory 200, such as a dynamic random access memory (DRAM). Memory 200 has two banks B0, B1 of two memory arrays each (Arrays 1, 2 and 3, 4). Each bank has a dedicated I/O block 240-1, 240-2 for data transfer into and out of memory 200. Each of arrays 1-4 has a dedicated global column decoder 250-1 to 250-4 for column selection.
Memory 200 operates as follows. An externally provided column address COL ADD is delivered via address bus 260 to each of the four global column decoders for selection of a predetermined number of columns from each bank. Simultaneously, an externally provided row address (not shown) is delivered to row decoders (not shown) for selection of a row from each bank. Each of arrays 1-4 is divided into a number of sub-arrays (not shown) with local bitline sense amplifiers (not shown) and column selection circuitry (not shown) between the sub-arrays. In selecting columns, each global column decoder provides decoded column address signals on global column decoder (GCD) lines which extend across each array. The GCD lines are coupled to the column select circuitry located between the sub-arrays. An output of the column select circuitry is coupled to global data bus (GDB) lines for transferring data between the array and the I/O block.
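The column path described above can be summarized behaviorally. The following sketch is purely illustrative and not part of the patent; the names (`global_column_decode`, `column_select`, `sense_amps`) are invented for this example, and the model abstracts away the sub-array and timing details:

```python
# Behavioral sketch of the conventional column path: the global column
# decoder asserts one GCD line per selected column, and the column select
# circuitry gates the corresponding sense-amplifier output onto the GDB.

def global_column_decode(col_addr, n_columns):
    """One-hot decode of a column address onto GCD lines spanning the array."""
    return [1 if i == col_addr else 0 for i in range(n_columns)]

def column_select(gcd_lines, sense_amp_data):
    """Column select circuitry between sub-arrays: pass the selected
    sense-amplifier bit(s) onto the global data bus (GDB) lines."""
    return [bit for sel, bit in zip(gcd_lines, sense_amp_data) if sel]

sense_amps = [0, 1, 1, 0]          # data latched by local bitline sense amps
gcd = global_column_decode(2, 4)   # decoder asserts exactly one GCD line
gdb = column_select(gcd, sense_amps)
assert gdb == [1]                  # the selected column's bit reaches the GDB
```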
Memory 200 further includes first and second timing circuits 220, 230 which control the selection and transfer of data in memory 200. The first timing circuit 220 receives a set of control signals CTRL, such as the DRAM clock signal CAS (column address strobe, active low), and generates a first timing signal which is coupled to the four global column decoders and the second timing circuit 230 via interconnect line 210. A column access operation is initiated when the first timing circuit issues the first timing signal to selectively enable the global column decoders. At the same time, the first timing signal is delivered to the second timing circuit 230. As data selection and transfer through the memory banks takes place, second timing circuit 230 issues a second timing signal coupled to the two I/O blocks via interconnect lines 270-1, 270-2. The second timing signal is used to strobe the I/O blocks for sampling and transferring data through the I/O blocks.
Second timing circuit 230 is ideally designed to issue the second timing signal at substantially the same time that array data becomes available at the inputs of the I/O blocks in a read operation. Issuing the second timing signal before the array data arrives at the I/O block inputs results in sampling the wrong data; issuing it after the array data arrives results in slow column access time. Accordingly, the second timing circuit must be carefully designed to issue the second timing signal at just the right time. Further, to avoid sampling and transferring incorrect data, the second timing signal must be synchronized with the slowest column path. This requires very careful study of the layout of memory 200 to identify the slowest column path, and accurate extraction of all interconnect parasitic resistance and capacitance. With the second timing circuit designed for the worst-case column speed path, normally fast column paths externally appear slow, and thus no advantage can be taken of the faster speed paths.
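The worst-case penalty described above can be shown with a small numeric sketch. The path names and arrival times below are hypothetical placeholders, not values from the patent; the point is only that a single strobe synchronized to the slowest path makes every path externally appear worst-case slow:

```python
# Hypothetical per-path data arrival times (ns) at the I/O block inputs,
# as would be obtained from layout parasitic extraction. Invented numbers.
arrival_ns = {"path_A": 3.1, "path_B": 3.8, "path_C": 5.2}

# The second timing signal must not fire before the slowest path's data is
# valid at the I/O block, or the wrong data is sampled.
strobe_ns = max(arrival_ns.values())

# Every path is sampled at the same strobe, so each column access
# externally takes the worst-case time regardless of its actual speed.
effective_ns = {path: strobe_ns for path in arrival_ns}
assert effective_ns["path_A"] == 5.2   # a fast path appears slow
```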
Another drawback of the conventional memory 200 is that the interconnect lines 270-1, 270-2 coupling the second timing circuit 230 to the I/O blocks extend across the full length of an array. Because of the RC delay in these interconnect lines, a skew is created between the times that the left-most and the right-most I/O circuits in each I/O block are strobed. This timing skew results in slower column access time.
Therefore, a memory configuration and busing scheme are desired which eliminate the above-mentioned timing problems causing speed degradation and provide other improvements.
In accordance with the present invention, a memory architecture and busing scheme improve speed by providing tracking between time-critical signals and by reducing timing skews in data transfers through the I/O blocks, and also reduce die size.
In one embodiment, a memory includes an array of memory cells, an address decoder configured to generate a decoded signal for selecting a plurality of memory cells in a memory access, an input/output block configured to transfer data corresponding to the selected memory cells into and out of the memory, a first timing circuit configured to generate a first timing signal, and a second timing circuit configured to receive the first timing signal and in response generate a strobe signal coupled to the input/output block. An interconnect line carrying the first timing signal is routed through the array so that in the memory access a time delay from when the decoded signal is generated to when the data arrives at an input terminal of the I/O block is substantially the same as a time delay from when the first timing signal is generated to when the strobe signal is generated.
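The tracking idea in this embodiment can be sketched with a simple distributed-RC (Elmore) delay model. All parasitic values and wire lengths below are invented for illustration; the sketch shows only that when the timing interconnect is routed through the array with the same per-unit parasitics as the data path, the two delays scale together across process corners, so the strobe stays aligned with data arrival:

```python
# Delay-tracking sketch, with assumed per-unit parasitics.

def elmore_delay(r_per_um, c_per_um, length_um):
    """Elmore delay of a distributed-RC wire: (R * C * L^2) / 2."""
    return 0.5 * r_per_um * c_per_um * length_um ** 2

for process_scale in (0.8, 1.0, 1.25):      # slow / typical / fast corners
    r = 0.05 * process_scale                 # ohm/um (assumed value)
    c = 0.0002                               # pF/um (assumed value)
    data_delay   = elmore_delay(r, c, 4000)  # decoded signal -> I/O input
    strobe_delay = elmore_delay(r, c, 4000)  # first timing signal -> strobe
    # Matched routing keeps the two delays substantially equal at every
    # corner, which is the claimed tracking property.
    assert abs(data_delay - strobe_delay) < 1e-9
```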
In another embodiment, the address decoder and the first timing circuit are located along a first end of the memory, and the I/O block and the second timing circuit are located along a second end of the memory opposite the first end.
In yet another embodiment, the I/O block has at least first and second I/O circuits, each configured to transfer one bit of the data. The second timing circuit is located between the first and second I/O circuits so that a time delay through an interconnect line coupling the strobe signal to the first I/O circuit is the same as a time delay through another interconnect line coupling the strobe signal to the second I/O circuit.
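The benefit of the centered placement can be illustrated numerically. The I/O positions and delay constant below are invented, and delay is approximated as proportional to wire length squared (distributed RC); the sketch only compares the strobe skew for an end-driven layout against a center-driven one:

```python
# Strobe-skew comparison sketch with assumed dimensions.

def wire_delay(length_um, k=1e-6):
    """Distributed-RC approximation: delay grows as length squared."""
    return k * length_um ** 2   # ns (k is an assumed constant)

io_positions = [0, 1000, 2000, 3000, 4000]   # I/O circuits along the block (um)

def strobe_skew(driver_pos):
    """Spread between earliest and latest strobe arrival at the I/O circuits."""
    delays = [wire_delay(abs(p - driver_pos)) for p in io_positions]
    return max(delays) - min(delays)

end_skew    = strobe_skew(0)      # conventional: timing circuit at one end
center_skew = strobe_skew(2000)   # embodiment: timing circuit between I/O circuits
# Centering halves the longest strobe run, sharply reducing the skew.
assert center_skew < end_skew
```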
The following detailed description and the accompanying drawings provide a better understanding of the nature and advantages of the present invention.