FIG. 1A shows a block diagram of a conventional semiconductor memory architecture 10 commonly used in implementing different types of memories such as volatile memories (e.g., static random access memory (SRAM), dynamic random access memory (DRAM)) and nonvolatile memories (e.g., read only memory (ROM), erasable programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), Flash EPROM). Such memories, as shown in FIG. 1A, typically include an array 12 of 2^N rows of cells by 2^M columns of cells, where N and M represent the number of row and column addresses, respectively. A cell is selected from array 12 via row decoder 14 and column decoder 16. Row decoder 14 receives row addresses A0-AN for selecting one of the 2^N rows, and simultaneously, column decoder 16 receives column addresses A(N+1)-A(N+M) for selecting one of the 2^M columns. The selected cell is located at the intersection of the selected row (wordline) and column (bitline).
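As an illustration of the addressing scheme above, the following Python sketch splits a flat address into row and column fields and returns the coordinates of the selected cell. The function name `select_cell`, the bit ordering (row bits in the low-order positions), and the use of exactly N row bits and M column bits are assumptions made for illustration; they are not details taken from FIG. 1A.

```python
from typing import Tuple


def select_cell(address: int, n_row_bits: int, n_col_bits: int) -> Tuple[int, int]:
    """Split a flat address into (row, column) indices.

    Assumption: the low-order n_row_bits select the wordline and the next
    n_col_bits select the bitline.
    """
    total_bits = n_row_bits + n_col_bits
    if not 0 <= address < (1 << total_bits):
        raise ValueError("address out of range for the array")
    row = address & ((1 << n_row_bits) - 1)                   # row decoder input
    col = (address >> n_row_bits) & ((1 << n_col_bits) - 1)   # column decoder input
    return row, col


# A 16-row by 8-column array (N = 4, M = 3); the address value is hypothetical.
row, col = select_cell(0b1010011, n_row_bits=4, n_col_bits=3)
print(row, col)
```

The selected cell sits at the intersection of wordline `row` and bitline `col`, mirroring the roles of row decoder 14 and column decoder 16.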
In a read operation, a signal representing the stored data is transferred from the selected cell to a sense amplifier in block 18 via column decoder 16. The sense amplifier amplifies the cell signal, and transfers it to an output buffer (not shown) which in turn transfers it to IO pad 19 for external use. In a write operation, programming data is externally provided on IO pad 19, and is then transferred to the selected cell via a data IO circuit in block 18 and column decoder 16. Blocks 12, 16, 18 and IO pad 19 may be repeated a number of times depending upon the desired IO data configuration (e.g., by-16 or by-32 data).
The address access time in a read operation (and a write operation for SRAMs and DRAMs) typically consists of time delays through an address buffer (not shown), row decoder 14, memory array 12, column decoder 16, sense amplifier 18, and output buffer (not shown). Of these delays, depending on the memory density, the delay through the memory array typically represents the largest portion of the total time delay because of the RC time constant associated with the long wordlines and the high capacitance associated with the long bitlines. Thus, in a given process technology (e.g., 0.13 μm), to achieve high speed, array 12 is typically divided into two or more sub-arrays, thereby reducing the length of wordlines and/or bitlines. An example of such memory configuration is shown in FIG. 1B.
In FIG. 1B, the memory array is divided into four sub-arrays 12-1, 12-2, 12-3, and 12-4 thus reducing the length of each wordline by a factor of four. However, such division of the array requires the duplication of some of the circuit blocks interfacing with the array. For example, four sets of row decoders 14-1, 14-2, and 14-3 are needed as shown. To reduce the bitline length by one half, each sub-array 12-1 through 12-4 would need to be divided into two, with the column decoder block 16 and block 18 (which includes the sense amplifiers and data I/O circuits) being duplicated. Such duplication can result in unnecessary die size increase if not properly implemented. Further, for very high-performance (e.g., high speed, low power), high-density memories wherein a large number of array divisions is used to achieve the speed targets, there may be diminishing returns on the speed after a certain number of array divisions, and there certainly would be a large power penalty associated with every level of array division. This is due to the large amount of duplication of the array-interface circuitry which leads to highly capacitive nodes in speed-sensitive circuit paths. To quickly switch such high-capacitance nodes, large drivers are required which consume substantial dynamic power. This has substantially hindered the cost-effective development of high-speed, low-power, high-density memories for such popular memory applications as portable devices.
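The benefit of shortening wordlines can be illustrated with a first-order distributed-RC estimate, in which both the resistance and the capacitance of a line grow linearly with its length, so the wire delay grows roughly with the square of the length. The sketch below uses this textbook approximation only. The per-micron resistance and capacitance values, the line lengths, and the 0.38 coefficient for a distributed line are illustrative assumptions, not figures from the memories of FIGS. 1A and 1B, and the model ignores driver and decoder delays.

```python
def wordline_delay(length_um: float, r_per_um: float, c_per_um: float) -> float:
    """Approximate delay in seconds of a distributed RC line (~0.38 * R * C)."""
    r_total = r_per_um * length_um   # total line resistance, ohms
    c_total = c_per_um * length_um   # total line capacitance, farads
    return 0.38 * r_total * c_total


R_PER_UM = 0.5       # ohms per micron (hypothetical value)
C_PER_UM = 0.2e-15   # farads per micron (hypothetical value)

full = wordline_delay(4000.0, R_PER_UM, C_PER_UM)      # undivided wordline
quarter = wordline_delay(1000.0, R_PER_UM, C_PER_UM)   # after a four-way split
print(full / quarter)   # ratio of the two wire delays
```

Under this estimate, quartering the wordline length reduces the wire delay by roughly a factor of sixteen, although the duplicated decoders and the large drivers discussed above offset part of that gain.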
The conventional memory configurations of FIGS. 1A and 1B suffer from a number of other drawbacks. First, the address access time is non-uniform across the array depending on both the access path (i.e., row or column) and the physical location of the cell in the array. Typically, the row access path is slower than the column access path because of the presence of the wordline RC delay in the row access path. Also, within the row access path, the cells have different access times depending on the location of the selected cell along the row. For example, the cell located closest to the wordline driver has a faster access time than the cell located furthest from the wordline driver. These non-uniformities in address access time result in complications in both the use of memories as well as their design.
Another drawback is the inefficient use of redundancy. Commonly, redundant blocks of rows and/or columns of cells are added in the array to enable replacement of defective cells with redundant cells. However, often, due to design constraints, a redundant block of rows or columns is used to replace a row or column having only one or few defective cells, thus resulting in inefficient use of the available redundant cells.
Thus, it is desirable to have a memory configuration that yields high speed and low power, makes more efficient use of redundancy, provides a relatively uniform address access time for all memory cells, scales easily to higher memory densities with minimal speed and power penalties, and is independent of memory type.
In accordance with one embodiment of the present invention, a semiconductor memory includes a first array block having at least two sub-array blocks and a first interconnect routing channel through which a first group of local interconnect lines extend. Each of the two sub-array blocks includes at least two lower-level sub-array blocks and a second interconnect routing channel through which a second group of local interconnect lines extend. The first group of local interconnect lines are configured to carry input information for accessing memory locations in which to store data or from which to retrieve data. The second group of local interconnect lines are configured to carry a subset of the input information.
In another embodiment, the semiconductor memory further includes a first higher-level array block including at least said first array block and a second substantially similar array block and a third interconnect routing channel through which a third group of local interconnect lines extend. The third group of local interconnect lines are configured to carry a superset of the input information.
In another embodiment, the first group of local interconnect lines extends orthogonally to the second group of local interconnect lines.
In another embodiment, the first interconnect routing channel extends a longer distance than the second interconnect routing channel.
In another embodiment, the first interconnect routing channel is located between the two sub-array blocks, and the second interconnect routing channel in each of the two sub-array blocks is located between the corresponding two lower-level sub-array blocks.
In another embodiment, each lower-level sub-array block comprises a plurality of memory cell array blocks each having a plurality of memory cells arranged along a predesignated number of rows and columns. First and second adjacent memory cell array blocks in each lower-level sub-array block are coupled to a data transfer block configured to selectively transfer data to or from selected ones of the plurality of memory cells in one or both of the first and second adjacent memory cell array blocks.
In another embodiment, each lower-level sub-array block further comprises a plurality of data lines extending over the corresponding memory cell array blocks, the data lines being coupled to the data transfer block so that in a memory access operation data is transferred between the data lines and one or both of the first and second memory cell array blocks via the data transfer block.
In another embodiment, the data transfer block includes a plurality of sense amplifiers and a column multiplexer configured to selectively transfer data from selected ones of the plurality of memory cells in one or both of the first and second adjacent memory cell array blocks to the plurality of sense amplifiers. The plurality of sense amplifiers are coupled between the column multiplexer and the data lines.
In accordance with another embodiment of the present invention, a method of forming a semiconductor memory having a plurality of memory cells includes the following acts. A first array block is formed, which includes at least two first-lower-level (1LL) blocks separated by a first interconnect routing channel through which a first group of local interconnect lines extend. At least two second-lower-level (2LL) blocks are formed in each of the at least two 1LL blocks. The two 2LL blocks are separated by a second interconnect routing channel through which a second group of local interconnect lines extend orthogonally to the first group of interconnect lines. At least two third-lower-level (3LL) blocks are formed in each of the at least two 2LL blocks. The two 3LL blocks are separated by a third interconnect routing channel through which a third group of local interconnect lines extend orthogonally to the second group of interconnect lines. The first group of local interconnect lines are configured to carry input information for accessing one or more of the plurality of memory cells. The second group of local interconnect lines are configured to carry a subset S1 of the input information. The third group of local interconnect lines are configured to carry a subset S2 of the subset S1 of the input information.
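One way to picture the nested routing described above is as a hierarchical decode in which each level consumes part of the input information and forwards only the remainder to the next lower level. The Python sketch below models that idea with address bits. The one-bit binary split at each level, the function name `route`, and the channel labels are assumptions made for illustration and are not taken from the embodiments.

```python
from typing import List, Tuple


def route(address_bits: str, channels: List[str]) -> List[Tuple[str, str, str]]:
    """Walk down the block hierarchy, one routing channel per level.

    Assumption: at each level the leading bit picks one of two lower-level
    blocks, and only the remaining bits (a subset of the bits carried at this
    level) travel in the next lower routing channel.
    Returns one (channel, bits carried in that channel, select bit) per level.
    """
    trace = []
    remaining = address_bits
    for channel in channels:
        if not remaining:
            raise ValueError("not enough address bits for the hierarchy")
        select, rest = remaining[0], remaining[1:]
        trace.append((channel, remaining, select))
        remaining = rest   # the next channel carries a subset of these bits
    return trace


# Hypothetical 5-bit input routed through three levels of channels.
trace = route("10110", ["first channel", "second channel", "third channel"])
for channel, carried, select in trace:
    print(f"{channel}: carries {carried}, selects half {select}")
```

In this sketch the second channel carries a subset S1 of the bits in the first channel, and the third channel carries a subset S2 of S1, so fewer lines need to be routed at each lower level.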
In another embodiment, the method further includes forming a first higher-level array block. The first higher-level array block includes at least the first array block and a second array block. The second array block is substantially similar to the first array block. The first and second array blocks are separated by a fourth interconnect routing channel through which a fourth group of local interconnect lines extend orthogonally to the third group of local interconnect lines. The fourth group of local interconnect lines are configured to carry a superset of the input information.
In another embodiment, the method further includes forming a plurality of memory cell array blocks in each of the at least two 3LL blocks. Each memory cell array block has a plurality of memory cells arranged along a predesignated number of rows and columns. First and second adjacent memory cell array blocks in each of the at least two 3LL blocks are coupled to a data transfer block configured to selectively transfer data to or from selected ones of the plurality of memory cells in one or both of the first and second adjacent memory cell array blocks.
In accordance with yet another embodiment of the present invention, a method of forming a semiconductor memory includes the following acts. A first array block is formed which has a plurality of memory cell array blocks each having a plurality of memory cells arranged along a predesignated number of rows and columns. A first higher-level-1 (HL1) block is formed. The first HL1 block includes at least the first array block and a second array block. The first and second array blocks are substantially similar. The first and second array blocks are separated by a first interconnect routing channel through which a first group of local interconnect lines extend. A first higher-level-2 (HL2) block is formed. The first HL2 block includes at least the first HL1 block and a second HL1 block. The second HL1 block is substantially similar to the first HL1 block. The first and second HL1 blocks are separated by a second interconnect routing channel through which a second group of local interconnect lines extend orthogonally to the first group of local interconnect lines. A first higher-level-3 (HL3) block is formed. The first HL3 block includes at least the first HL2 block and a second HL2 block. The second HL2 block is substantially similar to the first HL2 block. The first and second HL2 blocks are separated by a third interconnect routing channel through which a third group of local interconnect lines extend orthogonally to the second group of local interconnect lines. The third group of local interconnect lines are configured to carry input information for accessing one or more of said plurality of memory cells. The second group of local interconnect lines are configured to carry a subset S1 of the input information. The first group of local interconnect lines are configured to carry a subset S2 of the subset S1 of the input information.
Further features and advantages of the present invention will become more fully apparent from the following detailed description of the invention, the appended claims, and the accompanying drawings.