This invention relates to an architecture for memory organization, and more specifically, to a method and architecture using a folded addressing method.
Designers of Very Large Scale Integrated (VLSI) electronic circuits are increasingly pressed to reduce costs and development time. This is particularly true for microcontroller and microprocessor designs. In order to optimize substrate usage and minimize design cycle times, a layout plan of an Integrated Circuit (IC) is often constrained to have a fixed topology like the one shown in FIG. 1.
FIG. 1 shows a layout view of a typical IC 10. A central bus 20 carries signal interconnections within the IC 10. Signals are generated and received by a generic microprocessor 30, peripheral circuits 32, and memory circuits 34.
For fast assembly of a complex device like the one depicted in FIG. 1, it is essential that all blocks be physically designed to have the same height, as shown by the fixed height arrows 36 on the right. In this way, a library of predefined macros or layouts can be designed and made available for a wide range of specific applications, allowing the complete IC 10 to be put together easily, e.g. by software, and with the highest silicon utilization.
It is generally easy to shape the physical dimensions of digital and analog peripherals such that one side is at a fixed height. Memory blocks, by contrast, introduce specific problems: for each possible memory function, there is a variety of configurations that can be used in a design.
Although the memory width, or parallelism, NB (the number of bits that can be written to or read from the memory in parallel) is generally fixed for a given IC architecture, the memory depth NW (the number of independently addressable locations, or words) varies with target applications and marketing strategies. It is therefore necessary to implement memory architectures that can expand in the direction of expandable arrows 38 in the memory circuits 34 of FIG. 1. This expansion allows the memory circuits 34 to be specifically designed to match the memory needs of the peripheral circuits 32 and the generic microprocessor 30 when designing the IC 10.
For efficiency reasons, memory blocks are designed with a two-dimensional decoding scheme. In all but very small memory blocks, the input address is decoded by two separate sub-circuits. The word decoder selects one physical row out of the NR rows in a memory circuit. Each row holds NM memory words. A column multiplexer (or mux) then selects one of the NM words and routes it to the data I/O circuitry.
NR = NW / NM    Number of physical rows in the memory matrix    (1)
NC = NM × NB    Number of physical columns in the memory matrix    (2)
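By way of illustration only, equations 1 and 2 can be sketched in Python as follows. The function name `matrix_shape` and the sample values are hypothetical and are not taken from the specification; the check that NW is a multiple of NM is an added assumption so that the row count is an integer.

```python
def matrix_shape(nw, nm, nb):
    """Return (NR, NC) for a memory of NW words, NM words per row, NB bits per word.

    Hypothetical helper: applies equation 1 (NR = NW / NM) and
    equation 2 (NC = NM x NB) from the text.
    """
    if nw % nm != 0:
        # Assumption: every row is completely filled, so NW must divide evenly.
        raise ValueError("NW must be a multiple of NM")
    nr = nw // nm   # equation 1: number of physical rows
    nc = nm * nb    # equation 2: number of physical columns
    return nr, nc


# Example with illustrative values: 1024 words, 4 words per row, 8 bits per word.
print(matrix_shape(1024, 4, 8))  # (256, 32)
```

In this example the matrix is far from square (256 rows against 32 columns), which is the kind of imbalance the NR≈NC criterion is meant to avoid.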
The addressing space defined by NW must be continuous. If there are voids or holes in the address space, i.e., inputs between address=0 and address=NW corresponding to no physical location, all external subsystems accessing the memory block would have to perform complex processing to guarantee proper data storage and/or retrieval around the voids, which is not acceptable.
Usually, the best performance, lowest power consumption, and highest memory density are achieved for NR≈NC, that is, a generally square architecture in which the number of rows is roughly equal to the number of columns. This implies (from equation 2) that optimal configurations are those for which
NM ≈ NR / NB, with NB generally equal to or greater than 8    (3)
Because of equation 3, there are generally many more address bits devoted to word decoding than are used for column multiplexing. In order to enlarge the overall memory depth variation, it is the word decoder that has to vary in the direction of the expandable arrows 38 in the memory circuits 34 of FIG. 1, while the column mux must remain fixed. In other words, memory circuits 34 have to be placed in FIG. 1 with the matrix rotated 90°, making rows physically vertically oriented, and columns physically horizontally oriented.
In order to satisfy address space continuity, the most significant address bits decode one out of NR rows and the least significant address bits select one out of NM words. Moreover, to preserve address space continuity, the number of words on each row (NM) has to be a power of two, or
NM = 2^m, m = 1, 2, 3, ...    (4)
So the first row holds all the memory words (MW) numbered from 0 to 2^m − 1, with no voids, the second row holds words 2^m to 2^(m+1) − 1, and so on. If a memory cell is assumed to be designed for highest density, and its dimension in the direction of the fixed height arrows 36 in FIG. 1 is hc, then the height (H) of the memory block can only be:
H = NC × hc = NB × 2^m × hc, m = 1, 2, 3, ...    (5)
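The conventional decoding and the resulting height restriction can be sketched as follows. This is a minimal illustration under the assumptions stated in the text (NM = 2^m, most significant bits to the word decoder, least significant bits to the column mux); the function names and sample values are hypothetical.

```python
def split_address(address, m):
    """Split an address into (row, word) when each row holds 2**m words."""
    row = address >> m                 # most significant bits: word decoder
    word = address & ((1 << m) - 1)    # least significant m bits: column mux
    return row, word


def allowed_heights(nb, hc, max_m):
    """List the block heights permitted by equation 5 for m = 1..max_m."""
    return [nb * (1 << m) * hc for m in range(1, max_m + 1)]


# Address 13 with 4 words per row (m = 2) falls on row 3, word 1.
print(split_address(13, 2))       # (3, 1)

# With NB = 8 and a cell height of 2 (arbitrary units), H can only double.
print(allowed_heights(8, 2, 3))   # [32, 64, 128]
```

The second result shows why H cannot be tuned freely: each admissible height is twice the previous one, with nothing in between.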
Generally, the values of H obtained through equation 5 are not optimal for the full IC 10, because the IC dimensions are mandated by efficiency and reliability specifications that cannot take into account all possible memory architectures. Hence, there is a need for a memory architecture that can preserve addressing continuity and optimize H to any value dictated by IC product definitions.
Some limited degree of freedom in defining H can be achieved through the so-called remainder technique. Assume the NM corresponding to the optimal H is:
(2/3) × N < NMopt < N, where N = 2^n; n = 1, 2, 3, ...    (6)
Then, memory rows can be designed (NMopt × NB) bits wide, and logically arranged into triplets. In a triplet arrangement, the first NMopt memory words are placed on the first row, which has a vertical orientation in the memory circuit 34 of FIG. 1. The next N − NMopt words constitute a remainder R1 and are kept aside. The words NMopt+1 to 2 × NMopt are placed in the second row, which is also vertical in the memory circuit 34 of FIG. 1. The next N − NMopt words constitute a remainder R2. R1 and R2 are allocated on the third row of the triplet, denoted as the remainder row in a triplet arrangement. The next row, which is the fourth, starts a new triplet, and the remainder patterns are repeated until the memory addresses are exhausted.
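One possible reading of the triplet arrangement can be sketched as an address-to-location mapping. The sketch assumes that each group of N consecutive addresses contributes NMopt words to a regular row and N − NMopt words to the remainder row, and that R1 is stored ahead of R2 on the remainder row. The function name `remainder_map` and this exact ordering are assumptions made for illustration, not details given in the text.

```python
def remainder_map(address, nm_opt, n):
    """Map an address to (physical_row, word_position) in a triplet layout.

    Hypothetical mapping: one triplet of rows stores 2*N consecutive words.
    """
    # Equation 6: (2/3) * N < NMopt < N, written with integers only.
    if not (2 * n < 3 * nm_opt and nm_opt < n):
        raise ValueError("NMopt must satisfy (2/3)*N < NMopt < N")
    r = n - nm_opt                                 # size of each remainder R1, R2
    triplet, offset = divmod(address, 2 * n)       # which triplet, offset inside it
    half, within = divmod(offset, n)               # first or second group of N words
    if within < nm_opt:
        return 3 * triplet + half, within          # regular row of the triplet
    # Remainder row: R1 occupies positions 0..r-1, R2 occupies r..2r-1.
    return 3 * triplet + 2, half * r + (within - nm_opt)


# Example with N = 8 and NMopt = 6 (illustrative values).
print(remainder_map(6, 6, 8))    # (2, 0): first word of R1
print(remainder_map(15, 6, 8))   # (2, 3): last word of R2
print(remainder_map(16, 6, 8))   # (3, 0): start of the next triplet
```

Even in this simplified form, the mapping needs a division by 2N, a comparison against NMopt, and an offset calculation, which hints at the decoder complexity associated with the technique.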
If R1 + R2 < NMopt, then the remaining bits of the remainder row are left unused, which is very expensive in terms of substrate usage efficiency and cost. Limitations of the remainder method are numerous. The applicability of the method is limited by equation 6. The column mux can be so complex as to require automatic synthesis tools to design the address bit scrambling and unused cell reject logic required by remainder rows. The word decoder is also complicated, as it has to differentiate between the first pair of rows in each triplet and the remainder row. Because of the points above, the method implies a substantial silicon overhead on memories with small or medium arrays, and a reduction in the performance of the memories.
Regarding the layout of a possible memory block using the remainder method, the entire memory floor plan and layout are affected, so that the memory is not reusable in IC configurations that are different from the one in FIG. 1, unless the full silicon and performance overhead of the remainder method is acceptable.
Also, in most cases, the silicon wasted on the unused cells in the remainder row deeply impacts the overall memory density. For instance, if NMopt = 3/4 N, then the remainder row would leave 1/3 of its cells unused, or roughly 11% of the total memory cells. This inefficiency is too high for most applications.
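The waste for a given triplet arrangement can be checked with a short calculation. The sketch below uses exact fractions and takes the case NMopt = 3/4 N with N = 8 as an illustrative input; the function name `triplet_waste` is hypothetical.

```python
from fractions import Fraction


def triplet_waste(nm_opt, n):
    """Return (unused fraction of the remainder row, unused fraction of the triplet)."""
    used_in_remainder = 2 * (n - nm_opt)     # R1 + R2 words stored on the remainder row
    unused = nm_opt - used_in_remainder      # word slots left empty on that row
    return Fraction(unused, nm_opt), Fraction(unused, 3 * nm_opt)


# N = 8, NMopt = 6 (that is, 3/4 of N).
row_waste, total_waste = triplet_waste(6, 8)
print(row_waste, total_waste)  # 1/3 1/9
```

One third of the remainder row, or one ninth of all cells in the triplet, holds no data in this configuration.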
Although the remainder method is in principle not limited to a triplet arrangement, and can use any n-tuple, the circuit complexity and the area overhead increase unacceptably for sets having 4 or more rows.
Until now, no method or memory architecture has been developed that can preserve address continuity while optimizing the height H of a memory device to a value dictated by an IC product specification, rather than optimizing the height H based on the memory specifications themselves.
Embodiments of the invention include a first block and a second block of memory cells having rows that cross both blocks, but columns in only one of the blocks. A word decoder is coupled to the rows of memory cells and can select one of the rows in the first and second blocks. A column decoder is coupled to the columns of memory cells and can select a set of columns from the first and second blocks, depending on the addresses input into it. An address splitter is coupled to both the word decoder and the column decoder, and passes the relevant portions of the address to each. In one embodiment, the address splitter passes the most significant bits of the address to the word decoder, and passes the remaining bits to a portion of the column decoder coupled to the first block only. The address splitter also modifies the remaining bits, and passes them to a portion of the column decoder coupled to the second block only. In this embodiment, the address splitter is coupled to a bit subtractor in order to perform the bit modification.
In another embodiment, a method of operating a memory device includes accepting an address at an input address circuit, and then determining, based on the address, whether it addresses data in the first block or in the second block. In one embodiment, this determination is made by comparing the address to the number of memory cells in the first block. If the address refers to data in the first block, the address is parsed into high and low portions, which are sent to the word and column decoders, respectively. If the address refers to data in the second block, the low portion of the address is modified, or remapped, and then sent to a circuit in the column decoder that is coupled only to the second data block.
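The method just described can be illustrated with a simplified sketch. The summary does not spell out the remapping, so the sketch makes several assumptions that are not taken from the text: the first block holds 2^m words per row, the second block holds 2^j words per row on the same NR rows, addresses are compared against the number of words in the first block, and the remapping is a plain subtraction of that number. The function name `folded_decode` and all parameters are hypothetical.

```python
def folded_decode(address, nr, m, j):
    """Return (block, row, word) for a two-block memory sharing NR rows.

    Hypothetical sketch: block 1 has 2**m words per row, block 2 has 2**j.
    """
    first_block_words = nr << m                   # words held by the first block
    total_words = first_block_words + (nr << j)   # words held by both blocks
    if not 0 <= address < total_words:
        raise ValueError("address outside the memory")
    if address < first_block_words:
        # First block: high bits to the word decoder, low bits to the column decoder.
        return 1, address >> m, address & ((1 << m) - 1)
    # Second block: remap by subtraction (assumed role of the bit subtractor),
    # then split the remapped value for the second-block column decoder.
    remapped = address - first_block_words
    return 2, remapped >> j, remapped & ((1 << j) - 1)


# Example: 4 rows, 4 words per row in block 1 (m = 2), 2 words per row in block 2 (j = 1).
print(folded_decode(5, 4, 2, 1))    # (1, 1, 1)
print(folded_decode(16, 4, 2, 1))   # (2, 0, 0)
print(folded_decode(23, 4, 2, 1))   # (2, 3, 1)
```

Under these assumptions each row holds 2^m + 2^j words, so the block height is no longer restricted to a power of two, while every address from 0 to the total word count minus one still maps to exactly one location.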