This invention relates in general to memory architecture, and in particular to a flexible memory architecture implementation that can be easily adapted in response to compositional changes in the die of a chip.
Computer systems may employ a multi-level hierarchy of memory, with relatively fast, expensive but limited-capacity memory at the highest level of the hierarchy and proceeding to relatively slower, lower cost but higher-capacity memory at the lowest level of the hierarchy. The hierarchy may include a relatively small, fast memory called a cache, either physically integrated within a processor or mounted physically close to the processor for speed. The computer system may employ separate instruction caches ("I-caches") and data caches ("D-caches"). In addition, the computer system may use multiple levels of caches. The use of a cache is generally transparent to a computer program at the instruction level and can thus be added to a computer architecture without changing the instruction set or requiring modification to existing programs.
Turning to FIG. 1, an integrated circuit (chip) design of the prior art for a microprocessor is shown. As shown in FIG. 1, the design for chip 100 includes a non-memory block, shown as CPU core 40, which may include such components as an ALU for integer execution, a floating point execution unit, and lower level caches (e.g., level 1 cache), as well as other various components. Therefore, as used herein a non-memory block may refer to a non-memory portion of an integrated circuit, such as the CPU core of a microprocessor chip. Furthermore, such a non-memory block may comprise several smaller, non-memory components therein, such as an ALU, floating point execution unit, and other non-memory components of a microprocessor chip. As processor speeds increase and greater performance is required for processors, it becomes increasingly important for larger caches to be implemented for a processor. As described above, cache memory is typically capable of being accessed by a processor very quickly. Thus, the more data contained in cache, the more instructions a processor can satisfy quickly by accessing the fast cache. That is, generally, the larger the cache implemented for a processor, the better the performance of such processor. Therefore, processor chips of the prior art commonly implement large cache structures. For example, as shown in FIG. 1, a higher level memory (e.g., level 2 cache) is implemented on the processor chip in memory blocks 10, 20, and 30. It is common in prior art designs for such additional memory to consume half (or even more) of the surface area of the die for a chip.
In memory architecture (or memory organization) of the prior art, memory blocks, such as memory blocks 10, 20, and 30, are typically implemented in relatively large, rectangular (or square) blocks. For example, memory blocks are commonly implemented having 256 by 256 memory cells, 512 by 512 memory cells, or 1024 by 1024 memory cells. Such memory blocks of the prior art are typically limited to being rectangular blocks. Each of the blocks 10, 20, and 30 typically has its own independent decode and input/output (I/O) circuitry. For example, block 10 may have decode circuitry 12 and I/O circuitry 13 that are utilized for the entire memory block 10. That is, common decode circuitry 12 and I/O circuitry 13 are typically utilized for the large memory block 10.
In integrated circuit designs of the prior art, a large rectangular block of memory, such as memory block 10, 20, or 30 of FIG. 1, typically comprises approximately 10 to 50 percent of the total memory implemented within the integrated circuit. Therefore, each block of memory typically provides a relatively large percentage of the total memory implemented in an integrated circuit. Also, because of the relatively large size and inflexible shape of prior art memory blocks, a relatively small number of blocks are typically implemented within an integrated circuit 100 of the prior art. For example, in prior art designs, typically no more than 10 memory blocks are implemented within an integrated circuit. Moreover, the memory blocks implemented in integrated circuits that comprise non-memory components are typically larger in size than most of the non-memory components implemented within such integrated circuit. For example, in a microprocessor chip 100, memory blocks 10, 20, and 30 are typically larger than most of the non-memory components contained within the CPU core 40, such as the ALU, floating point execution unit, etcetera.
Because the memory blocks 10, 20, and 30 of the prior art are typically implemented only as relatively large, rectangular blocks of memory, the organization of such memory within the chip 100 is very inflexible. For example, suppose in developing the core 40 for chip 100 a component, shown as component 42, needs to expand in size, thus requiring such component 42 to consume more surface space. For example, suppose that in designing component 42, it had to expand in size, in the manner illustrated in FIG. 1, in order to achieve its performance target. As shown in FIG. 1, it may be necessary for component 42 to expand such that it violates the boundary of rectangular cache block 10. Such a violation of cache block 10 is extremely problematic in prior art designs because it is very difficult to redesign prior art cache block 10 around the expanding component 42. For example, it is very difficult to redesign cache block 10 such that its upper, lefthand corner is cut out to make room for the expanding component 42. Therefore, such a redesign of cache block 10 would typically be very complex and time consuming, and therefore presents a large cost obstacle in designing the cache block 10 around the changing composition of the chip, as needed. For example, the large arrays of the prior art depend on their rectangular structure to share drivers and decoders.
Because of the great difficulty involved in redesigning such prior art cache block 10 to various shapes and sizes to respond to the changing composition of a chip (e.g., the expansion of component 42), designers typically respond to such changes in composition by moving (or relocating) an entire memory block within the chip. So, for example, in response to the changing size of component 42, which would otherwise violate the boundary of cache block 10, a designer of the prior art chip 100 would typically attempt to relocate the entire cache block 10 to a new location on chip 100. Often, such a relocation of the large, rectangular cache block 10 results in an undesirably large amount of white space (i.e., unused surface space of a chip) on the die. Additionally, sufficiently large blocks of space may not be available on the surface of chip 100 in which to relocate such a large rectangular block of cache. Thus, a smaller overall amount of cache memory may have to be implemented within chip 100 because sufficiently large blocks of space are not available for implementing one or more of the large rectangular blocks 10, 20, and 30. For example, because the memory block 10 is likely much larger than the non-memory component 42, it is difficult to rearrange the memory block 10 around the expanding non-memory component 42 in a desirable manner (e.g., in a manner that does not result in a large amount of white space on the chip 100). Therefore, organizing memory blocks within a chip of the prior art is typically a very difficult and complex task because of the inflexibility of the large, rectangular blocks commonly implemented in such prior art designs. That is, the large, rectangular blocks of memory typically implemented in prior art designs are very inflexible and result in great difficulty in reorganizing such memory blocks in response to changes in the composition of a chip.
In memory architecture of the prior art, memory blocks, such as blocks 10, 20, and 30 of FIG. 1, are commonly implemented with redundancy. For example, each of memory blocks 10, 20, and 30 may comprise smaller sub-blocks of memory therein. Also, each memory block 10, 20, and 30 may include a redundant sub-block therein, such as redundant sub-blocks 11, 21, and 31. It is common in manufacturing (or "fabricating") memory blocks within a chip that a defect may occur within a portion of a memory block. That is, a portion of a memory block may not allow for the proper storage and/or retrieval of data. Accordingly, redundant sub-blocks are typically implemented within such memory blocks, which can be used to effectively replace a defective sub-block of memory within the chip. For example, redundant sub-block 11 may be utilized to replace a defective sub-block of memory within the large memory block 10. Likewise, redundant sub-blocks 21 and 31 may each be utilized to replace defective sub-blocks within memory blocks 20 and 30, respectively.
Each sub-block of memory may typically be referred to as a "column" of memory. However, such a "column" of memory may actually comprise multiple columns and rows of memory cells. As shown in FIG. 1, a redundant sub-block (or column) is typically utilized to repair a defective column within a memory block. Accordingly, defective columns may be repaired by re-routing data from a defective column to the redundant column for a block of memory. However, in typical prior art designs implementing such column redundancy, defects that exist in "rows" of a memory block may not be repairable by a redundant column. Moreover, a redundant sub-block (e.g., redundant column) typically does not allow for repairing defects in the memory block's decoder circuitry or I/O circuitry. Thus, some defects that may occur within a memory block are not capable of being repaired with a redundant sub-block of a prior art design. Therefore, prior art redundancy implementations typically allow little flexibility in repairing defects of a memory block.
In view of the above, a desire exists for a memory architecture that provides flexibility in how the memory may be organized within an integrated circuit. That is, a desire exists for a memory architecture that provides sufficient flexibility to enable designers to easily organize the memory component of an integrated circuit around other components of the integrated circuit in a desirable manner. A further desire exists for a memory architecture that provides greater flexibility in repairing defects within the memory component of an integrated circuit. That is, a desire exists for a memory architecture that enables a greater number of defects to be repairable through redundancy within the memory component of an integrated circuit.
These and other objects, features and technical advantages are achieved by a system and method which implement a memory component of an integrated circuit as multiple, relatively small sub-arrays of memory. In a preferred embodiment, the memory component of an integrated circuit is implemented as multiple, relatively small sub-arrays of memory, which enable a designer great flexibility in arranging such sub-arrays within an integrated circuit. That is, the small sub-arrays of memory enable a designer to easily arrange the memory component of an integrated circuit around the non-memory components of such integrated circuit in a desirable manner. Thus, a designer may arrange the sub-arrays of memory around the non-memory components of an integrated circuit such that the non-memory components do not violate the boundary of the memory component. Further, a designer may arrange the sub-arrays of memory in a manner that minimizes the amount of white space on an integrated circuit. Alternatively, a designer may arrange the sub-arrays of memory in a manner that provides a desired amount of white space strategically positioned within an integrated circuit to provide margin around portions of the integrated circuit that have uncertain dimensions early in the design stages.
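By way of illustration only, the arrangement of small sub-arrays around non-memory components can be sketched as a toy tiling model. The grid dimensions, the unit-tile sub-array size, the greedy row-by-row placement, and the names `rect` and `place_subarrays` are all assumptions made for this sketch and are not taken from the specification; an actual floorplan would be governed by routing, timing, and other physical constraints.

```python
# Toy floorplan model (all sizes and names are illustrative assumptions).
# The die is a grid of tiles; each small sub-array occupies exactly one tile.

def rect(x0, y0, w, h):
    """Return the set of (x, y) tiles covered by a w-by-h rectangle."""
    return {(x, y) for x in range(x0, x0 + w) for y in range(y0, y0 + h)}

def place_subarrays(die_w, die_h, blocked, needed):
    """Greedily place up to `needed` unit-tile sub-arrays on free tiles."""
    placed = []
    for y in range(die_h):
        for x in range(die_w):
            if len(placed) == needed:
                return placed
            if (x, y) not in blocked:
                placed.append((x, y))
    return placed

DIE_W, DIE_H = 8, 4
core = rect(0, 0, 4, 4)              # non-memory block (e.g., a CPU core)
grown_component = rect(4, 0, 2, 1)   # a component that expanded past the core
blocked = core | grown_component

# Small sub-arrays simply flow around the expanded component.
tiles = place_subarrays(DIE_W, DIE_H, blocked, needed=14)
white_space = DIE_W * DIE_H - len(blocked) - len(tiles)
print(len(tiles), "sub-arrays placed;", white_space, "tiles of white space")
```

In this model a single 4-by-4 rectangular memory block would no longer fit beside the core once the component grows, whereas the unit-tile sub-arrays fill every remaining free tile.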
In a preferred embodiment, the memory component of an integrated circuit is implemented as multiple memory sub-arrays that are each independent. For example, in a preferred embodiment, each memory sub-array comprises its own decode circuitry for decoding memory addresses that are being requested to be accessed by an instruction, and each memory sub-array comprises its own I/O circuitry. Thus, in a preferred embodiment, each memory sub-array is physically and electrically independent of the other memory sub-arrays.
In one implementation of a preferred embodiment, each of the independent memory sub-arrays implemented in an integrated circuit comprises no more than approximately 5 percent of the total memory implemented on the integrated circuit. Most preferably, each of the independent memory sub-arrays implemented in an integrated circuit comprises approximately 1 percent of the total memory implemented on the integrated circuit. In another implementation of a preferred embodiment, each of the independent memory sub-arrays on an integrated circuit is no larger than approximately the average size of other non-memory components implemented on the integrated circuit. Therefore, in a preferred embodiment, each independent sub-array is relatively small in size to enable great flexibility in organizing the memory on an integrated circuit. Additionally, in a preferred embodiment, the memory component of an integrated circuit comprises at least 20 independent memory sub-arrays. More preferably, the memory component of an integrated circuit comprises at least 30 independent memory sub-arrays, and even more preferably, the memory component of an integrated circuit comprises at least 50 independent sub-arrays. Additionally, in a most preferred embodiment, the memory component of an integrated circuit comprises approximately 100 independent sub-arrays. In a most preferred embodiment, the integrated circuit comprises a processor and the memory component of the integrated circuit comprises a cache for the processor, and most preferably such memory component comprises at least 1 megabyte of cache memory for the processor.
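The relationship between the sub-array counts and the percentages recited above can be checked with simple arithmetic. The sketch below assumes, purely for illustration, that all sub-arrays are of equal size and that the memory component is exactly 1 megabyte; neither assumption is stated in the specification, and the function names are hypothetical.

```python
# Illustrative arithmetic only. Assumes equally sized sub-arrays and a
# memory component of exactly 1 megabyte (both are assumptions).
TOTAL_CACHE_BYTES = 1 * 1024 * 1024

def subarray_share(num_subarrays):
    """Fraction of the total memory held by each equally sized sub-array."""
    return 1.0 / num_subarrays

def subarray_bytes(total_bytes, num_subarrays):
    """Capacity of each sub-array in bytes (may be fractional in this model)."""
    return total_bytes / num_subarrays

for count in (20, 30, 50, 100):
    share = subarray_share(count)
    size = subarray_bytes(TOTAL_CACHE_BYTES, count)
    print(f"{count:3d} sub-arrays: {share:.1%} of total, about {size:.0f} bytes each")
```

Under these assumptions, 20 sub-arrays correspond to the approximately 5 percent upper bound, and approximately 100 sub-arrays correspond to the approximately 1 percent figure.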
As discussed above, in a preferred embodiment, the memory component of an integrated circuit is implemented as multiple, small sub-arrays, which enable great flexibility in organizing the memory component within an integrated circuit. As also discussed above, in a preferred embodiment, each sub-array is implemented as an independent, stand-alone array of memory. As a result, such independent sub-arrays of memory may be implemented as redundant sub-arrays that are capable of effectively repairing any defect within another sub-array. That is, redundant sub-arrays can be implemented within the memory component of an integrated circuit that are capable of replacing a defective sub-array (e.g., by rerouting data from the defective sub-array to the redundant sub-array). Because the entire defective sub-array is replaceable with a redundant sub-array, a preferred embodiment provides great flexibility in repairing any defect that is detected within a memory sub-array.
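The rerouting of data from a defective sub-array to a redundant sub-array may be sketched, in software terms, as a small routing table. The class names `SubArray` and `MemoryComponent`, the word-addressed model, and the use of a list as the routing table are hypothetical and are chosen only to illustrate the idea; the specification does not describe the actual repair mechanism at this level of detail.

```python
# Minimal software model of whole-sub-array redundancy (names are hypothetical).

class SubArray:
    """Stand-in for one independent sub-array with its own decode and I/O."""
    def __init__(self, num_words):
        self.words = [0] * num_words
        self.defective = False

    def write(self, offset, value):
        self.words[offset] = value

    def read(self, offset):
        return self.words[offset]


class MemoryComponent:
    """Routes each logical sub-array index to a physical sub-array."""
    def __init__(self, num_active, num_spare, words_per_subarray):
        self.words_per_subarray = words_per_subarray
        total = num_active + num_spare
        self.subarrays = [SubArray(words_per_subarray) for _ in range(total)]
        self.route = list(range(num_active))        # logical -> physical index
        self.spares = list(range(num_active, total))

    def repair(self, logical_index):
        """Replace the sub-array behind logical_index with a redundant one."""
        if not self.spares:
            raise RuntimeError("no redundant sub-array available")
        self.subarrays[self.route[logical_index]].defective = True
        self.route[logical_index] = self.spares.pop(0)

    def _locate(self, address):
        logical_index, offset = divmod(address, self.words_per_subarray)
        return self.subarrays[self.route[logical_index]], offset

    def write(self, address, value):
        subarray, offset = self._locate(address)
        subarray.write(offset, value)

    def read(self, address):
        subarray, offset = self._locate(address)
        return subarray.read(offset)


mem = MemoryComponent(num_active=4, num_spare=1, words_per_subarray=8)
mem.repair(2)          # a defect is detected somewhere in logical sub-array 2
mem.write(17, 99)      # address 17 falls in logical sub-array 2, offset 1
print(mem.read(17))
```

Because the replacement is made at the granularity of an entire sub-array, the model does not need to know whether the defect lay in a row, a column, the decode circuitry, or the I/O circuitry of the replaced sub-array.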
It should be appreciated that a technical advantage of one aspect of the present invention is that a flexible memory architecture is provided. Accordingly, a memory architecture of a preferred embodiment allows a designer great flexibility in organizing a memory component of an integrated circuit. For example, a memory architecture of a preferred embodiment allows a designer to readily respond to compositional changes within an integrated circuit by easily reorganizing the memory component of such integrated circuit. A further technical advantage of one aspect of the present invention is that the memory component of an integrated circuit may be organized in an optimum manner. For example, the memory component of an integrated circuit may be organized in a manner that minimizes the amount of white space within the integrated circuit (e.g., by arranging sub-arrays of memory on substantially all of the available white space of a chip). As another example, the memory component of an integrated circuit may be organized in a manner that provides a desired amount of white space positioned strategically within the integrated circuit during the design phase. It should be recognized that in general, a designer's goal is to minimize the amount of white space present in an integrated circuit at the end of the design phase. However, during the design phase it may be helpful to budget white space within the integrated circuit to be used as margin when other components (e.g., the CPU core) within the circuit grow, as they often do throughout the actual design phase. Yet a further technical advantage of one aspect of the present invention is that great flexibility is available in repairing defects within the memory component of an integrated circuit. That is, because the entire defective sub-array is replaceable with a redundant sub-array in a preferred embodiment, such a preferred embodiment provides great flexibility in repairing any defect that is detected within a memory sub-array.
The foregoing has outlined rather broadly the features and technical advantages of the present invention in order that the detailed description of the invention that follows may be better understood. Additional features and advantages of the invention will be described hereinafter which form the subject of the claims of the invention. It should be appreciated by those skilled in the art that the conception and specific embodiment disclosed may be readily utilized as a basis for modifying or designing other structures for carrying out the same purposes of the present invention. It should also be realized by those skilled in the art that such equivalent constructions do not depart from the spirit and scope of the invention as set forth in the appended claims.