Modern high-end microprocessors often demand larger amounts of cache memory (e.g., level 3 cache, or “L3” cache) to boost performance. Typically, memory cell sizes decrease as microprocessor technology scales, and variation in a memory cell can be inversely proportional to the area that the cell occupies. Accordingly, single-bit failures inside large memory products are becoming a more critical issue. To maintain a desired manufacturing yield, many such products include redundancy repair inside the memory to mitigate the effects of process variation at a reasonable overhead.
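The relationship between array size, single-bit failures, and yield can be illustrated with a rough calculation. The following is a minimal sketch only, assuming independent bit failures, a Poisson approximation for the number of failing bits, and an idealized repair scheme in which any failing bit can be repaired up to a fixed limit. The function name `poisson_yield` and all parameter values are hypothetical and are not taken from any particular product.

```python
import math


def poisson_yield(num_bits, bit_fail_prob, max_repairs):
    """Estimate the fraction of memories that are usable.

    Assumptions (illustrative only): bit failures are independent, the
    count of failing bits is approximately Poisson with mean
    num_bits * bit_fail_prob, and up to max_repairs failing bits can
    always be repaired regardless of where they fall.
    """
    mean_failures = num_bits * bit_fail_prob
    # Sum P(K = k) for k = 0..max_repairs, i.e. the Poisson CDF.
    return sum(
        math.exp(-mean_failures) * mean_failures ** k / math.factorial(k)
        for k in range(max_repairs + 1)
    )


if __name__ == "__main__":
    bits = 32 * 1024 * 1024  # hypothetical 32 Mbit array
    p = 1e-7                 # hypothetical per-bit failure probability
    for repairs in (0, 2, 8):
        print(repairs, round(poisson_yield(bits, p, repairs), 4))
```

Under these assumptions, the estimated yield with no repair falls exponentially as the number of bits grows, while even a small number of repairable bits recovers much of the loss, which is consistent with the motivation for redundancy described above.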
Because bit failures can be randomly distributed across an entire memory array (a bit failure can occur in any bitcell of any sub-array of the memory), conventional redundancy approaches tend to add redundancy to each sub-array of the memory. For example, the redundancy can be configured for row repair or column repair, which can address bit failures in a manner that is fairly straightforward to implement but can also add appreciable overhead and cost. Another conventional type of redundancy is duplicated block repair, which can be useful, for example, in memories that have extremely high clock frequencies, bank-interleaved access for supporting single-cycle throughput, and multi-cycle latency. Traditionally, the duplicated block repair has the same design as the primary memory arrays and uses multiple instantiations in the cluster (e.g., the same number as the primary banks) to meet throughput requirements.
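Conventional row repair can be pictured as an address remap: accesses to a row known to contain a failing bitcell are steered to a spare row in the same sub-array. The following is a simplified software model of that idea, offered only as a sketch. The class name `RowRepairedSubArray` and its methods are hypothetical; in an actual memory the remap is typically implemented in hardware (for example, with fuses and address comparators) rather than in software.

```python
class RowRepairedSubArray:
    """Toy model of one sub-array with spare rows for row repair."""

    def __init__(self, num_rows, num_spare_rows):
        self.rows = [0] * num_rows          # primary rows
        self.spares = [0] * num_spare_rows  # redundant rows (the overhead)
        self.remap = {}                     # failing row -> spare row index

    def repair(self, bad_row):
        """Assign a spare to bad_row; return False if no spare is left."""
        if not 0 <= bad_row < len(self.rows):
            raise IndexError("row out of range")
        if bad_row in self.remap:
            return True
        if len(self.remap) >= len(self.spares):
            return False
        self.remap[bad_row] = len(self.remap)
        return True

    def write(self, row, value):
        if row in self.remap:
            self.spares[self.remap[row]] = value  # steer to spare row
        else:
            self.rows[row] = value

    def read(self, row):
        if row in self.remap:
            return self.spares[self.remap[row]]
        return self.rows[row]


if __name__ == "__main__":
    sub = RowRepairedSubArray(num_rows=8, num_spare_rows=1)
    sub.repair(3)          # row 3 is assumed to hold a failing bitcell
    sub.write(3, 0xAB)
    print(hex(sub.read(3)))
    print(sub.repair(5))   # only one spare exists in this sub-array
```

In this model, every sub-array carries its own spare rows whether or not it ever has a failure, which illustrates why per-sub-array redundancy can add appreciable overhead when failures are spread randomly across the array.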