If a user desires an embedded volatile random access memory (RAM), there are generally two choices available: static random access memory (SRAM) and dynamic random access memory (DRAM). DRAM requires just one transistor per storage cell whereas SRAM requires six transistors per memory cell. DRAM thus demands less die area and is cheaper to manufacture than SRAM. However, the cross-coupled inverters in an SRAM cell help drive the bit lines during a read operation whereas a DRAM memory cell provides only the ephemeral charge stored on its relatively small storage capacitor. Thus, SRAM is much faster than DRAM. A designer therefore has two choices: cheap and slow (DRAM) or fast and expensive (SRAM). For this reason, SRAM tends to be reserved for time-critical implementations such as caches. Given the expense of implementing SRAM, it is desirable to optimize SRAM performance.
One barrier to optimizing performance of SRAMs is that they must respond to an external clock. This dependence imposes a variety of restrictions on the SRAM. For instance, suppose an SRAM write operation is triggered by the rising edge of an external clock. In a write operation, the SRAM's x-decoder (word line driver) decodes an address so as to assert the corresponding word line. The asserted word line is de-asserted after the write operation is completed. This reset of the word line is typically triggered by the subsequent falling edge of the external clock. Thus, the write operation must be completed during the half clock cycle (assuming a 50-50 duty cycle) in which the external clock is held high. A read operation is similar in that it too must be completed during a half-cycle of the external clock. Conversely, should the SRAM be triggered by the falling edge of the external clock, it must complete its read or write operations during the time the external clock is held low. For a double-data-rate SRAM, a read or write operation would have to be completed within each half-cycle of the external clock. Furthermore, the SRAM is subject to the clock jitter and other timing problems of the external clock. Therefore, there is a need in the art for an improved RAM design that is independent of the duty cycle and jitter of an external clock.
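The half-cycle constraint described above can be illustrated with a short calculation. The following is a minimal sketch, not a description of any particular SRAM: the function name `access_window_ns`, its parameters, and the jitter model (each clock edge may shift by up to `jitter_ns`, so in the worst case the two edges move toward each other) are assumptions introduced here for illustration.

```python
def access_window_ns(clock_mhz, duty_high=0.5, jitter_ns=0.0, trigger="rising"):
    """Worst-case time (ns) available for one read or write operation.

    Assumed model: the word line is asserted on the triggering clock edge
    and reset on the next opposite edge, so the operation must fit in the
    high phase (rising-edge trigger) or the low phase (falling-edge trigger).
    """
    if not 0.0 < duty_high < 1.0:
        raise ValueError("duty_high must be strictly between 0 and 1")
    if trigger not in ("rising", "falling"):
        raise ValueError("trigger must be 'rising' or 'falling'")

    period_ns = 1000.0 / clock_mhz
    # Fraction of the period during which the word line can stay asserted.
    phase_fraction = duty_high if trigger == "rising" else 1.0 - duty_high
    # Assumption: each of the two bounding edges may move by jitter_ns,
    # so the window can shrink by up to twice that amount.
    window = period_ns * phase_fraction - 2.0 * jitter_ns
    return max(window, 0.0)


if __name__ == "__main__":
    # Hypothetical numbers chosen only to show the trend.
    print(access_window_ns(500))                                  # ideal 50-50 clock
    print(access_window_ns(500, duty_high=0.4, jitter_ns=0.05))   # skewed duty cycle plus jitter
```

Under these assumptions, a skewed duty cycle or added jitter directly shortens the time in which the operation must complete, even though the clock frequency is unchanged.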
RAM performance also depends on efficient sense command generation. In general, a RAM must model the delay necessary to develop the word line voltage to drive the gates of the access transistor(s) such that the accessed memory cell couples to the bit line. Having modeled this delay, the RAM must then model the bit line voltage development. Having modeled both the word line voltage development and the bit line voltage development, the RAM may then assert a sense command such that a sense amplifier coupled to the developed bit line may make a bit decision as to the binary contents of the accessed memory cell. The bit line voltage development must be buffered up to trigger the sense command generation. This buffering involves delay and thus reduces the effective speed of the RAM. Accordingly, there is a need in the art for improved sense command generation schemes.
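The delay chain described above can be sketched as a simple sum of three contributions. This is an illustrative model only: the class `SensePath`, its field names, and the example delay values are hypothetical and are not taken from any actual design.

```python
from dataclasses import dataclass


@dataclass
class SensePath:
    """Assumed first-order model of the delays preceding the sense command."""
    word_line_ns: float   # modeled word line voltage development
    bit_line_ns: float    # modeled bit line voltage development
    buffer_ns: float      # delay of buffering the developed bit line signal

    def sense_command_ns(self) -> float:
        # The sense command can be asserted only after all three stages.
        return self.word_line_ns + self.bit_line_ns + self.buffer_ns

    def buffer_overhead(self) -> float:
        # Fraction of the total delay spent purely on buffering.
        return self.buffer_ns / self.sense_command_ns()


if __name__ == "__main__":
    # Hypothetical delays in nanoseconds.
    path = SensePath(word_line_ns=0.25, bit_line_ns=0.5, buffer_ns=0.25)
    print(path.sense_command_ns())
    print(path.buffer_overhead())
```

In this sketch the buffering term adds nothing to the voltage development itself, so any reduction in it shortens the time to the sense command one for one.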
As discussed above, an SRAM memory cell includes cross-coupled inverters that actively drive the contents of the cell onto the corresponding bit lines. The SRAM sense amplifier detects the resulting bit line voltage development to make a bit decision. As memory densities continue to increase, the capacitance of the SRAM sense amplifier becomes appreciable as compared to the capacitance of the bit lines. The higher the SRAM sense amplifier capacitance, the more power is consumed during read and write operations. Accordingly, there is a need in the art for an improved SRAM sense amplifier architecture that provides reduced power consumption.
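The relationship between sense amplifier capacitance and power can be illustrated with a first-order estimate. The sketch below assumes the familiar approximation that the energy drawn from the supply to charge a capacitance C through a full swing V is C·V²; actual bit lines often swing by less than the full supply, so the numbers are indicative only. The function names and the example capacitances are hypothetical.

```python
def switched_energy_fj(bit_line_ff, sense_amp_ff, swing_v):
    """First-order energy per access in femtojoules (assumes E = C * V^2).

    Capacitances are in femtofarads and the swing in volts, so the
    product is in femtojoules.
    """
    return (bit_line_ff + sense_amp_ff) * swing_v ** 2


def sense_amp_share(bit_line_ff, sense_amp_ff):
    """Fraction of the switched capacitance contributed by the sense amplifier."""
    return sense_amp_ff / (bit_line_ff + sense_amp_ff)


if __name__ == "__main__":
    # Hypothetical values: the bit line shrinks while the sense amplifier
    # stays the same, so the amplifier's share of the energy grows.
    for bit_line_ff in (200.0, 100.0, 50.0):
        print(bit_line_ff,
              switched_energy_fj(bit_line_ff, 25.0, 1.0),
              sense_amp_share(bit_line_ff, 25.0))
```

Under this assumption, a fixed sense amplifier capacitance accounts for a growing fraction of the per-access energy as the bit line capacitance falls.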
The x-decoder design is another area critical to RAM performance. The x-decoder decodes address bits so as to assert the appropriate word line and is thus also denoted as a row decoder. The x-decoder typically is triggered to decode a presented address by a rising or falling edge of an externally-provided clock. Once that external clock has triggered a decoding operation, whatever source is providing the address to the x-decoder is then free to change the address bits so as to prepare for a read or write operation at the next clock cycle. Thus, it is conventional for an x-decoder to latch or register the presented address bits so that the decoded address does not change while the external source is changing the address bits for the next clock cycle operation. This latching of address bits consumes power and introduces delay. Accordingly, there is a need in the art for improved x-decoder architectures.
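The conventional address registering described above can be sketched behaviorally as follows. This is a simplified software model, assuming capture on the rising clock edge; the class name `AddressRegister` and the example addresses are hypothetical, and the model does not capture the power or delay cost of a real latch.

```python
class AddressRegister:
    """Behavioral model of an address register in front of an x-decoder.

    Assumption: the presented address is captured on each rising edge of
    the external clock and held until the next rising edge.
    """

    def __init__(self):
        self.held = None      # address currently seen by the decoder
        self._prev_clk = 0    # previous clock level, used to detect edges

    def tick(self, clk, presented_address):
        # Capture only on a 0 -> 1 transition of the clock.
        if clk == 1 and self._prev_clk == 0:
            self.held = presented_address
        self._prev_clk = clk
        return self.held


if __name__ == "__main__":
    reg = AddressRegister()
    reg.tick(0, 0x1A)            # clock low: nothing captured yet
    print(reg.tick(1, 0x1A))     # rising edge: address is captured
    print(reg.tick(1, 0x2B))     # source changes the address: decoder input is unchanged
    print(reg.tick(0, 0x2B))     # still held through the low phase
    print(reg.tick(1, 0x2B))     # next rising edge: new address is captured
```

The held value is what isolates the decoder from the source's early address changes, which is the function the conventional latch pays for in power and delay.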