The present invention relates to SRAM interfaces for controlling embedded SDRAM for system on chip (SOC) applications.
Commodity SRAM and DRAM memory devices are typically used in personal computer systems for data storage. In a personal computer system, the memories are controlled either by the CPU or by an on-board memory controller. To ease system level design, commodity memories are usually required to meet minimum industry performance standards, or system manufacturer imposed performance standards such as the Intel PC133 SDRAM standard. These standards allow for the design of memory controllers that can maximize the performance of any memory that meets industry standards. Hence signalling and timing details for operating the memory are somewhat transparent to system designers, since they are under the control of a CPU or memory controller. However, the limited bandwidth, caused in part by the capacitance and inductance of the wire leads of the external memory, prevents the memory from operating at its full potential.
One solution to this problem is to embed memory onto processors and other system-on-chip (SOC) devices, such as application specific integrated circuits (ASIC). But because SOC devices are usually application specific, there are very few standards that need to be adhered to due to the customized nature of the memory.
SRAM is commonly embedded in SOC devices because of the compatibility of its manufacturing process with logic manufacturing processes. The drawback of embedding SRAM is the relatively large area it occupies for a small storage density. New manufacturing processes allow DRAM to be embedded onto SOC devices. Embedded DRAM, such as SDRAM, is practical for SOC devices where large amounts of memory are required without consuming excessive silicon area. This high level of integration reduces costs and provides other well-known system level benefits, especially for portable applications, or in applications where physical space is limited.
Despite the advantages of embedding DRAM over SRAM onto SOC devices, overall system performance of embedded DRAM remains inadequate for high-speed applications. Control of an SRAM does not require many signals, and the timing of these signals is not heavily constrained. Hence SOC designers have been able to directly access embedded SRAM with minimum additional peripheral logic. For example, a read operation from an asynchronous SRAM only requires a write enable signal (WE) to be at the high logic level and a change in the address from which data is to be read. SDRAM, on the other hand, is more complex because it requires more signals, and the timing of the signals is tightly constrained within preset limits. Memory addresses are multiplexed into row and column addresses, and specific combinations of column address strobe (CAS), row address strobe (RAS), write enable (WE), chip select (CS) and specific address signals are applied to issue specific commands which determine the DRAM operation. In a read operation, row addresses must be asserted during a "bank activation" command cycle. Then a fixed time interval must pass before a "read" command and column addresses can be asserted. This fixed time interval is typically specified by the SDRAM manufacturer, and can vary from manufacturer to manufacturer. Due to this additional complexity, simple interfaces, or emulators, that allow SOC processors to transparently access embedded DRAM have been developed.
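The difference in access complexity described above can be sketched as follows. This is an illustrative model only: the function names, the command labels and the 20 ns interval used in the usage note are hypothetical and are not taken from any SDRAM data sheet.

```python
from typing import List, Tuple

# Hypothetical command labels used only for this illustration.
Step = Tuple[str, object]


def sram_read_sequence(address: int) -> List[Step]:
    """An asynchronous SRAM read: WE held high and the address changed."""
    return [("READ", address)]


def sdram_read_sequence(row: int, col: int, activate_to_read_ns: float) -> List[Step]:
    """An SDRAM read: bank activation with the row address, a fixed
    manufacturer-specified interval, then a read command with the
    column address."""
    return [
        ("BANK_ACTIVATE", row),
        ("WAIT_NS", activate_to_read_ns),
        ("READ", col),
    ]
```

For example, `sdram_read_sequence(3, 7, 20.0)` yields three steps where the SRAM model needs one, which is the extra sequencing burden an interface circuit would take over from the SOC processor.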
SRAM interfaces have been chosen because SRAMs are simple and straightforward to access. In other words, the SOC processor "sees" an SRAM device through the SRAM interface and issues SRAM control signals to access the SDRAM memory. The SRAM interface then generates the appropriate SDRAM control signals and converts received linear SRAM addresses into separate row and column addresses. Just as importantly, the SRAM interface also controls timing for activating the appropriate SDRAM control signals.
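The conversion of a linear SRAM address into row and column addresses can be sketched as below. The split shown, with the column address in the low-order bits and the row address in the high-order bits, is an assumption made for illustration, as are the function name and the bit widths; the actual mapping depends on the memory organization.

```python
from typing import Tuple


def split_linear_address(linear_addr: int, row_bits: int, col_bits: int) -> Tuple[int, int]:
    """Split a linear address into (row, column).

    Assumes (hypothetically) that the column occupies the low-order
    col_bits and the row occupies the next row_bits.
    """
    if not 0 <= linear_addr < (1 << (row_bits + col_bits)):
        raise ValueError("address out of range for the given row/column widths")
    col = linear_addr & ((1 << col_bits) - 1)  # low-order bits select the column
    row = linear_addr >> col_bits              # remaining bits select the row
    return row, col
```

With 8 row bits and 8 column bits, `split_linear_address(0x1234, 8, 8)` gives row `0x12` and column `0x34`.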
FIG. 1 is a block diagram of a prior art graphic processing ASIC that uses embedded DRAM memory. This ASIC has a video codec engine (VCE), digital signal processor (DSP), video processing unit (VPU), memory interface 50 and SDRAM memory 52 divided into two different blocks. SDRAM memory 52 can be stacked, trench or planar capacitor DRAM. Memory interface 50 is an SRAM interface, which generates SDRAM control signals from SRAM type commands. Unfortunately, prior art SRAM interfaces generate SDRAM control signals with worst-case scenario timing. More specifically, the SDRAM control signals are activated at times well beyond the minimum required time. This is mainly due to the fact that the internal SDRAM clock signals are generated synchronously to the external system clock of the SOC device. FIG. 2 shows a read access timing diagram for the system of FIG. 1.
The timing diagram of FIG. 2 shows traces for the system clock signal SCLK and addresses and commands ADDR/CMND received by memory interface 50 for generation of activate signal ACT, row clock RCLK, column clock CCLK, precharge clock signal PCHCK and output data Q. This example illustrates a read operation from the system of FIG. 1. The SDRAM memory provides data Q in response to the signals generated by the memory interface 50. Commands are latched on the first rising edge of SCLK 60, and decoded to generate the active ACT signal shortly thereafter. The rising edge of ACT triggers the generation of the RCLK pulse for latching a row address to activate the appropriate memory bank. Another command is latched and decoded on the second rising edge of SCLK 62 for generating the CCLK pulse. A column address is latched on the rising edge of the CCLK pulse, resulting in the output of valid data Q shortly thereafter. A PCHCK precharge pulse is then generated to precharge all the memory banks in preparation for subsequent accesses. The system of FIG. 1 requires a minimum of two clock cycles to provide data after the initial read command and row address are latched. This is due to the fact that the row control signal RCLK is generated in the first system clock cycle and the column control signal CCLK is generated in the subsequent system clock cycle.
Unfortunately, the embedded SDRAM is capable of providing data earlier than the system of FIG. 1 allows. More specifically, the embedded SDRAM is capable of latching column addresses earlier by issuing the CCLK pulse earlier. But because the row and column clock signals RCLK and CCLK are synchronized to the system clock SCLK, the earliest that the CCLK pulse could appear is after the second rising edge of the system clock. In a practical example, if the SDRAM core has a minimum access time of 5 ns, the time between CCLK and valid data is 2.5 ns, precharge requires 1.5 ns and the SCLK is a 100 MHz clock with a period of 10 ns, the system of FIG. 1 would require a minimum of 12.5 ns to generate valid data. 12.5 ns is the sum of one full clock cycle time plus the CCLK to valid data time. However, the SDRAM memory is capable of providing data in 7.5 ns. In operation though, the system of FIG. 1 only provides new data every 20 ns, or every two clock cycles. The SDRAM, on the other hand, is capable of providing new data every 9 ns, which is the sum of 7.5 ns as previously discussed plus 1.5 ns of precharge time.
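The arithmetic of this example can be checked with the short calculation below. The constant names are illustrative, and reading the 7.5 ns figure as the 5 ns core access time plus the 2.5 ns CCLK to valid data time is an assumption; the text states the 7.5 ns total without itemizing it.

```python
# Values from the example, in nanoseconds.
T_SCLK = 10.0        # period of the 100 MHz system clock
T_CORE = 5.0         # minimum access time of the SDRAM core
T_CCLK_TO_Q = 2.5    # time between CCLK and valid data
T_PRECHARGE = 1.5    # precharge time

# Prior art: CCLK cannot appear before the second rising edge of SCLK.
prior_art_latency = T_SCLK + T_CCLK_TO_Q       # 12.5 ns to valid data
prior_art_cycle = 2 * T_SCLK                   # new data every 20 ns

# Capability of the SDRAM itself (assumed breakdown of the 7.5 ns figure).
core_latency = T_CORE + T_CCLK_TO_Q            # 7.5 ns to valid data
core_cycle = core_latency + T_PRECHARGE        # new data every 9 ns

print(prior_art_latency, prior_art_cycle, core_latency, core_cycle)
```

On these figures the prior art interface spends 20 ns per access on a memory that could complete one in 9 ns, so an access that fits within a single 10 ns system clock cycle is within the capability of the core.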
Some SRAM interface designs attempt to improve SDRAM access times by using clock multipliers to generate intermediate high frequency clock signals from the external system clock. Although this technique will increase SDRAM performance, it is not cost effective because it is difficult to design a clock circuit that will reliably generate a high frequency clock signal. Additionally, this technique does not fully optimise SDRAM performance because it is inherently difficult to control an SDRAM that operates in a clock domain having finite frequency granularity.
Hence, SOC devices having embedded SDRAM memory will not operate at their full potential due to limitations in the SRAM interface that controls them.
Therefore, there is a need for an SRAM interface that generates control signals with the appropriate timing for maximizing embedded SDRAM performance.
It is an object of the present invention to obviate or mitigate at least one disadvantage of previous SRAM interface circuits. In particular, it is an object of the present invention to provide a system and method for maximizing embedded SDRAM performance.
In a first aspect, the present invention provides an interface circuit for controlling embedded DRAM memory having a row timing circuit for activating row decoders and bitline sense amplifiers, and column decoders for accessing bitlines. The interface circuit includes a command decoder for receiving command signals and providing control signals in response to a system clock, and a clock sequencer for activating the row timing circuit in response to the control signals, and for activating the column decoders at a predetermined delay after activation of the row timing circuit. The column decoders are activated after the bitline sense amplifiers are activated and within the same system clock cycle that the row timing circuit is activated.
In alternate embodiments of the present aspect, the command signals include SRAM control signals, the clock sequencer generates a precharge clock signal for precharging the bitlines after the column decoders are activated, and the row timing circuit generates a sense amplifier activation signal for turning on the bitline sense amplifiers. In yet another embodiment of the present aspect, the clock sequencer includes a row timing emulator for generating an emulated row timing signal at the same time the sense amplifier activation signal is generated, and a margin delay circuit for receiving the emulated row timing signal and generating a column clock signal for activating the column decoders.
In alternate aspects of the present embodiment, the row timing emulator is substantially identical to the row timing circuit and has a layout substantially identical to the layout of the row timing circuit. The margin delay circuit includes programmable delay circuits for delaying generation of the column clock signal, and receives a test signal for delaying generation of the column clock signal. In a further aspect of the present embodiment, the clock sequencer precharges the bitlines when an inactive clock cycle following an active system clock cycle is detected by a page mode control circuit.
In a second aspect, the present invention provides a method for accessing a memory bank of an embedded DRAM within a single clock cycle of a system clock controlled by an interface circuit synchronized to the system clock. The method includes receiving address and commands on an edge of the system clock, activating row decoders for driving a wordline of the memory bank corresponding to the address, activating bitline sense amplifiers of the memory bank, and activating column decoders of the memory bank at a predetermined delay time after the row decoders are activated.
In alternate embodiments of the present aspect, the delay time is longer for a read command than for a write command, and the bitlines of the memory bank are precharged after the column decoders are activated. In a further embodiment of the present aspect, the interface is set for page mode operation where the bitlines of unselected memory banks are precharged after address and commands are received, and the wordline is driven for the duration of the single clock cycle. In yet another embodiment of the present aspect, all memory banks are precharged in an inactive system clock cycle following an active system clock cycle.
In a third aspect, the present invention provides an interface circuit for controlling an embedded DRAM having column decoders, row decoders, bitline sense amplifiers and a row timing circuit for activating the row decoders and the bitline sense amplifiers. The interface circuit includes a row timing emulator, a margin delay circuit, a page mode control circuit, and a clock combiner. The row timing emulator receives an activation clock signal synchronized to a system clock for generating an emulated row timing signal. The margin delay circuit receives the emulated row timing signal for generating a column clock signal for activating the column decoders, and a precharge clock signal. The page mode control circuit receives a page mode signal, the system clock and the activation clock signal for generating a page mode precharge clock signal. The clock combiner receives the page mode signal for activating the row timing circuit in response to one of a fast activation clock signal, the precharge clock signal, and the page mode precharge clock signal.
In alternate embodiments of the present aspect, the row timing emulator is substantially identical to the row timing circuit, the margin delay circuit includes programmable delay circuits for delaying generation of the column clock signal and the precharge clock signal, and the programmable delay circuits are configurable by registers. In another embodiment of the present aspect, the page mode control circuit includes programmable delay circuits for delaying generation of the page mode precharge clock signal. In yet another embodiment of the present aspect, the margin delay circuit generates the column clock signal in response to the emulated row timing signal after a first delay in a write operation and a second delay in a read operation, where the first delay is shorter than the second delay.
In another embodiment of the present aspect, the page mode control circuit is disabled when the page mode control signal is inactive and the page mode control circuit generates the page mode precharge clock signal when the page mode control signal is active and the memory access signal is inactive following an operation where the memory access signal is active. In yet another embodiment of the present aspect, the clock combiner generates the row clock signal in response to the precharge clock signal when the page mode control signal is inactive and the clock combiner generates the row clock signal in response to the page mode precharge clock signal when the page mode control signal is active.
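The page mode behaviour described in these embodiments can be summarized with a small behavioural model. It is a sketch of the stated conditions only, not of the circuit: the function names, the boolean arguments and the string labels are hypothetical, and each signal is treated as a simple true or false value sampled once per system clock cycle.

```python
def page_mode_precharge_fires(page_mode: bool, access_prev: bool, access_now: bool) -> bool:
    """Page mode control circuit: disabled when page mode is inactive;
    otherwise fires when the memory access signal is inactive in a cycle
    that follows a cycle in which it was active."""
    return page_mode and access_prev and not access_now


def row_clock_precharge_source(page_mode: bool) -> str:
    """Clock combiner: selects which precharge signal drives the row clock."""
    if page_mode:
        return "page_mode_precharge_clock"
    return "precharge_clock"
```

In this model, with page mode active, back-to-back accesses produce no page mode precharge, and the first idle cycle after an access produces one.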
In a fourth aspect, the present invention provides a method for customizing a row timing emulator and margin delay circuit of an interface circuit for controlling an embedded DRAM having a row timing circuit for activating row decoders and bitline sense amplifiers, and column decoders for accessing the bitlines. The method includes determining an optimum bitline sense amplifier activation delay time for activating the bitline sense amplifiers after row addresses are latched, determining an optimum column decoder activation delay time for activating the column decoders after the bitline sense amplifiers are activated, programming delay elements of the row timing circuit for activating the bitline sense amplifiers at the optimum bitline sense amplifier activation delay time, programming delay elements of the row timing emulator for generating an emulated row timing signal at the optimum bitline sense amplifier activation delay time, and programming delay elements of the margin delay circuit for activating the column decoders at the optimum column decoder activation delay time after receiving the emulated row timing signal.
In an alternate embodiment of the present aspect, the step of determining an optimum column decoder activation delay time includes determining the optimum column decoder activation delay times for read operations and write operations, where the margin delay circuit is programmed to activate the column decoders at the optimum column decoder activation delay times for read and write operations.
Other aspects and features of the present invention will become apparent to those ordinarily skilled in the art upon review of the following description of specific embodiments of the invention in conjunction with the accompanying figures.