1. Field of the Invention
The present invention relates generally to the field of integrated circuits.
2. Description of the Related Art
As faster microprocessors become available, data processing systems operate at higher speeds, requiring faster bus frequencies and faster and larger memory devices. Currently existing memory devices, such as Static Random Access Memory devices (SRAMs), operate with buses that transfer data at a frequency of 66 megahertz (MHz). However, as the need grows for faster bus frequencies of 100 MHz and beyond, existing memory devices cannot be accessed fast enough to keep up with these faster bus frequencies.
To speed up the access time of these memory devices, the AC timing requirements of SRAMs have been improved, specifically the read and write parameters. The SRAM devices available today have evolved from the generic Asynchronous SRAM device to specialty SRAMs designed for particular applications. Specialty SRAMs include the Synchronous Burst SRAM and the Synchronous Pipelined Burst SRAM. By adding performance-enhancing features to the generic SRAM, such as self-timed writes, burst counters, and output registers, specialty SRAM devices are designed to operate in faster environments, improving the overall system performance.
Although the improved SRAM architecture has enabled these memory devices to interface with buses having frequencies up to 66 MHz, these improvements alone do not achieve performance levels of 100 MHz and beyond. Designing an SRAM device that can interface with faster bus frequencies at these levels has proven to be a difficult task, especially over process and user corners (Vcc, temperature, etc.). Therefore, other design considerations such as board lay-out and routing, on-chip clock skew and process variations also need to be taken into account to improve the performance level of the overall system design.
The overall system performance can be improved by providing a board-level clock distribution scheme. FIG. 1 is an illustration of a Personal Computer (PC) motherboard 100 having an improved clock distribution scheme that uses a dedicated clock synthesis device 13 to generate several synchronized system clocks, which are distributed to the different components on the PC motherboard 100. The crystal 14 generates a frequency for the clock synthesis device 13. The clock synthesis device 13 provides the CPU 10 a system clock signal over line 15a, the chipset 11 a system clock signal over line 15b, the SRAM devices 12a and 12b a system clock signal over line 15c, and the SRAM devices 12c and 12d a system clock signal over line 15d. The CPU 10 and the chipset 11 are coupled to the SRAM devices 12a-12d by lines 16a, 16b and 16c which carry the address, data and control signals, respectively, to provide read and write access to the SRAM devices 12a-12d.
The clock synthesis device 13, such as the CY2254 manufactured by Cypress Semiconductor, guarantees a certain clock skew between the different clock signals generated by the clock synthesis device 13. For example, the clock skew between the system clock signals sent over lines 15a, 15b, 15c and 15d is 250 picoseconds (ps). A PC motherboard operating at 66 MHz can function properly with the 250 ps clock skew. For example, when the SRAM is operating at 66 MHz, the write set-up timing requirement is 2.5 nanoseconds (ns) and the write hold timing requirement is 0.5 ns, both with respect to the rising edge of the clock signal. When the system clock signal distributed to one of the SRAMs has a clock skew of 250 ps or 0.25 ns, the write hold-time period of 0.5 ns is reduced by half (0.5 ns - 0.25 ns = 0.25 ns). Although a 250 ps skew is adequate to guarantee system timing at 66 MHz, it is not sufficient to guarantee the system timing at a higher frequency (such as 100 MHz).
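The hold-time arithmetic above can be reproduced with a short Python sketch. The function names are hypothetical and chosen only for illustration; the numeric values are the 66 MHz figures quoted in the text. No set-up or hold requirements are assumed for 100 MHz, because none are stated here; only the clock period is computed for comparison.

```python
def clock_period_ns(freq_mhz: float) -> float:
    """Clock period in nanoseconds for a bus frequency given in MHz."""
    return 1000.0 / freq_mhz


def remaining_margin_ns(window_ns: float, skew_ns: float) -> float:
    """Timing window left after a worst-case clock skew is subtracted."""
    return window_ns - skew_ns


# Values quoted in the text for an SRAM operating at 66 MHz.
WRITE_SETUP_NS = 2.5
WRITE_HOLD_NS = 0.5
CLOCK_SKEW_NS = 0.25  # 250 ps guaranteed by the clock synthesis device

hold_left = remaining_margin_ns(WRITE_HOLD_NS, CLOCK_SKEW_NS)
setup_left = remaining_margin_ns(WRITE_SETUP_NS, CLOCK_SKEW_NS)

print(f"hold margin after skew:  {hold_left} ns")
print(f"setup margin after skew: {setup_left} ns")
print(f"period at 66 MHz:  {clock_period_ns(66):.2f} ns")
print(f"period at 100 MHz: {clock_period_ns(100):.2f} ns")
```

The same 250 ps of skew takes up a larger share of the shorter 100 MHz period, which is consistent with the observation that a skew acceptable at 66 MHz may not be acceptable at higher frequencies.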
The clock synthesis solution discussed above guarantees a clock skew of about 250 ps between the system clocks distributed over the various lines 15a-15d; however, it does not take into account the process variations of an integrated circuit or the on-chip clock skew at each input and output pin of each integrated circuit. Furthermore, the clock synthesis solution does not compensate for the board-level clock skew associated with each integrated circuit on the board. Therefore, other methods of improving the overall system performance must be considered in achieving faster bus frequencies.
The clock distribution of the overall system (i.e. PC motherboard 100) can be further improved by using a programmable clock skew buffer, such as the CY7B991 device manufactured by Cypress Semiconductor, in place of the clock synthesis device 13. This solution compensates for the routing delays associated with the various integrated circuits in the system, and therefore takes into account the clock skew associated with the board lay-out. By using the programmable clock skew buffer, the system clock signal that is distributed to the various components in the system can be adjusted to compensate for some of the routing delays. In other words, the various clock signals distributed over lines 15a-15d can be adjusted by advancing or delaying the system clock signal to offset some of the board-level skew.
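The idea of advancing or delaying each clock output to offset routing delay can be sketched in Python as follows. This is a simplified model and not the behavior of the CY7B991 or any particular device: the routing delays, the adjustment step size, and the function name `compensate` are all hypothetical values chosen for illustration.

```python
def compensate(routing_delay_ps, step_ps):
    """For each clock line, pick an offset (a multiple of step_ps) that moves
    its arrival time toward the midpoint of the uncompensated arrivals.

    Returns {line: (offset_ps, residual_ps)}; a negative offset advances the
    clock on that line and a positive offset delays it.
    """
    delays = routing_delay_ps.values()
    target = (max(delays) + min(delays)) / 2  # midpoint of the arrival spread
    result = {}
    for line, delay in routing_delay_ps.items():
        ideal = target - delay                     # exact correction needed
        offset = round(ideal / step_ps) * step_ps  # nearest available step
        residual = delay + offset - target         # skew left after adjustment
        result[line] = (offset, residual)
    return result


# Hypothetical board routing delays for lines 15a-15d, in picoseconds.
delays = {"15a": 300.0, "15b": 500.0, "15c": 900.0, "15d": 700.0}
adjusted = compensate(delays, step_ps=250.0)  # step size is an assumption

for line, (offset, residual) in adjusted.items():
    print(f"line {line}: offset {offset:+.0f} ps, residual {residual:+.0f} ps")
```

Because the adjustment is quantized in this model, some residual board-level skew remains, which matches the statement that only some of the routing delay is compensated.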
The drawback to using a programmable clock skew buffer, such as the CY7B991, is that it is expensive for specific applications such as personal computer applications. Furthermore, other variables such as process variations of an integrated circuit and on-chip clock skew at each input and output pin of the various integrated circuits in the system are not taken into account.
Another solution addresses the on-chip clock skew associated within an SRAM memory device. By changing the chip-level design to offset the effects of the on-chip clock skew, the margin for the SRAM read and write parameters is improved. In a typical SRAM device, there is a certain amount of clock skew across the device such that the circuit elements nearest to the input or output pads receive the clock input signal sooner than the circuit elements that are farthest away from the on-chip clock signal. The circuit elements may be the input or output buffers associated with each input or output pin of the SRAM device. To account for the clock skew between the different circuit elements, the SRAM device is designed to center the set-up and hold time parameters to the "middle of the road" scenario to minimize the overall effect of the on-chip skew. FIG. 2 is an illustration of the "middle of the road" approach.
FIG. 2 illustrates an overview of an SRAM device at the die level. FIG. 2 includes memory array 21 surrounded by a plurality of input and output pads. Input pads 20b-20l are coupled to pad 20a which receives the clock input signal. Associated with each input pad 20b-20l is a latch (not shown) that operates as a storage element for each pad.
Assuming there is a total of 400 ps clock skew across the SRAM device 22, the latches will be designed to the middle of the distribution (e.g., 200 ps). In other words, the latches are all designed based upon the center of the distribution, and the worst-case skew is +/-200 ps. Thus, referring back to FIG. 2, the latch for pad 20b receives an input clock signal that is advanced by 200 ps, the latch for pad 20g receives an input clock signal that has no skew, and the latch for pad 20l receives an input clock signal that is delayed by 200 ps, all with regard to the design center. All latches receive the same clock, but are designed as if they were all at the center of the distribution. Although this approach may reduce the overall effect of the on-chip skew on the set-up and hold time margins, a certain level of error due to the on-chip clock skew (approximately 50%) is still assumed. Thus, this solution alone does not enable the SRAM device to operate in a faster environment such as a system running at 100 MHz and beyond.
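The "middle of the road" numbers can be illustrated with a small Python sketch. It assumes, purely for illustration, that the 400 ps of on-chip skew is spread linearly across pads 20b through 20l in order; the text gives values only for pads 20b, 20g and 20l, so the intermediate values and the function name are hypothetical.

```python
def skew_vs_design_center(pads, total_skew_ps):
    """Per-pad clock skew relative to the center of the distribution,
    assuming (hypothetically) a linear spread from first pad to last.

    Negative values mean the clock arrives early (advanced); positive
    values mean it arrives late (delayed).
    """
    half = total_skew_ps / 2
    last = len(pads) - 1
    return {pad: -half + total_skew_ps * i / last for i, pad in enumerate(pads)}


pads = ["20" + letter for letter in "bcdefghijkl"]  # input pads 20b-20l
skews = skew_vs_design_center(pads, total_skew_ps=400.0)

worst_case_ps = max(abs(s) for s in skews.values())
uncorrected_fraction = worst_case_ps / 400.0

for pad in ("20b", "20g", "20l"):
    print(f"pad {pad}: {skews[pad]:+.0f} ps relative to design center")
print(f"worst-case error: {worst_case_ps:.0f} ps "
      f"({uncorrected_fraction:.0%} of total skew)")
```

Under this assumption the end pads still see a 200 ps error, half of the total skew, which corresponds to the approximately 50% residual error noted above.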
As discussed above, there are several skew components (i.e., board-level clock skew, on-chip clock skew, process variations and trace length mismatches of an integrated circuit, etc.) that affect the input and output timing of SRAM devices. None of the above-mentioned solutions addresses all of these variables in a single solution, and therefore none optimizes the AC timing parameters of an SRAM. Thus, it is desirable to take into account all skew components mentioned above by optimizing the timing of each input and output.