1. Field of the Invention
The present invention generally relates to computer systems, specifically to a method of storing values in latches of a computer system, and more particularly to latches which store state information for extended periods of time.
2. Description of the Related Art
The basic structure of a conventional processing unit 10 for a computer system is shown in FIG. 1. In this example processing unit 10 is a dual-core processor having two processor cores 12a and 12b which carry out program instructions in order to operate the computer. Processing unit 10 also includes a memory subsystem 14, a scan controller 16, and a JTAG interface 18. The computer system may be a symmetric multi-processor (SMP) computer which uses a plurality of processing units 10 that are generally identical, that is, they all use a common set or subset of instructions and protocols to operate, and generally have the same architecture. An exemplary processing unit includes the POWER5™ processor marketed by International Business Machines Corp. which comprises a single integrated circuit superscalar microprocessor.
Each processor core 12a, 12b has its own control logic 20a, 20b, separate sets of execution units 22a, 22b and registers/buffers 24a, 24b, respective first level (L1) caches 26a, 26b, and load/store units (LSUs) 28a, 28b. Execution units 22a, 22b include various arithmetic units such as fixed-point units and floating-point units, as well as instruction fetch units, branch units and instruction sequencer units. The processor cores may operate according to reduced instruction set computing (RISC) techniques, and may employ both pipelining and out-of-order execution of instructions to further improve the performance of the superscalar architecture. Registers 24a, 24b include general-purpose registers, special-purpose registers, and rename buffers. L1 caches 26a, 26b (which preferably comprise separate instruction and data caches for each core) and load/store units 28a, 28b communicate with memory subsystem 14 to read/write data from/to the memory hierarchy. Memory subsystem 14 may include a second level (L2) cache and a memory controller. Processing unit 10 may communicate with other components of the computer system (memory and various peripheral devices) via a system or fabric bus 30. To facilitate repair/replacement of defective processing units in the computer system, processing unit 10 may be constructed in the form of a replaceable circuit board or similar field replaceable unit (FRU), which can be easily installed in or swapped out of the computer system in a modular fashion.
Processor cores 12a, 12b and memory subsystem 14 (functional units) are clock-controlled components, while scan controller 16 and JTAG interface 18 are free-running components. JTAG interface 18 provides access between an external device such as a service processor and scan controller 16. JTAG interface 18 complies with the Institute of Electrical and Electronics Engineers (IEEE) standard 1149.1 pertaining to a test access port and boundary-scan architecture. Scan controller 16 uses a scan communications extension that is allowed by standard 1149.1. Scan controller 16 is connected to various sets of scan latches located in the clock-controlled components, three of which are shown in FIG. 1. Scan latches 32a and 32b are respectively located in the control logic 20a, 20b of cores 12a, 12b, while additional scan latches 32c are located in memory subsystem 14. Only three sets of scan latches are illustrated for simplicity, but there may be many more located throughout processing unit 10.
Scan controller 16 allows the service processor to access the scan latches while the components are still running, via JTAG interface 18. The scan latches on a given chip are connected in a ring fashion with scan controller 16. The scan latches may include internal control and error registers (along with mode and status registers) which can be used to enable and check various functions in the components. In this manner, the service processor can access any chip in the multi-processing system via JTAG interface 18 and access registers while the system is running, without interruption, to set modes, pulse controls, initiate interface alignment procedures, read status of fault indication registers, etc. Scan controller 16 carries out these functions by setting an internal command register and an internal data register. Assembly code running on a component, particularly in the processor cores 12a, 12b, can allow the cores to utilize scan features as well. Thus a core can read status bits of another component and control the logic anywhere on its own chip. Scan controller 16 includes appropriate logic to arbitrate between JTAG interface 18 and any assembly code commands from the two processor cores.
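By way of illustration only, the ring connection between the scan latches and the scan controller may be modeled in software as follows. This is a simplified sketch and not a description of the actual scan hardware: the function names read_ring and write_ring_bit are hypothetical, each latch is assumed to hold a single bit, and the command and data registers of the scan controller are not modeled. The sketch shows that rotating the ring through one full turn lets the controller observe or replace any bit while every other bit returns to its original position.

```python
from collections import deque


def read_ring(ring: deque) -> list:
    """Rotate the ring one full turn, recording each bit as it passes the controller."""
    observed = []
    for _ in range(len(ring)):
        bit = ring.popleft()   # bit arrives at the controller end of the ring
        observed.append(bit)
        ring.append(bit)       # controller feeds the bit back in unchanged
    return observed


def write_ring_bit(ring: deque, position: int, value: int) -> None:
    """Rotate the ring one full turn, substituting `value` at the given position."""
    for index in range(len(ring)):
        bit = ring.popleft()
        ring.append(value if index == position else bit)


# Hypothetical four-latch ring holding mode bits.
scan_ring = deque([1, 0, 1, 1])
snapshot = read_ring(scan_ring)      # contents observed, ring left unchanged
write_ring_bit(scan_ring, 1, 1)      # rescan a new value into latch 1
```

In this model a full rotation costs one shift per latch in the ring, whether one bit or every bit is being accessed.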
Information stored in scan latches usually includes mode configurations for clock control logic, and clock control latches can account for a significant fraction of the microprocessor latch count. Microprocessors typically use control logic in local clock buffers to adjust the duty cycle and edge stressing of various clock pulses in the system and thereby meet the requirements of the local logic circuits. These clock buffer modes are set at system power-on using scan controller 16, and often must maintain their logical value for days or months to ensure proper performance of the local logic circuits. However, these values can be upset during microprocessor operation, e.g., from stray radiation or electrostatic discharge. This upset is correctable by scanning a new value, but the system may only allow scanning at power-on, meaning that the system must be restarted if a clock control latch becomes incorrectly set.
Robust latches have been designed with error-correction circuitry to address this problem. The error-correction circuitry generally relies on redundancy, at either the latch level or the device (transistor) level. For example, the latch disclosed in U.S. Pat. No. 5,307,142 uses device level redundancy to achieve single event upset (SEU) immunity. That latch has cross-coupled inverters with voltage dividers that prevent the logic state of a single one of the inverters from changing. In U.S. Pat. No. 6,127,864, a temporally redundant latch samples logic data at multiple time-shifted periods to provide multiple (independent) data samples from which the correct data can be selected. That latch has three sampling circuits that sample the logic data at three different times. The circuit described in U.S. Pat. No. 6,504,411 uses triplicate latches and a majority voting circuit to provide resistance to SEUs. The majority voting circuit indicates a set state for the redundant latch circuit based upon a majority of the latches being in the set state, or otherwise indicates a reset state. A radiation resistant latch is disclosed in U.S. Pat. No. 6,826,090 which uses feedback circuitry to reinforce output signals of sublatches.
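The majority voting approach may be illustrated with a short software model. This is a sketch of the general triple-redundancy technique only, not the circuit of any of the cited patents; the function name majority_vote and the representation of each latch as an integer bit are assumptions made for illustration. The voter indicates a set state when at least two of the three latches are set, so a single upset latch is outvoted.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Return 1 if at least two of the three latch bits are 1, otherwise 0."""
    return (a & b) | (a & c) | (b & c)


# Three redundant latches all storing a logical 1.
latches = [1, 1, 1]

# A single event upset flips one latch.
latches[1] ^= 1

# The voter output still reflects the originally stored value.
voted = majority_vote(*latches)
```

Note that in this model the voter only masks the error at the output; the upset latch itself continues to hold the wrong value.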
These latch designs reduce, but do not eliminate, the problem of upsets. For instance, in a redundant latch structure with a majority voting circuit that holds a logical state for an extended period, it is possible to have two separate upsets over an extended time, i.e., two of the three latches being set to an incorrect value, which then generates an incorrect output at the voting circuit. As a related issue, full redundancy in latch designs may be too costly in terms of physical size (chip area), speed, and power consumption. In modern, leakage power-dominated designs, it becomes increasingly important to reduce or eliminate any unnecessary redundancies. It would, therefore, be desirable to devise an improved latch design having less overhead that could still ensure reliability in case of single event upsets. It would be further advantageous if the latch could correct multiple errors resulting from more than one upset over time.
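The double-upset failure described above may be illustrated with a brief software model. This is a sketch under stated assumptions: the names majority_vote and apply_upsets are hypothetical, each latch is modeled as an integer bit, and the model assumes that the voter masks errors at its output without rewriting the upset latch, so errors accumulate over time.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Return 1 if at least two of the three latch bits are 1, otherwise 0."""
    return (a & b) | (a & c) | (b & c)


def apply_upsets(latches: list, upset_indices: list) -> list:
    """Return a copy of the latch values with the indexed latches flipped."""
    flipped = list(latches)
    for index in upset_indices:
        flipped[index] ^= 1
    return flipped


# Three redundant latches are scanned to a logical 1 at power-on.
stored = [1, 1, 1]

# First upset: one latch flips, and the voter still reports the stored value.
after_first = apply_upsets(stored, [0])
output_after_first = majority_vote(*after_first)

# Second upset, possibly much later, strikes a different latch.
after_second = apply_upsets(after_first, [2])
output_after_second = majority_vote(*after_second)
```

With two of the three latches now holding the wrong value, the voter output in this model changes from 1 to 0, which corresponds to the incorrect output discussed above.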