As transistors continue to shrink, the number of transistors integrated per unit area has increased rapidly, and the power consumption of integrated circuits has become a design factor as significant as functionality and area. Dynamic voltage and frequency scaling (DVFS), a technique that aims to reduce circuit power consumption, has gradually become an important power-saving technique owing to its remarkable effect.
DVFS depends on monitoring the operating state and performance of the main circuit. A common system-level monitoring means is to use sensors. Although such a means can reflect the current operating condition of the system to some degree, off-chip monitoring depends on the accuracy of the sensors, and it is difficult to choose reliable monitoring points. Therefore, it is difficult for off-chip monitoring to reflect the actual condition of each part of a chip. Another method inserts critical units and replicas of critical paths into the chip. Although this method can faithfully reflect variations of global parameters in the chip, the replicas are not in the same on-chip environment as the original critical units and paths, so the method is not sensitive to variations of local parameters such as local noise and process fluctuations. Consequently, it is difficult to reflect the actual condition of the circuit, and the voltage scaling effect is severely compromised.
An on-chip monitoring method monitors the operating condition of the circuit in real time by inserting on-chip monitoring circuits at the ends of critical paths in the main circuit of the system, and reduces the impacts of process deviation, supply voltage fluctuation, temperature variation, noise, etc. to variations in the time-delay characteristics observed by the on-chip monitoring circuits on the critical paths. If the voltage drops below the level at which the circuit becomes error-prone, timing violations may occur in the on-chip logic. These timing violations are detected by the on-chip monitoring circuits, which generate corresponding error signals that serve as the basis on which an operating voltage control module regulates the voltage. An on-chip monitoring method can thus monitor the error level of the main circuit in real time during operation and reflect the actual impact of global and local disturbances on the circuit. By introducing an error correction mechanism into the method, the voltage margin reserved in the main circuit design stage to overcome the adverse impacts of process deviation, supply voltage fluctuation, temperature variation, ambient noise, etc. can be further released, so that the operating voltage can be regulated dynamically and the power consumption thereby optimized.
A DVFS technique based on on-chip monitoring reduces the operating conditions of a circuit (e.g., temperature, process, and noise variations) to timing variations of the circuit, and monitors those timing variations in real time by on-chip monitoring, so as to instruct the circuit to regulate its operating parameters dynamically. The voltage or frequency margin reserved at design time against the worst case can be reduced as far as possible, and the maximum power saving obtained, only once the lowest operating voltage that still meets the system performance requirement has been found.
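The regulation loop described above can be illustrated with a minimal sketch. It assumes a simple two-threshold policy: the supply voltage is lowered while the monitored timing-error rate stays below a lower threshold and raised when the rate exceeds an upper threshold. The function name, the thresholds, the step size, and the voltage limits are all hypothetical values chosen for illustration and are not taken from the text.

```python
def next_voltage_mv(v_now_mv, error_rate,
                    v_min_mv=600, v_max_mv=1200, step_mv=10,
                    low_threshold=0.001, high_threshold=0.01):
    """Return the next supply voltage in millivolts.

    error_rate is the fraction of recent clock cycles in which the
    on-chip monitors flagged a timing error. All default parameters
    are illustrative assumptions, not values from the source.
    """
    if error_rate > high_threshold:
        # Too many timing errors: give margin back by raising the voltage.
        v_next = v_now_mv + step_mv
    elif error_rate < low_threshold:
        # Few or no errors: margin remains, so scale the voltage down.
        v_next = v_now_mv - step_mv
    else:
        # Error rate is inside the target band: hold the voltage.
        v_next = v_now_mv
    # Keep the result inside the allowed operating range.
    return max(v_min_mv, min(v_max_mv, v_next))


print(next_voltage_mv(1000, 0.0))    # 990
print(next_voltage_mv(1000, 0.05))   # 1010
print(next_voltage_mv(1000, 0.005))  # 1000
```

Integer millivolts are used only to keep the arithmetic exact; a real controller would be constrained by the step resolution of its voltage regulator.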
There is a risk of system errors when the lowest voltage point is sought dynamically during system operation. Hence, an appropriate error recovery mechanism must be provided to help the system recover from an error state whenever a system error occurs. Two error recovery approaches are mainly used at present, both in China and abroad: local error recovery and global error recovery.
In local error recovery, when an on-chip monitoring unit in the circuit detects a timing error, the clock signal of the circuit is suspended for one cycle by means of clock gating, and the erroneous output signal is replaced with the correct signal during that cycle. All errors occurring in different stages of the pipeline in the same cycle can be recovered in one suspended clock cycle; for errors occurring in different cycles, however, the clock signal must be suspended immediately each time such an error occurs. An on-chip monitoring unit that supports this error recovery approach is complex in structure, and the power consumption of the monitoring unit itself is high. In addition, if circuit errors occur frequently owing to operating conditions such as operating voltage, frequency, and temperature, the CPU clock has to be suspended for one cycle for every clock cycle that involves errors in order to wait for error signal recovery. Consequently, the cost of error recovery is high, the system throughput is severely compromised, and the power saving effect is not satisfactory.
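The stall cost of local error recovery described above can be sketched as a simple count. The example assumes each detected timing error is recorded as a (clock cycle, pipeline stage) pair. This representation and the function name are hypothetical, and the model ignores the power overhead of the monitoring units themselves.

```python
def local_recovery_stall_cycles(error_events):
    """Count clock cycles lost to clock gating under local recovery.

    error_events: iterable of (clock_cycle, pipeline_stage) pairs, one per
    timing error flagged by a monitoring unit (an assumed representation).
    Errors in different stages during the same clock cycle share a single
    suspended cycle; errors in different cycles each cost one cycle.
    """
    cycles_with_errors = {cycle for cycle, _stage in error_events}
    return len(cycles_with_errors)


# Three errors in different stages, all in clock cycle 10: one stall.
print(local_recovery_stall_cycles([(10, 1), (10, 3), (10, 4)]))  # 1
# Three errors in three different clock cycles: three stalls.
print(local_recovery_stall_cycles([(10, 1), (11, 3), (15, 4)]))  # 3
```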
A global error recovery approach is usually used in pipelined architectures, and it also utilizes an on-chip monitoring unit. Unlike the local error recovery approach, this approach treats all errors occurring in the same cycle as a single error. When the on-chip monitoring unit detects a timing error, the error is not corrected immediately. Instead, the system waits until the error-free operations already in the pipeline have completed, i.e., until the operation that involves the error has advanced to the final stage of the pipeline, and then recovers from the error by executing the erroneous instruction again. When an erroneous instruction is executed again, the instructions following it are executed again as well. Hence, in the global error recovery approach, recovery from all errors in a pipeline cycle (consisting of the clock cycles required to completely fill the pipeline, i.e., N clock cycles, where N is the number of pipeline stages) can be completed in one recovery operation, and each such recovery operation consumes N clock cycles. If the system error ratio is high and multiple errors occur in the same pipeline cycle, all of these errors can be recovered in one global error recovery operation. Therefore, if the system error ratio is high, the global error recovery approach has a smaller impact on the system throughput and achieves a better power consumption reduction effect; if the system error ratio is low, however, the cost of recovery is high and the power consumption reduction effect is not significant.
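The amortization described above, where several errors in one pipeline cycle share a single N-cycle recovery, can be sketched as follows. As a simplifying assumption, pipeline cycles are modeled as fixed, aligned windows of N clock cycles; the function name and the list-of-cycle-indices input are hypothetical.

```python
def global_recovery_penalty_cycles(error_cycles, n_stages):
    """Estimate clock cycles spent on global error recovery.

    error_cycles: clock-cycle indices at which a timing error was flagged.
    n_stages: number of pipeline stages N.
    Assumes pipeline cycles are aligned windows of N clock cycles, and that
    each window containing at least one error triggers one recovery
    operation costing N clock cycles.
    """
    pipeline_cycles_with_errors = {cycle // n_stages for cycle in error_cycles}
    return len(pipeline_cycles_with_errors) * n_stages


# N = 5: three errors inside the same pipeline cycle share one recovery.
print(global_recovery_penalty_cycles([0, 1, 3], 5))   # 5
# N = 5: three errors in three different pipeline cycles need three recoveries.
print(global_recovery_penalty_cycles([0, 7, 12], 5))  # 15
```

Under this model the cost per error falls as more errors cluster in the same pipeline cycle, while an isolated error still pays the full N cycles.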
At present, error recovery in DVFS circuits relies solely on one of the approaches described above, which severely limits its applicability. For applications that require a wide frequency range and exhibit large variations in error ratio, it is difficult to attain optimal system throughput and power consumption with a single error recovery approach.