In recent years, the energy consumption of computers has taken a central role in the design of integrated circuits (ICs), and specifically microprocessors. Temperature is directly related to the energy dissipated, which in turn is directly proportional to the power consumed within a time interval. Temperature levels dictate cooling rates and packaging technology choices. For example, companies have recently been considering fluid cooling subsystems for their machines with high power dissipation. Other aspects related to power dissipation and temperature levels are the reliability of the IC and the cost of packaging.
The reliability of an IC depends on how much and how often the IC is heated. In an IC production line, ICs in the testing phase are subjected to certain stresses in order to identify the unreliable ones, which fall below a predefined quality standard, and exclude them. Temperature stresses are usually used to accelerate failures during testing. Many failure mechanisms can be accelerated by one of the following methods: Temperature Acceleration, Voltage and Current Acceleration, and Humidity Acceleration.
The Temperature Acceleration and the Voltage and Current Acceleration methods are the most relevant from a power dissipation perspective. In particular, the relation of reliability to temperature exhibits Arrhenius behaviour, and Voltage and Current Acceleration is important in relation to maximum power consumption. This is because power is the instantaneous product of voltage and current, and high current density is the main factor causing electro-migration in the metal lines of the IC, which results in IC failure. These failure mechanisms persist in normal operation as well.
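The Arrhenius behaviour mentioned above can be illustrated with a minimal sketch of the commonly used Arrhenius acceleration factor, AF = exp((Ea/k)(1/T_use - 1/T_stress)), with temperatures in kelvin. The function name, the activation energy of 0.7 eV and the two temperatures are assumptions chosen for illustration only; the actual activation energy depends on the failure mechanism being considered.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def arrhenius_acceleration(ea_ev, t_use_c, t_stress_c):
    """Return the Arrhenius acceleration factor between two temperatures.

    ea_ev      -- activation energy in eV (assumed, mechanism dependent)
    t_use_c    -- normal operating temperature in degrees Celsius
    t_stress_c -- elevated stress-test temperature in degrees Celsius
    """
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    exponent = (ea_ev / BOLTZMANN_EV_PER_K) * (1.0 / t_use_k - 1.0 / t_stress_k)
    return math.exp(exponent)


if __name__ == "__main__":
    # Hypothetical values: 0.7 eV, 55 C in use, 125 C under stress.
    print(arrhenius_acceleration(0.7, 55.0, 125.0))
```

A factor greater than one indicates that the stress temperature shortens the time to failure relative to the use temperature, which is why temperature stress can be used to accelerate testing.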
The relation between temperature and power dissipation depends on the packaging technology and the cooling system used with that package. If the selected package cannot dissipate the heat generated within the IC at an acceptable rate, a temperature rise is generally observed. Once the heat generated exceeds the rate at which the package can remove it, thermal runaway occurs in the IC and it permanently fails. If reaching that rate can be avoided, the reliability of the device increases and the package cost decreases as well. This is of particular importance for embedded microprocessor applications, where space and power constraints cannot be violated.
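One common first-order way to express the dependence of temperature on power and packaging is the steady-state thermal resistance model, T_j = T_a + P * theta_JA, where theta_JA is the junction-to-ambient thermal resistance of the package and cooling system. The sketch below assumes that model; the function names and all numeric values are hypothetical and are not taken from any particular package.

```python
def junction_temperature(power_w, theta_ja_c_per_w, ambient_c):
    """Steady-state junction temperature in degrees Celsius.

    Assumes the first-order model T_j = T_a + P * theta_JA.
    """
    return ambient_c + power_w * theta_ja_c_per_w


def max_sustainable_power(t_j_max_c, theta_ja_c_per_w, ambient_c):
    """Largest power (in watts) that keeps the junction at or below t_j_max_c."""
    return (t_j_max_c - ambient_c) / theta_ja_c_per_w


if __name__ == "__main__":
    # Hypothetical package: 5 C/W, 25 C ambient, 100 C junction limit.
    print(junction_temperature(10.0, 5.0, 25.0))
    print(max_sustainable_power(100.0, 5.0, 25.0))
```

Under this model, a package with a lower thermal resistance tolerates more power for the same junction temperature limit, which is the trade-off between power dissipation and package cost described above.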
Power consumption has leakage, static and dynamic components. The most prevalent one in CMOS technology is dynamic power consumption. There is no static power consumption in traditional CMOS circuits, but leakage is becoming increasingly important as CMOS process technology shrinks further.
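As a hedged illustration, the three components can be written as a sum, together with the widely used first-order approximation for the dynamic term. The symbols are assumptions introduced here and do not appear in the text above: \alpha is the switching activity factor, C the switched capacitance, V_{dd} the supply voltage and f the clock frequency.

```latex
P_{\text{total}} = P_{\text{dynamic}} + P_{\text{static}} + P_{\text{leakage}},
\qquad
P_{\text{dynamic}} \approx \alpha \, C \, V_{dd}^{2} \, f
```

The quadratic dependence on supply voltage and the linear dependence on frequency are what make voltage and frequency the usual hardware parameters adjusted to reduce dynamic power.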
Power consumption and/or dissipation are characterized by two metrics: peak power and average power. Average power affects energy consumption and depends to a great extent on the workload a machine is executing, while peak power has a very important effect on power delivery network design and reliability. If peak power exceeds a certain limit in an IC, the power delivery network will fail and be permanently damaged, causing functional failure of the IC.
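The two metrics can be illustrated with a minimal sketch that derives them from a power trace sampled at a fixed interval. The function name, the sample values and the sampling interval are hypothetical; energy is approximated as average power multiplied by the total observation time.

```python
def power_metrics(samples_w, interval_s):
    """Return (peak power, average power, energy) for a sampled power trace.

    samples_w  -- power samples in watts, taken at a fixed interval
    interval_s -- time between consecutive samples in seconds
    """
    if not samples_w:
        raise ValueError("at least one power sample is required")
    peak_w = max(samples_w)
    average_w = sum(samples_w) / len(samples_w)
    # Energy over the whole trace: average power times total duration.
    energy_j = average_w * interval_s * len(samples_w)
    return peak_w, average_w, energy_j


if __name__ == "__main__":
    # Hypothetical trace: three samples taken 0.5 s apart.
    print(power_metrics([10.0, 30.0, 20.0], 0.5))
```

Two workloads with the same average power, and therefore the same energy, can still differ in peak power, which is why the two metrics are considered separately.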
The IC with the highest power consumption in a computer is the microprocessor. Microprocessor design and fabrication rely mainly on CMOS technology. Since all applications are transformed into numbers, the microprocessor is constantly operating on and manipulating them. This is the main cause of the high power consumption of the microprocessor, and it also makes microprocessors the main number-crunching circuits in the whole machine.
To reduce power consumption and dissipation, two approaches are known: a hardware-based approach and a software-based approach. Certain software-based approaches to the power consumption and dissipation problem rely on thermal management methods that reduce power consumption at several levels of the software stack.
A thermal management solution to the power consumption problem is provided in US20060107262. US20060107262A1 describes a thread scheduling technique for a multi-core processor that relies on the compiler to classify each thread as complex/high power or simple/low power, and then schedules and distributes the threads to run on the different cores based on a criterion defined to reduce the power/thermal density. This solution is provided to reduce power consumption in a single multi-core processor and is not adapted for multi-threaded processors in which more than one thread runs on each core.
Another approach to the power consumption problem is the one taken in US20030126476. US20030126476 provides a solution for superscalar processors. However, this solution is not adapted to multi-threaded or multi-core processors.
US20050278520 provides another solution based on temperature sensors placed on each processor in a distributed processing system. The information from the temperature sensors is used to schedule tasks with high characteristic values to the lowest temperature processor. Such a solution is only adapted for clusters of processors and is not adapted for threads.
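The general idea of steering the most demanding tasks to the coolest processors can be sketched as follows. This is only a loose illustration and not the method of the cited application: the function name, the dictionaries and the simple sorted pairing (with round-robin reuse of processors when there are more tasks than processors) are all assumptions made for the example.

```python
def assign_tasks_to_cool_processors(task_values, processor_temps_c):
    """Pair the highest-valued tasks with the coolest processors.

    task_values       -- hypothetical mapping of task name to characteristic value
    processor_temps_c -- hypothetical mapping of processor name to sensor reading
    Returns a mapping of task name to processor name.
    """
    if not processor_temps_c:
        raise ValueError("at least one processor is required")
    # Most demanding tasks first, coolest processors first.
    tasks = sorted(task_values, key=task_values.get, reverse=True)
    processors = sorted(processor_temps_c, key=processor_temps_c.get)
    # Reuse processors in the same order if tasks outnumber processors.
    return {task: processors[i % len(processors)] for i, task in enumerate(tasks)}


if __name__ == "__main__":
    print(assign_tasks_to_cool_processors(
        {"a": 5, "b": 9, "c": 1},
        {"p0": 70.0, "p1": 50.0},
    ))
```

The sketch operates at the granularity of whole processors and whole tasks, which reflects the limitation noted above: it does not account for several threads sharing one core.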
Another solution, described in US20056948082, provides a method that allows a program to make parameter changes to accommodate the temperature profile based on a notification event. However, these parameter changes can affect the hardware configuration (frequency or voltage), resulting in worse performance. Moreover, this solution does not provide capabilities for maintaining performance via scheduling and is not adapted to multi-threaded or multi-core systems.
Still another solution to the power consumption problem is the one described in US20097596430. According to this solution, a set of temperature indices is requested for each core in a multi-core system. These indices are used to schedule the workload to one processor core instead of another. This solution mixes and matches different workloads to manage temperature and is not adapted to multi-threaded systems.
There is accordingly a need for a task scheduling system and method that thermally manages a multithreaded multi-core machine.