Symmetric Multiprocessing (SMP) is common in many computer systems, including mobile electronic devices such as mobile phones and tablets. An SMP system is a computer hardware architecture in which multiple identical processors, sometimes called cores, are connected to a single shared main memory. Since the processors are identical, the system may run a single instance of an Operating System (OS) with a scheduler that schedules a number of tasks on the processors. If load balancing between the processors is disregarded, it does not matter which processor the scheduler schedules a task on, since the processors are identical.
Power consumption is a very important characteristic for mobile electronic devices, and different strategies to save power have therefore evolved. One example is Dynamic Voltage Frequency Scaling (DVFS), where the voltage and the frequency of a processor are changed dynamically at run-time. DVFS allows the system to decrease voltage and frequency in order to save power when less performance is needed, for example when the user of a mobile electronic device is only reading email, which is typically a task that does not require much performance from the system. Central Processing Unit (CPU) hotplug is another example of a power saving technique for SMP systems, in which a processor may be powered off completely.
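The DVFS idea described above can be illustrated with a minimal sketch. It assumes a hypothetical table of operating points (the frequency and voltage values are illustrative only and are not taken from any real processor) and a hypothetical helper, select_operating_point, that picks the slowest point still able to meet the requested performance. The power estimate relies on the common approximation that dynamic power scales with voltage squared times frequency.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatingPoint:
    """One frequency/voltage pair that the processor can run at."""
    freq_mhz: int
    voltage_v: float


# Hypothetical operating points; the values are illustrative assumptions.
OPERATING_POINTS = [
    OperatingPoint(200, 0.90),
    OperatingPoint(400, 1.00),
    OperatingPoint(800, 1.10),
    OperatingPoint(1200, 1.25),
]


def select_operating_point(required_mhz, points=OPERATING_POINTS):
    """Return the lowest-frequency point that meets the demand.

    If no point is fast enough, fall back to the fastest one available.
    """
    for point in sorted(points, key=lambda p: p.freq_mhz):
        if point.freq_mhz >= required_mhz:
            return point
    return max(points, key=lambda p: p.freq_mhz)


def relative_dynamic_power(point, reference):
    """Estimate dynamic power of `point` relative to `reference`.

    Uses the approximation P_dynamic ~ V^2 * f and ignores static leakage.
    """
    return (point.voltage_v ** 2 * point.freq_mhz) / (
        reference.voltage_v ** 2 * reference.freq_mhz
    )


if __name__ == "__main__":
    light = select_operating_point(150)    # e.g. reading email
    heavy = select_operating_point(1100)   # e.g. a demanding workload
    print(light, heavy, relative_dynamic_power(light, heavy))
```

Under these assumptions, a light workload is served at the lowest operating point, where the estimated dynamic power is a small fraction of the power at the highest point. A real governor would also account for transition latency and leakage, which this sketch leaves out.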
To meet the demand for even higher-performance mobile platforms while remaining power efficient when the user is performing tasks that require less performance, heterogeneous multi-core systems have been investigated. In such systems, high performance but less power efficient processors are paired with smaller, power efficient processors that deliver less performance. An example of such a system is ARM's big.LITTLE.
The first big.LITTLE system from ARM uses a “big” Cortex-A15 processor, which is a high performance processor, paired with a “LITTLE” Cortex-A7 processor, which is a power efficient processor. Both the smaller Cortex-A7 and the larger Cortex-A15 use the same instruction set, so binary code built for the Cortex-A7 may execute on the Cortex-A15 and vice versa. The number of Cortex-A15 processors and the number of Cortex-A7 processors used may vary. A common configuration may be two Cortex-A15 processors paired with two Cortex-A7 processors.
Even if the instruction set is the same for the different processors in a heterogeneous multi-core system, the micro-architecture may be very different. Examples of characteristics that often differ between the processors are instruction and data cache sizes, pipeline length, branch prediction characteristics, whether the processor can execute instructions out-of-order, and the number of other digital circuits in the processors. For example, a big processor might have several Arithmetic and Logic Units (ALUs) and Floating Point Units (FPUs), while the small processor might have only one of each. Another difference may be the number of entries in the Translation Lookaside Buffer (TLB).
Today many runtime environments use Just-in-Time (JIT) compilation, where code is compiled to native machine code at run-time just before it is executed on a processor. Examples of such environments are Google's Android for mobile devices, where all applications are compiled to native machine code at run-time by the Dalvik Virtual Machine (VM); most JavaScript implementations used in client-side web browsers such as Google Chrome or Firefox on PCs, laptops, and Android smartphones and tablets; and Microsoft's .NET Framework.
A strategy that a JIT compiler may use to balance the requirement for short compilation time against the desire to optimize the code so that it runs faster is to first compile all the code as quickly as possible, without spending much time on optimizations. Then, while the compiled code is executed, the binary is profiled to identify the hot parts of the code that are executed frequently, and these hot parts are compiled again, this time with more effort spent on optimization. An example of this is the latest version of Google's V8 JavaScript engine (http://blog.chromium.org/2010/12/new-crankshaft-for-v8.html), which is used in e.g. Android on products from ST-Ericsson, Qualcomm, TI, Samsung and other chipset vendors. Here JavaScript code is first compiled as quickly as possible without doing too many optimizations. Then the binary is profiled at run-time to identify which parts of the code are executed many times. These portions are often referred to as hot spots. The V8 JavaScript engine then performs a second compilation pass, and this time it spends extra time trying to optimize the hot spots.
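The two-pass strategy described above can be sketched in a few lines. This is an illustration only and does not reflect how V8 is implemented: the class TieredRuntime, the threshold value and both "compilers" are hypothetical stand-ins. The quick pass simply wraps the source text so that it is re-evaluated on every call, the optimizing pass uses Python's built-in compile() to do the expensive work once, and an invocation counter plays the role of the profiler.

```python
# Hypothetical number of calls after which a function counts as a hot spot.
HOT_THRESHOLD = 3


def baseline_compile(source):
    """Quick pass: no up-front work, the expression is re-parsed on each call."""
    return lambda x: eval(source, {"x": x})


def optimizing_compile(source):
    """Slow pass: compile once to a code object, then reuse it on each call."""
    code = compile(source, "<jit>", "eval")
    return lambda x: eval(code, {"x": x})


class TieredRuntime:
    """Runs functions with quick code first and recompiles the hot ones."""

    def __init__(self, quick, optimizing, hot_threshold=HOT_THRESHOLD):
        self._quick = quick
        self._optimizing = optimizing
        self._threshold = hot_threshold
        self._code = {}        # function name -> current callable
        self._counts = {}      # function name -> number of invocations
        self._optimized = set()

    def call(self, name, source, arg):
        # First pass: compile as quickly as possible on first use.
        if name not in self._code:
            self._code[name] = self._quick(source)
            self._counts[name] = 0
        # Profiling: count how often each function is executed.
        self._counts[name] += 1
        # Second pass: recompile hot spots with the optimizing compiler.
        if self._counts[name] >= self._threshold and name not in self._optimized:
            self._code[name] = self._optimizing(source)
            self._optimized.add(name)
        return self._code[name](arg)

    def is_optimized(self, name):
        return name in self._optimized


if __name__ == "__main__":
    runtime = TieredRuntime(baseline_compile, optimizing_compile)
    for value in range(5):
        print(runtime.call("square", "x * x", value), runtime.is_optimized("square"))
```

In this sketch a function that is called only once or twice never pays for the optimizing pass, while a frequently called function is recompiled once and then runs the optimized code on every later call.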