Power consumption is a limiting factor for the performance of high-performance computing (HPC) systems. Parallel processing can improve energy efficiency. However, many approaches based on scaling up symmetric multiprocessing (SMP) designs fail to scale energy efficiency and performance, owing to the overheads of complex cores and the expensive mechanisms used to maintain cache coherence.