Microprocessor vendors offer multi-core microprocessors that allow parallel or concurrent execution of code. Compilers providing parallelization functions transform code written for sequential execution into code capable of being executed in parallel by the different cores of the multi-core microprocessor. Compilers have been developed to parallelize code used for scientific applications. However, for non-scientific applications, programmers often rewrite the code and include directives to cause different sections of the code to be executed in parallel.
The process of the programmer modifying the code to include directives and statements for parallel execution is a complex, time-consuming, and error-prone task. For this reason, language extensions have been developed to assist the programmer in parallelizing code initially written for sequential execution. One technique developed for parallelizing code that accounts for issues such as loop-carried dependences, irregular memory accesses, and arbitrary control flow is the taskqueuing model. According to this model, the programmer inserts taskqueuing pragmas into the code, and the code outside the task pragmas is executed sequentially in one thread. When this thread encounters a task pragma, it enqueues the task into the task queue, including a copy of any captured variables. Other threads operate by dequeuing tasks from the queue and executing the part of the code inside the queued task pragma.
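The enqueue and dequeue behavior described above can be illustrated with a minimal sketch. It is written in Python with an explicit queue and threads rather than with taskqueuing pragmas, so it only approximates the model. The names Node, process, and run_taskqueue are hypothetical and are introduced solely for illustration. One thread walks a linked list sequentially (a loop-carried dependence through the next pointer) and enqueues one task per node with a copy of the captured value, while the other threads dequeue and execute the tasks.

```python
import queue
import threading
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """Hypothetical linked-list node; traversal depends on the previous node."""
    value: int
    next: Optional["Node"] = None


def process(value):
    """Hypothetical stand-in for the code inside a task pragma."""
    return value * value


def run_taskqueue(head, num_workers=3):
    """Illustrative only: one enqueuing thread, several dequeuing threads."""
    tasks = queue.Queue()
    results = {}
    results_lock = threading.Lock()

    def worker():
        # Worker threads dequeue tasks and execute the task body.
        while True:
            task = tasks.get()
            if task is None:  # sentinel: no more tasks will arrive
                return
            position, captured_value = task
            outcome = process(captured_value)
            with results_lock:
                results[position] = outcome

    workers = [threading.Thread(target=worker) for _ in range(num_workers)]
    for w in workers:
        w.start()

    # Code outside the tasks runs sequentially in this one thread.
    node, position = head, 0
    while node is not None:
        # Enqueue the task together with a copy of the captured variables.
        tasks.put((position, node.value))
        node = node.next
        position += 1

    for _ in workers:
        tasks.put(None)  # one sentinel per worker
    for w in workers:
        w.join()
    return [results[i] for i in range(position)]


if __name__ == "__main__":
    head = Node(1, Node(2, Node(3, Node(4))))
    print(run_taskqueue(head))  # prints [1, 4, 9, 16]
```

Copying the captured value at enqueue time means a task does not observe later changes made by the sequential thread as it advances through the list, which mirrors the role of captured variables in the model.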
Although the taskqueuing model and programming extensions enable programmers to add parallelism to application source code, identifying opportunities in the code for parallelism and correctly coding the parallel directives (e.g., deciding which variables are shared and which are privatized) still takes significant programmer time and effort. This burden is especially a problem for general applications due to their higher complexity, larger code size, and less regular nature.