1. Field of the Invention
This invention relates to the field of multithreaded processing techniques. Specifically, this invention is a method, apparatus, and computer program product for the execution of a thread-unaware routine that implements an operation and accesses an operation-dependent temporary data structure in a multithreaded computing environment.
Many computing environments are multi-threaded. Such environments simplify the development of programs that have multiple simultaneous independent sequences of control in both single- and multi-processor systems. One difficulty with developing such programs is that multiple threads may simultaneously execute a routine that uses an operation-dependent temporary data structure to perform an operation. The prior art routine coordinates the creation of the operation-dependent temporary data structure among the threads that invoke the routine. The prior art routine also monitors and controls the execution of the threads that access to the temporary data structure. This explicit thread control within the routine increases the complexity of the routine, limits the number of simultaneous threads that invoke the routine, and increases the routine""s overhead.
One use of multiple threads, is to partition a problem domain and then to assign each partition to a thread for processing. For example, this technique can be used by an image processing program to partition the image data of an image (each partition being termed a tile) and operating on each tile by its assigned thread. In this example, each thread invokes a routine that uses the same operation-dependent temporary data structure to process its assigned tile. The result of processing all the tiles is the same as if the complete image data was processed by a single thread except that the multiple threads allow concurrent processing. An example of such an operation in the image processing problem domain is graphically represented as shown in FIG. 1. One problem in the prior art is that as each thread invokes the routine, the routine must determine which thread has invoked it and if the operation-dependent temporary data structure has already been allocated for that thread.
FIG. 1 illustrates a xe2x80x98LUT image processing operation and dataxe2x80x99 graph, indicated by general reference character 100. The xe2x80x98LUT image processing operation and dataxe2x80x99 graph 100 is a directed acyclic graph (DAG) representing an operation performed on a source image 101 and resulting in a destination image 103. In this example the operation is a xe2x80x98LUT image processing operationxe2x80x99 105 that uses a xe2x80x98lookup tablexe2x80x99 107. The xe2x80x98lookup tablexe2x80x99 107 is an operation-dependent temporary data structure. The xe2x80x98lookup tablexe2x80x99 107 is temporary in that the xe2x80x98LUT image processing operationxe2x80x99 105 creates the table in memory (from other data accessible to the operation) to optimize performance of the LUT operation. Once the LUT operation is completed, the xe2x80x98lookup tablexe2x80x99 107 can be deleted. The xe2x80x98LUT image processing operationxe2x80x99 105 can be thread-aware to allow concurrent processing of tiles in the source image 101. To perform the same operation on each tile, each execution accesses the same operation-dependent temporary data structure. The xe2x80x98LUT image processing operationxe2x80x99 105 is thread aware in that it detects its first invocation by a thread and creates the operation-dependent temporary data structure. The xe2x80x98LUT image processing operationxe2x80x99 105 also detects subsequent thread invocations and uses the previously created operation-dependent temporary data structure. If the xe2x80x98LUT image processing operationxe2x80x99 105 routine is also called to process another image while the source image 101 is being processed it generally must create another operation-dependent temporary data structure for that image. Thus, to allow multiple images to be processed, the xe2x80x98LUT image processing operationxe2x80x99 105 must also have the capability to associate the appropriate operation-dependent temporary data structure with each thread. Thus, during execution of the routine by each thread, the routine must account for, and manipulate, the threads used to parallelize its execution; and associate the operation-dependent temporary data structure with the thread. These overheads reduce the routine""s performance. One skilled in the art will understand that temporary data structures can also be used by other graphical or non-graphical operations.
Another drawback of using thread-aware routines which access operation-dependent temporary data structures is that including the thread-aware programmed logic complicates the routine""s programming. For example, detecting when all threads have completed processing (so as to be able to determine when to delete the operation-dependent temporary data structure) increases the routine""s complexity. In addition, the number of threads supported by the routine limits the number of threads available to parallelize execution of the operation.
For these reasons, it would be advantageous to provide a mechanism that enables a thread-unaware routine (that accesses an operation-dependent temporary data structure to perform its operation) to be executed by a plurality of threads.
The present invention provides an apparatus, method and computer program product for invoking a thread-unaware routine in a multi-threaded environment. A thread-unaware routine is a routine that does not contain programmed logic to manage or manipulate threads in preparation for an access to operation-dependent temporary data structure used by the routine. One aspect of the invention includes a computer controlled method for invoking an operation routine from within a multi-threaded computing environment. The operation routine implements an operation and accesses an operation-dependent temporary data structure. The operation routine is executed by a plurality of threads. The computer controlled method invokes a pre-process routine that prepares the operation-dependent temporary data structure. The computer controlled method also invokes the operation routine by one or more of the plurality of threads to perform the operation using the operation-dependent temporary data structure. The operation routine is thread-unaware.
Another aspect of the invention includes an apparatus for invoking an operation routine from within a multi-threaded computing environment. The apparatus has a central processing unit (CPU) and a memory coupled to the CPU. The operation routine implements an operation, accesses an operation-dependent temporary data structure and is executed by a plurality of threads. The apparatus includes a pre-process invocation mechanism that is configured to invoke a pre-process routine. The pre-process routine prepares the operation-dependent temporary data structure for access by the operation routine. The apparatus also includes an operation invocation mechanism that is configured to invoke the operation routine by one or more of the plurality of threads to perform the operation using the operation-dependent temporary data structure allocated by the pre-process invocation mechanism. The operation routine is thread-unaware.
Yet a further aspect of the invention is a computer program product that includes computer readable code, embedded in a computer usable storage medium, for causing a computer to invoke an operation routine. The operation routine implements an operation and accesses an operation-dependent temporary data structure, from within a multi-threaded computing environment. The operation routine is executed by a plurality of threads. When executed on a computer, the computer readable code causes a computer to effect a pre-process invocation mechanism and an operation invocation mechanism. Each of these mechanisms have the same functions as the corresponding mechanisms for the previously described apparatus.
These and other features of the invention will become apparent when the following detailed description is read in combination with the accompanying figures.