A thread is an independent block of executable instructions that can be processed in duplicate and in parallel. Software may be developed so that it is threaded, meaning that multiple instances of its instructions may execute concurrently. Parallel processing is particularly useful when a threaded application performs a large amount of independent processing or processes a large amount of data. Parallel processing techniques permit more efficient use of processors and memory and provide improved processing throughput in multi-processor architectures.
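As a minimal sketch of a block of instructions processed in duplicate and in parallel, the example below runs one function over several independent data units, one per worker thread. The function `brighten`, the helper `process_tiles`, and the tile data are hypothetical names chosen only for illustration. In CPython the global interpreter lock may keep CPU-bound threads from running truly simultaneously, so the sketch shows the structure rather than a measured speedup.

```python
from concurrent.futures import ThreadPoolExecutor


def brighten(tile):
    """Hypothetical per-tile work: add 10 to each pixel, clamped to 255."""
    return [min(pixel + 10, 255) for pixel in tile]


def process_tiles(tiles, workers=4):
    # Each tile is independent of the others, so the same instructions
    # can run as several duplicate threads, one tile per thread.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in the same order as the input tiles.
        return list(pool.map(brighten, tiles))


if __name__ == "__main__":
    print(process_tiles([[0, 250], [100, 200]]))
```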
Example areas in which parallel processing has been particularly beneficial and widely deployed within the industry include graphics and multimedia processing. These areas generally consume large volumes of data, and much of that data can be independently processed or independently rendered to generate the desired output.
Typically, threaded applications rely on a module that controls the processing flow of multiple threads, which may be processing simultaneously. That is, the module determines when some threads have finished their processing and when certain other threads should be initiated for processing. The module is tightly coupled to the threaded applications that it manages. This means that the module retains processing logic in order to identify and communicate with the threads that it is managing. Thus, if other circumstances within a multi-processor environment alter the processing location of a particular thread, the module must include processing logic to be notified of the changed location. As a result, the module may become unwieldy and may need regular adjustments and maintenance in order to efficiently manage its threaded applications within a multi-processor environment.
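The coupling described above can be illustrated with a minimal sketch of such a controlling module. The class `CentralController` and its methods are hypothetical and are not taken from any particular system; the "processor" value is bookkeeping only and does not pin a thread to real hardware. The point of the sketch is that the controller must hold the dependency and location state for every thread, and must be explicitly told whenever a thread moves.

```python
import threading


class CentralController:
    """Hypothetical manager that knows every thread it controls."""

    def __init__(self):
        self._lock = threading.Lock()
        self._work = {}        # thread name -> callable to execute
        self._waits_for = {}   # thread name -> names it still waits on
        self._location = {}    # thread name -> processor id (bookkeeping)
        self._started = []     # threading.Thread objects launched so far

    def register(self, name, work, processor, waits_for=()):
        self._work[name] = work
        self._waits_for[name] = set(waits_for)
        self._location[name] = processor

    def relocate(self, name, processor):
        # The controller has to be notified of every move; this is the
        # maintenance burden that comes with tight coupling.
        with self._lock:
            self._location[name] = processor

    def location_of(self, name):
        with self._lock:
            return self._location[name]

    def _launch_ready(self):
        # Caller must hold the lock. Start every thread whose
        # dependencies have all finished.
        ready = [n for n, deps in self._waits_for.items() if not deps]
        for name in ready:
            del self._waits_for[name]
            thread = threading.Thread(target=self._run, args=(name,))
            self._started.append(thread)
            thread.start()

    def _run(self, name):
        self._work[name]()
        with self._lock:
            # Mark this thread finished and initiate any threads it unblocks.
            for deps in self._waits_for.values():
                deps.discard(name)
            self._launch_ready()

    def run_all(self):
        with self._lock:
            self._launch_ready()
        joined = 0
        while True:
            with self._lock:
                if joined >= len(self._started):
                    break
                thread = self._started[joined]
            thread.join()
            joined += 1


if __name__ == "__main__":
    order = []
    controller = CentralController()
    controller.register("decode", lambda: order.append("decode"), processor=0)
    controller.register("filter", lambda: order.append("filter"),
                        processor=1, waits_for=["decode"])
    controller.relocate("filter", 2)
    controller.run_all()
    print(order, controller.location_of("filter"))
```

Every new kind of event (a move, a new dependency, a failure) would need another method on this one class, which is how such a module may grow difficult to maintain.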
Additionally, in many applications (e.g., graphics, multimedia, digital signal processing, numerical computations, physical modeling, artificial intelligence, etc.) there may be inherent data dependencies that limit the amount of achievable thread parallelism. This is particularly a problem for some multimedia applications, where data units may be large and only a small number of independent threads may be found because of the volume of identified data dependencies. These data unit dependencies may also be multi-dimensional and complex, such that conventional parallel processing techniques offer little improvement in processing throughput, because only a few independent threads can be identified and processed in parallel. In fact, the high overhead and cost associated with thread communication are common problems in present parallel architectures and approaches.
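To make the effect of data dependencies concrete, the following sketch groups the blocks of a two-dimensional grid into "waves" of blocks that could run in parallel. The dependency pattern is an assumption chosen for illustration only: each block is taken to depend on its left and upper neighbours. The function name `wavefronts` is likewise hypothetical.

```python
def wavefronts(rows, cols):
    """Group grid blocks into waves with no dependencies inside a wave.

    Assumed (hypothetical) dependency pattern: block (r, c) needs its
    left neighbour (r, c - 1) and its upper neighbour (r - 1, c).
    """
    level = {}
    # Row-major order guarantees both neighbours are levelled first.
    for r in range(rows):
        for c in range(cols):
            deps = [(r, c - 1), (r - 1, c)]
            prior = [level[d] for d in deps if d in level]
            # A block can start one wave after its latest dependency.
            level[(r, c)] = 1 + max(prior) if prior else 0

    waves = {}
    for block, k in level.items():
        waves.setdefault(k, []).append(block)
    return [waves[k] for k in sorted(waves)]


if __name__ == "__main__":
    for k, wave in enumerate(wavefronts(3, 4)):
        print(k, wave)
```

Under this assumed pattern, a 3-by-4 grid of twelve blocks yields six waves of sizes 1, 2, 3, 3, 2 and 1, so at most three blocks can ever be processed at the same time regardless of how many processors are available.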
Therefore, there is a need for improved thread-to-thread communication for parallel processing techniques.