Parallel processing generally refers to dividing one or more computational tasks into two or more subtasks, each of which may be executed on a separate processor. In other words, each of two or more processors may be configured to execute one or more subtasks of a larger computational task. Through the use of such parallel processing techniques, the original computational task may, in many cases, be completed faster and more efficiently than would be possible using just one of the processors.
In practice, however, a number of obstacles may make it difficult or impossible to perform parallel processing of a given computational task, particularly for certain types or classes of computational tasks. For example, there is typically at least a minimal amount of computational overhead associated with parallel processing. In particular, for a given computational task that is to be executed in parallel, it may be necessary to copy some or all of the data related to the computational task to each of the processors to be used. More generally, non-trivial processing resources may initially be required to split or divide the original computational task into subtasks for parallel processing using two or more processors. Further, a delay or difficulty at any one of the processors executing in parallel may delay the computation of the task as a whole. Moreover, as the subtasks are completed at the two or more processors, computational resources may be required to join the results of the parallel processing performed at each processor, so as to obtain a unified computational result for the computational task as a whole. Thus, as a result of the computational overhead associated with the division, processing, and unification of subtasks, it may be impractical to utilize parallel processing techniques in many circumstances.
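The three sources of overhead described above (division, parallel execution, and unification) can be sketched as follows. This is a minimal illustration only, not a definitive implementation: the function names `split`, `subtask`, and `join`, the sum-of-squares workload, and the use of a thread pool as a stand-in for separate processors are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List


def split(data: List[int], n_parts: int) -> List[List[int]]:
    """Division overhead: cut the data into n_parts contiguous chunks."""
    size, extra = divmod(len(data), n_parts)
    chunks, start = [], 0
    for i in range(n_parts):
        end = start + size + (1 if i < extra else 0)
        chunks.append(data[start:end])  # slicing copies the data for each worker
        start = end
    return chunks


def subtask(chunk: List[int]) -> int:
    """Work done independently by one worker (hypothetical workload)."""
    return sum(x * x for x in chunk)


def join(partials: List[int]) -> int:
    """Unification overhead: combine partial results into one result."""
    return sum(partials)


def parallel_sum_of_squares(data: List[int], n_workers: int = 4) -> int:
    chunks = split(data, n_workers)
    # Threads stand in for separate processors here (an assumption of this sketch).
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Collecting every result means the whole task waits for the slowest worker.
        partials = list(pool.map(subtask, chunks))
    return join(partials)


data = list(range(1000))
result = parallel_sum_of_squares(data)
```

In this sketch, `split` and `join` do no useful computation toward the result itself; for small inputs their cost, together with the cost of starting the workers, may exceed any time saved by running `subtask` concurrently.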
For example, certain types of computational tasks may require a comparison or other consideration of each element of a relatively large dataset with each element of a relatively smaller dataset. As a specific illustration, a first dataset including 3 million records, each having 300 attributes, may need to be compared with each of 100 records of a second dataset (such as when, for example, it is desired to group each of the 3 million records into whichever of 100 clusters is deemed to be most similar). Consequently, such a calculation would require 3 million by 300 by 100 individual calculations, i.e., on the order of 90 billion. Moreover, it would not be feasible to divide the datasets for processing using separate processors, because the nature of the calculation is to compare all the records and attributes of the first, larger dataset with each and every element of the second, smaller dataset. Consequently, it may be impractical or impossible to obtain appreciable benefits from the use of parallel processing techniques in the context of these and other types of computations.
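The shape and cost of the calculation described above can be illustrated with a small sequential sketch. The similarity measure is not specified in the text; squared Euclidean distance is assumed here purely for illustration, and the names `assign_to_clusters` and `comparison_count` are hypothetical.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """One attribute-level calculation per pair of attributes (assumed measure)."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign_to_clusters(
    records: Sequence[Sequence[float]],
    centers: Sequence[Sequence[float]],
) -> List[int]:
    """Compare every record with every cluster record; keep the closest index."""
    assignments = []
    for record in records:
        best = min(
            range(len(centers)),
            key=lambda j: squared_distance(record, centers[j]),
        )
        assignments.append(best)
    return assignments


def comparison_count(n_records: int, n_attributes: int, n_clusters: int) -> int:
    """Number of attribute-level calculations for a full comparison."""
    return n_records * n_attributes * n_clusters


# Tiny stand-in data: 4 records with 2 attributes each, and 2 clusters.
records = [[0.0, 0.1], [5.0, 5.2], [0.2, -0.1], [4.8, 5.1]]
centers = [[0.0, 0.0], [5.0, 5.0]]
labels = assign_to_clusters(records, centers)  # [0, 1, 0, 1]

# The figures from the text: 3 million records, 300 attributes, 100 clusters.
total = comparison_count(3_000_000, 300, 100)  # 90_000_000_000
```

The nested structure makes the cost visible: the outer loop runs once per record of the larger dataset, the `min` call visits every record of the smaller dataset, and `squared_distance` touches every attribute.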