Parallel processor architectures are commonly used to perform a wide array of different computational algorithms. An example of an algorithm that is commonly performed using such architectures is a scan operation (e.g. “all-prefix-sums” operation, etc.). One such scan operation is defined in Table 1.
TABLE 1
[I, a0, (a0 ⊕ a1), . . . , (a0 ⊕ a1 ⊕ . . . ⊕ an−2)]
Specifically, given an array [a0, a1, . . . , an-1] and an operator “⊕” for which “I” is an identity element, the array of Table 1 is returned. For example, if the operator “⊕” is an addition operator, performing the scan operation on the array [3 1 7 0 4 1 6 3] would return [0 3 4 11 11 15 16 22], and so forth. While an addition operator is set forth in the above example, such operator may be any associative operator of two operands.
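By way of illustration only, the exclusive scan described above may be sketched sequentially in Python. The function name exclusive_scan and its parameters op and identity are hypothetical names chosen for this sketch. It is a plain sequential reference that shows what result is expected, not a parallel implementation.

```python
from operator import add


def exclusive_scan(values, op=add, identity=0):
    """Sequential reference for an exclusive scan (hypothetical helper).

    `op` is assumed to be an associative operator of two operands and
    `identity` its identity element ("I" in the text).
    """
    result = []
    running = identity
    for v in values:
        # Record the combination of all elements before the current one.
        result.append(running)
        running = op(running, v)
    return result


print(exclusive_scan([3, 1, 7, 0, 4, 1, 6, 3]))
# prints [0, 3, 4, 11, 11, 15, 16, 22]
```

Passing a different associative operator together with its identity element (for example, multiplication with identity 1) yields the corresponding scan under that operator.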
Furthermore, the scan operation may be an exclusive scan operation (as shown in Table 1) or an inclusive scan operation. The exclusive scan refers to a scan where each element j of a result is the sum of all elements up to, but not including, element j in an input array. On the other hand, in an inclusive scan, each element j of the result is the sum of all elements up to and including element j.
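The difference between the two variants may be illustrated with a short sequential Python sketch. The names inclusive_scan and exclusive_from_inclusive are hypothetical and used only for this illustration. The inclusive scan relies on the standard library function itertools.accumulate, and the exclusive result is obtained here by shifting the inclusive result right by one position and inserting the identity element at the front.

```python
from itertools import accumulate
from operator import add


def inclusive_scan(values, op=add):
    """Element j is the combination of all inputs up to and including j."""
    return list(accumulate(values, op))


def exclusive_from_inclusive(values, op=add, identity=0):
    """Element j is the combination of all inputs before j.

    Built by prepending the identity element to the inclusive result and
    dropping the final element (hypothetical helper for illustration).
    """
    values = list(values)
    shifted = [identity] + inclusive_scan(values, op)
    return shifted[:len(values)]


data = [3, 1, 7, 0, 4, 1, 6, 3]
print(inclusive_scan(data))            # prints [3, 4, 11, 11, 15, 16, 22, 25]
print(exclusive_from_inclusive(data))  # prints [0, 3, 4, 11, 11, 15, 16, 22]
```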
To date, there is a continued need to more efficiently perform computational algorithms such as scan operations using parallel processor architectures.