This invention relates to loop handling operations over an array of data items in a single instruction multiple datapath (SIMD) processor architecture.
Parallel processing is an efficient way of processing an array of data items. A SIMD processor is a parallel processor array architecture in which multiple datapaths are controlled by a single instruction. Each datapath handles one data item at a given time. In a simple example, in a SIMD processor having four datapaths, an array of eight data items would be processed by the four datapaths in two passes of a loop operation. The allocation between datapaths and data items may vary, but in one approach, in a first pass a first data item in the array is processed by a first datapath, a second data item is processed by a second datapath, a third data item is processed by a third datapath, and a fourth data item is processed by a fourth datapath. In a second pass, a fifth data item is processed by the first datapath, a sixth data item is processed by the second datapath, a seventh data item is processed by the third datapath, and an eighth data item is processed by the fourth datapath.
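The allocation in this simple example can be sketched in a few lines of Python. This is only an illustration of the one approach described above; the function name `pass_schedule` and the use of 0-based indices are assumptions made for the sketch and are not part of the processor.

```python
def pass_schedule(num_items, num_datapaths):
    """Return, for each loop pass, the 0-based item index handled by each datapath."""
    # Assumption: the array size is an exact multiple of the number of
    # datapaths, as in the eight-item, four-datapath example.
    if num_items % num_datapaths != 0:
        raise ValueError("this sketch covers only the exact-multiple case")
    num_passes = num_items // num_datapaths
    return [
        [p * num_datapaths + d for d in range(num_datapaths)]
        for p in range(num_passes)
    ]


schedule = pass_schedule(8, 4)
for p, items in enumerate(schedule):
    # Print 1-based item numbers to match the wording of the example.
    print(f"pass {p + 1}: items {[i + 1 for i in items]}")
```

Under this allocation, the datapath at position d handles item p * 4 + d in pass p, counting both from zero.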
Problems may occur when the number of data items in the array is not an integer multiple of the number of datapaths. For example, if the simple example above is modified so that there are four datapaths and an array having seven data items, then during the second pass the fourth datapath has no eighth data item to process. As a result, unless the fourth datapath is disabled during the second pass, it may erroneously write over some other data structure in memory.
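To make the overrun concrete, the following sketch lists every (pass, datapath) position that would fall past the end of the array. It assumes the same item-per-datapath allocation as the example above; the name `overrunning_slots` and the 0-based numbering are hypothetical choices for illustration.

```python
def overrunning_slots(num_items, num_datapaths):
    """List (pass, datapath) pairs, 0-based, that have no data item to process."""
    # Number of passes needed to cover the array, rounded up.
    num_passes = (num_items + num_datapaths - 1) // num_datapaths
    slots = []
    for p in range(num_passes):
        for d in range(num_datapaths):
            # Assumed allocation: datapath d handles item p * num_datapaths + d.
            if p * num_datapaths + d >= num_items:
                slots.append((p, d))
    return slots


print(overrunning_slots(7, 4))  # second pass, fourth datapath
print(overrunning_slots(8, 4))  # exact multiple: no overrun
```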
One way of avoiding such erroneous overwriting is to force the size of the array, i.e., the number of data items contained within the array, to be an integer multiple of the number of datapaths. Such an approach assumes that programmers have a priori control of how data items are allocated in the array, which they may not always have.
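As a rough illustration of this padding approach, the array size can be rounded up to the next integer multiple of the number of datapaths. The helper name `padded_size` is an assumption for this sketch.

```python
def padded_size(num_items, num_datapaths):
    """Round num_items up to the next integer multiple of num_datapaths."""
    return ((num_items + num_datapaths - 1) // num_datapaths) * num_datapaths


# A seven-item array on four datapaths would need one padding item.
print(padded_size(7, 4) - 7)
```

The padding items occupy memory and must be reserved when the array is allocated, which is why this approach depends on the programmer controlling the allocation.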
Typically, each datapath in a SIMD processor has an associated processor enable bit that controls whether that datapath is enabled or disabled. This allows a datapath to be disabled when, e.g., the datapath would otherwise overrun the array.
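The effect of the enable bits can be simulated in software. The sketch below is not a model of any particular hardware: memory is a plain Python list, the seven-item array is followed by one word belonging to some other data structure (the value 99), and `run_pass` is a hypothetical helper that applies an operation only in datapaths whose enable bit is set.

```python
def run_pass(memory, base, pass_index, enable_bits, op):
    """Apply op in place for one loop pass, skipping disabled datapaths."""
    num_datapaths = len(enable_bits)
    for d in range(num_datapaths):
        if enable_bits[d]:
            # Assumed allocation: one item per datapath per pass.
            addr = base + pass_index * num_datapaths + d
            memory[addr] = op(memory[addr])


# Seven data items followed by an unrelated word that must not be modified.
memory = [1, 2, 3, 4, 5, 6, 7, 99]
double = lambda x: x * 2

run_pass(memory, 0, 0, [True, True, True, True], double)
# Second pass: the fourth datapath is disabled so it cannot touch the 99.
run_pass(memory, 0, 1, [True, True, True, False], double)
print(memory)
```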
In a general aspect, the invention features a method of controlling whether to enable one of a plurality of processor datapaths in a SIMD processor that are operating on data items in an array, including determining whether to enable the datapath based on information about parameters of the SIMD processor and the array, and a processing state of the datapaths relative to the data items in the array.
In a preferred embodiment, the information includes an allocation between the data items and a memory, a total number of parallel loop passes in a loop processing operation being performed by the datapaths, a size of the array, and a number of datapaths (i.e., how many datapaths there are in the SIMD processor). The processing state is a number of remaining parallel passes of the datapaths in the loop processing operation.
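A minimal sketch of such a determination follows. It covers only one allocation (one item per datapath per pass, in array order) and assumes that the number of remaining parallel passes counts the pass currently being executed, so that it equals the total number of passes during the first pass. The function name and parameter names are hypothetical, and the sketch is not the claimed method itself.

```python
def datapath_enabled(datapath, remaining_passes, total_passes,
                     array_size, num_datapaths):
    """Decide whether a datapath (0-based) has a data item in the current pass."""
    # Assumption: remaining_passes includes the current pass.
    current_pass = total_passes - remaining_passes
    # Assumed allocation: item index grows by one per datapath within a pass.
    item_index = current_pass * num_datapaths + datapath
    return item_index < array_size


# Seven items, four datapaths, two passes in total.
for remaining in (2, 1):
    bits = [datapath_enabled(d, remaining, 2, 7, 4) for d in range(4)]
    print(f"remaining passes {remaining}: enable bits {bits}")
```

Other allocations between the data items and the memory would change only the computation of `item_index` in this sketch.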
The allocation between the data items and the memory may be unity-stride, contiguous or striped-stride.
In another aspect, the invention features a computer instruction including a loop handling instruction that specifies the enabling of one of a plurality of processor datapaths during processing of an array of data items.
In a preferred embodiment, the instruction includes a parallel count field that specifies the number of remaining parallel loop passes to process the array, and a serial count field that specifies the number of serial loop passes to process the array.
In another aspect, the invention features a processor including a register file and an arithmetic logic unit coupled to the register file, and a program control store that stores a loop handling instruction that causes the processor to enable one of a plurality of processor datapaths during processing of an array of data.
Embodiments of various aspects of the invention may have one or more of the following advantages.
Datapaths may be disabled without having prior knowledge of the number of data items in the array.
The method is readily extensible to a variety of memory allocation schemes.
The loop handling instruction saves instruction memory, because the many operations otherwise needed to determine whether to enable or disable a datapath are specified by a single instruction. The instruction also saves register space.
The loop handling instruction saves a programmer from having to force the number of data items in the array of data items to be an integer multiple of the number of datapaths.