The invention relates generally to the field of digital computers and more specifically to functional units for processing predetermined types of instructions. The invention particularly provides a circuit or functional unit for use in connection with execution of an instruction for rearranging bits of a data word in accordance with a mask.
Computers process data in accordance with instructions. One type of instruction which has been proposed is a so-called "sheep and goats" instruction, which accepts as operands a data word and a mask word and rearranges the bits of the data word in accordance with the mask word. In the rearranged data word, the bits of the data word in bit positions which correspond to bits of the mask which are clear, or have the value "zero," are shifted to the "left" end of the rearranged data word with their order being preserved, and the bits of the data word in bit positions which correspond to bits of the mask which are set, or have the value "one," are shifted to the "right" end of the rearranged data word with their order being preserved. For example, if an eight-bit data word has the value "abcdefgh" (where each letter represents a binary digit having the value "one" or "zero"), and the mask word corresponds to "10011011," then in the rearranged data word generated when the "sheep and goats" instruction is executed with these as operands, the bits "b," "c," and "f," all of which are in bit positions for which the mask bits are clear, would be shifted to the left, preserving their order "bcf," and the bits "a," "d," "e," "g," and "h," all of which are in bit positions for which the mask bits are set, would be shifted to the right, preserving their order "adegh," with the result being the rearranged data word "bcfadegh." Essentially, the "sheep and goats" instruction rearranges the bits of a data word into two groups as defined by the bits of a mask word, one group (the "sheep") corresponding to those bits for which the bits of the mask word are clear, and the other (the "goats") corresponding to those bits for which the bits of the mask word are set, and in addition preserves the order within each group.
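The rearrangement described above can be illustrated in software. The following is a minimal sketch only, assuming the data word and mask word are written as equal-length strings with the leftmost character first; the function name sheep_and_goats is a hypothetical name chosen for illustration and does not describe the circuit of the invention.

```python
def sheep_and_goats(data: str, mask: str) -> str:
    """Move units whose mask bit is '0' (sheep) to the left and units
    whose mask bit is '1' (goats) to the right, preserving the original
    order inside each group. Strings are an illustrative assumption."""
    if len(data) != len(mask):
        raise ValueError("data and mask must have the same length")
    sheep = [d for d, m in zip(data, mask) if m == "0"]
    goats = [d for d, m in zip(data, mask) if m == "1"]
    return "".join(sheep + goats)


# The worked example from the text: sheep "bcf", goats "adegh".
print(sheep_and_goats("abcdefgh", "10011011"))  # bcfadegh
```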
In a variant of the "sheep and goats" instruction, the bits of the rearranged data word in bit positions for which the bits of the mask are either set or clear (but preferably not both) will be set to a predetermined value. Generally, it has been proposed that, for example, the bits of the rearranged data word in bit positions for which the bits of the mask are clear will be set to zero, but the variant may be used with either the "sheep" or the "goats," and the predetermined value may be either "one" or "zero."
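The variant can be sketched in the same way. This sketch assumes one reading of the passage above, namely that every unit of the chosen group is replaced by the predetermined value after the group has been moved to its end of the word. The function name and the parameters fill_group and fill are hypothetical and introduced only for illustration.

```python
def sheep_and_goats_variant(data: str, mask: str,
                            fill_group: str = "sheep",
                            fill: str = "0") -> str:
    """Sheep-and-goats rearrangement in which one group is overwritten
    with a predetermined value (an assumed reading of the variant)."""
    if len(data) != len(mask):
        raise ValueError("data and mask must have the same length")
    sheep = [d for d, m in zip(data, mask) if m == "0"]
    goats = [d for d, m in zip(data, mask) if m == "1"]
    if fill_group == "sheep":
        sheep = [fill] * len(sheep)
    elif fill_group == "goats":
        goats = [fill] * len(goats)
    else:
        raise ValueError("fill_group must be 'sheep' or 'goats'")
    return "".join(sheep + goats)


# Sheep (mask bit clear) forced to zero; goats keep their order.
print(sheep_and_goats_variant("abcdefgh", "10011011"))  # 000adegh
```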
A "sheep and goats" instruction can find utility in connection with, for example, performing various bit permutations. For instance, using a mask consisting of alternating set and clear bits will result in a so-called "unshuffle" permutation of a data word. In addition, the variant can be useful in connection with using a set of originally discontiguous bits to perform a multi-way dispatch, or jump, by making the bits contiguous and using the result to form an index into a jump table.
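Both uses can be sketched briefly. The sketch below assumes string-valued words with the leftmost bit first. The names dispatch_index and handlers, and the particular word and mask values, are hypothetical and chosen only to illustrate the idea.

```python
def sheep_and_goats(data: str, mask: str) -> str:
    """Sheep (mask '0') to the left, goats (mask '1') to the right."""
    sheep = [d for d, m in zip(data, mask) if m == "0"]
    goats = [d for d, m in zip(data, mask) if m == "1"]
    return "".join(sheep + goats)


def dispatch_index(word: str, mask: str) -> int:
    """Gather the bits selected by set mask bits into one contiguous
    field and read that field as an integer index."""
    selected = [d for d, m in zip(word, mask) if m == "1"]
    return int("".join(selected), 2) if selected else 0


# An alternating mask separates even-position and odd-position units.
print(sheep_and_goats("abcdefgh", "01010101"))  # acegbdfh

# Two discontiguous bits select one of four jump-table entries.
handlers = ["handler_0", "handler_1", "handler_2", "handler_3"]
index = dispatch_index("10110010", "01010000")
print(handlers[index])  # handler_1
```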
The invention provides a new and improved circuit or functional unit for use in connection with execution of an instruction for rearranging bits of a data word in accordance with a mask.
In brief summary, the invention provides a system for providing, from an input data word comprising a plurality of input data units having an input arrangement and a mask word comprising a plurality of mask bits each associated with one of the data units, an output data word in which the data units are arranged according to the mask bits. The system includes a bit balancer module and a plurality of rearrangement modules. The bit balancer module is configured to divide the input data units comprising the input data word into a plurality of data word portions, each data unit being assigned to one of the data word portions based on a pattern of mask bits of the mask word relative to the mask bit associated with the respective data unit. Each rearrangement module is configured to provide, from one of the data word portions and associated mask bits, an output data word portion in which the data units are arranged according to the mask bits. The data units of the output data word portions provided by the rearrangement modules are interleaved to provide the output data word.
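The division, rearrangement, and interleaving summarized above can be modeled in software. The summary does not state the rule by which the bit balancer assigns data units to portions, so the sketch below assumes one plausible rule for two portions: each unit goes to the portion matching the parity of the position it will occupy in the output data word. That position is determined by how many like-valued mask bits precede the unit and, for units with set mask bits, by the total number of clear mask bits. All function names are hypothetical, and the sketch is not a description of the circuit itself.

```python
def sheep_and_goats(data: str, mask: str) -> str:
    """Reference rearrangement: sheep (mask '0') left, goats right."""
    sheep = [d for d, m in zip(data, mask) if m == "0"]
    goats = [d for d, m in zip(data, mask) if m == "1"]
    return "".join(sheep + goats)


def balance(data: str, mask: str):
    """Assumed two-way balancer: route each unit by the parity of its
    final output position, keeping its mask bit alongside it."""
    num_sheep = mask.count("0")
    portions = [([], []), ([], [])]  # (units, mask bits) per portion
    sheep_seen = goats_seen = 0
    for d, m in zip(data, mask):
        if m == "0":
            out_pos = sheep_seen
            sheep_seen += 1
        else:
            out_pos = num_sheep + goats_seen
            goats_seen += 1
        units, bits = portions[out_pos % 2]
        units.append(d)
        bits.append(m)
    return [("".join(u), "".join(b)) for u, b in portions]


def balanced_sheep_and_goats(data: str, mask: str) -> str:
    """Rearrange each portion independently, then interleave them."""
    (d0, m0), (d1, m1) = balance(data, mask)
    outs = (sheep_and_goats(d0, m0), sheep_and_goats(d1, m1))
    # Portion 0 supplies even output positions, portion 1 odd ones.
    return "".join(outs[i % 2][i // 2] for i in range(len(data)))


print(balance("abcdefgh", "10011011"))
# [('bdfg', '0101'), ('aceh', '1011')]
print(balanced_sheep_and_goats("abcdefgh", "10011011"))  # bcfadegh
```

Under this assumed rule, each portion is half the width of the input data word, so each rearrangement module works on a smaller instance of the same problem.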