The present invention describes a method and apparatus for performing shift operations on packed data. Modern microcomputers and microcontrollers provide a data width of 32 bits, 64 bits or even more. In many applications, however, the processed data is still only 8 bits wide. Therefore, such 32-bit (or wider) microprocessors provide so-called packed data instructions. These packed data instructions handle the content of a 32-bit or 64-bit register differently depending on the data size. For example, if the data size indicates 8-bit packed data, a 32-bit word is split into four 8-bit data parts, or a 64-bit word into eight 8-bit data parts, which are usually processed independently by a processing unit. If the data size indicates 16-bit packed data, a 32-bit word is split into two 16-bit data parts, and so on. The processing unit usually comprises a corresponding number of independent units, so that the respective parts of a packed word can be processed independently. The independent parts of the result of such an operation are then stored, e.g. in another register, again as packed data.
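By way of illustration only, the splitting of a register word into packed data parts may be modelled in software as follows. This is a minimal sketch, not part of the described apparatus: the function names `unpack` and `pack` are hypothetical, and numbering the data parts from the least significant end of the word is an assumption made for the example.

```python
def unpack(word, word_bits, lane_bits):
    """Split a word into its packed data parts (least significant part first).

    The part ordering is an assumption made for this sketch.
    """
    if word_bits % lane_bits != 0:
        raise ValueError("word width must be a multiple of the data size")
    mask = (1 << lane_bits) - 1
    count = word_bits // lane_bits
    return [(word >> (i * lane_bits)) & mask for i in range(count)]


def pack(lanes, lane_bits):
    """Reassemble data parts into one packed word (inverse of unpack)."""
    mask = (1 << lane_bits) - 1
    word = 0
    for i, lane in enumerate(lanes):
        # Each part is truncated to its own width before being placed.
        word |= (lane & mask) << (i * lane_bits)
    return word


# A 32-bit word viewed as four 8-bit parts and as two 16-bit parts.
bytes_32 = unpack(0x11223344, 32, 8)
halves_32 = unpack(0x11223344, 32, 16)
# A 64-bit word viewed as eight 8-bit parts.
bytes_64 = unpack(0x0102030405060708, 64, 8)
```

The same register content thus yields four, two or eight data parts depending solely on the indicated data size, and `pack` stores independently processed parts again as packed data.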
U.S. Pat. No. 5,666,298 describes such an apparatus and the associated method for a shift instruction performed on packed data. FIG. 8 of U.S. Pat. No. 5,666,298 shows a plurality of independently working shift units that perform, e.g., up to eight shift operations independently on a 64-bit byte-packed data word. This plurality of units demands a certain amount of silicon area, which is not always available in highly integrated devices.
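The effect of shifting each data part independently can be sketched in software as follows. This is a behavioural model only and does not reproduce the circuit of U.S. Pat. No. 5,666,298: the function name `packed_shift_left` is hypothetical, and the use of a logical left shift with one common shift count for all data parts is an assumption made for the example. Each iteration of the loop stands for one independent shift unit.

```python
def packed_shift_left(word, count, word_bits=64, lane_bits=8):
    """Logically shift every data part of a packed word left by `count` bits.

    Bits shifted out of one part are discarded and never enter the
    neighbouring part. A common shift count is assumed for this sketch.
    """
    if word_bits % lane_bits != 0:
        raise ValueError("word width must be a multiple of the data size")
    mask = (1 << lane_bits) - 1
    result = 0
    for i in range(word_bits // lane_bits):
        # One loop iteration models one independent shift unit.
        lane = (word >> (i * lane_bits)) & mask
        shifted = (lane << count) & mask
        result |= shifted << (i * lane_bits)
    return result


packed = 0x8181818181818181

# Eight independent 8-bit shifts on a 64-bit byte-packed word.
byte_result = packed_shift_left(packed, 1)

# The same register content treated as four 16-bit parts.
half_result = packed_shift_left(packed, 1, lane_bits=16)

# An ordinary 64-bit shift, for comparison: bits cross byte boundaries.
plain_result = (packed << 1) & 0xFFFFFFFFFFFFFFFF
```

In the byte-packed case the most significant bit of each byte is dropped, whereas the ordinary 64-bit shift carries it into the neighbouring byte. A hardware realisation that provides one shift unit per data part, as modelled by the loop, therefore needs up to eight such units for a 64-bit byte-packed word.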