Not Applicable.
The present embodiments relate to processors, and are more particularly directed to improving the availability and implementation of three operand shift and/or merge instructions and operations in such processors.
The present embodiments pertain to the ever-evolving fields of computer technology, microprocessors, and other types of processors. Processor devices are used in numerous applications, and their prevalence has led to a complex and demanding marketplace where efficiency of operation is often a key consideration, where such efficiency is reflected in price and performance of the processor. The following discussion and embodiments are directed to processor efficiency and functionality, and arise in the area of shift-merge instruction capability.
The prior art includes a number of bit manipulation instructions where each such instruction is implemented in certain processors because it permits data to be manipulated using a single instruction, whereas if the instruction is not part of the processor instruction set the same resulting data manipulation may require considerably more than one instruction. To demonstrate these types of instructions, four different examples are provided below. Before detailing those instructions, FIG. 1 introduces the basic instruction format of all of these instructions via a general instruction 10. Instruction 10 includes an opcode, which includes a number of bits forming a unique bit pattern which defines the specific type of instruction. Instruction 10 further includes references to two data operands, shown as data D1 and data D2. These references are commonly to corresponding registers and it is not intended therefore to demonstrate that these data are directly embedded in instruction 10. Additionally, for the sake of discussion and as a contemporary example, data D1 and D2 are typically 32-bit quantities stored in the registers and often there are 32 such registers; as a result, the references to data D1 and D2 are 5-bit identifiers which each identify a corresponding one of the 32 registers in which either data D1 or data D2 is stored. Instruction 10 also includes one or more bit manipulation arguments where for the examples provided below there are either two 5-bit arguments for a total of 10 bits, or a single 5-bit argument. Different arguments are discussed below based on a particular corresponding instruction, but typically the arguments relate to some parameter for manipulating data D1 and D2 such as a shift amount, a position, or a number of bits to be manipulated.
As explored in more detail below, note that the argument(s) may be either immediate information (i.e., embedded within instruction 10) or addressed by the instruction so that they are read from a storage device (e.g., a register). Finally, note that instruction 10 also includes a destination reference DEST, where this reference is also commonly to one of 32 registers and, hence, is also a 5-bit identifier. The DEST location is the register where the result of the operation of instruction 10 is written.
FIGS. 2a and 2b illustrate the operands and operation of a prior art INSERT instruction. FIG. 2a illustrates the two 32-bit data operands of the INSERT instruction, and which are shown as data A and B. The third operand of an INSERT instruction is a bit manipulation operand which provides two aspects, and in this regard is typically embodied as a 10-bit operand, where five of these bits define a SHIFT argument and the remaining five of these bits define a LENGTH argument. The SHIFT argument defines the number of bits that data A is to be right shifted, that is, shifted so that its most significant bit is shifted towards the original position of its least significant bit. Thus, FIG. 2a illustrates the right shifting of data A in response to the SHIFT argument by way of a right-pointing arrow, with the result following the shift being designated as AS in FIG. 2b. For example, if SHIFT equals six, then data A is shifted right by six bits with the result, AS, starting at its least significant bit, having the 26 more significant bits from data A. Note that AS is shown in FIG. 2b only to demonstrate the functionality of the shift, and is not intended to suggest that an additional storage device or clock cycle is required to temporarily store the shifted value AS. The LENGTH argument defines the number of bits that are taken from AS (i.e., the shifted value of A) and copied over the value of data B starting at the least significant bit of data B; for sake of reference, the LENGTH number of bits from AS and copied in this manner are shown as ASL. Thus, FIG. 2b illustrates that a number of bits equal to LENGTH from AS are copied over data B, thereby creating a result R1 which includes a value ASL starting at bit 0 and continuing up to bit LENGTH−1. The remaining bits in result R1 are identical to the corresponding bit locations from data B.
Given the preceding, it may be stated that a number of bits equal to LENGTH from AS are merged with data B and, thus, this is why the INSERT instruction is a type of shift-merge instruction.
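By way of illustration only, the INSERT operation described above may be modeled in software as follows. This is a minimal sketch assuming 32-bit operands; the names `insert` and `MASK32` are hypothetical and do not appear in the figures, and the sketch is not intended to suggest any particular hardware implementation.

```python
MASK32 = 0xFFFFFFFF  # hypothetical helper: keeps values within 32 bits


def insert(a, b, shift, length):
    """Model of INSERT: right shift A by SHIFT, then copy the low
    LENGTH bits of the shifted value (ASL) over the low bits of B."""
    a_s = (a & MASK32) >> shift        # AS, the right-shifted value of A
    field_mask = (1 << length) - 1     # ones in bits 0 .. LENGTH-1
    a_sl = a_s & field_mask            # ASL, the bits taken from AS
    # Bits LENGTH and above are taken unchanged from B.
    return ((b & MASK32) & ~field_mask) | a_sl
```

For example, with A equal to 0x12345678, B equal to 0xFFFFFFFF, SHIFT equal to eight, and LENGTH equal to eight, the sketch returns 0xFFFFFF56.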
FIGS. 3a and 3b illustrate the operands and operation of a prior art DEPOSIT instruction. FIG. 3a illustrates the two 32-bit data operands of the DEPOSIT instruction, and which are shown as data C and D. The third operand of the DEPOSIT instruction is a bit manipulation operand which provides two aspects and also is typically embodied as a 10-bit operand, where five of these bits define a SHIFT argument and the remaining five of these bits define a LENGTH argument. The SHIFT argument defines the number of bits that data C is to be left shifted, that is, shifted so that its least significant bit is shifted towards the original location of its most significant bit. Thus, FIG. 3a illustrates the left shifting in response to the SHIFT argument by way of a left-pointing arrow, with the result following the shift being designated as CS in FIG. 3b. For example, if SHIFT equals four, then data C is shifted left by four bits with the result, CS, starting at bit position four, having the 28 least significant bits from data C. Note that CS is shown in FIG. 3b only to demonstrate the functionality of the shift, and is not intended to suggest that an additional storage device or clock cycle is required to temporarily store the shifted value CS. The LENGTH argument defines the number of bits that are taken from CS (i.e., the shifted value of C) and copied over or “merged with” the value of data D starting at bit location SHIFT and continuing, therefore, up to bit location SHIFT+LENGTH−1; for sake of reference, the LENGTH number of bits from CS are shown as CSL. Thus, FIG. 3b illustrates that CSL is copied over the corresponding bit locations in data D, thereby creating a result R2 which includes a value CSL starting at bit SHIFT and continuing up to bit SHIFT+LENGTH−1.
The remaining bits in result R2 are identical to the corresponding bit locations from data D, and appear in both the upper and lower bit locations of result R2 (assuming SHIFT is greater than zero and less than 32).
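As a further illustration, the DEPOSIT operation may be sketched in software as follows, assuming 32-bit operands and assuming that CSL consists of the bits of the left-shifted value CS occupying bit locations SHIFT through SHIFT+LENGTH−1. The names `deposit` and `MASK32` are hypothetical and are used only for this example.

```python
MASK32 = 0xFFFFFFFF  # hypothetical helper: keeps values within 32 bits


def deposit(c, d, shift, length):
    """Model of DEPOSIT: left shift C by SHIFT, then merge LENGTH bits
    of the shifted value (CSL) into D starting at bit location SHIFT."""
    c_s = (c << shift) & MASK32                           # CS
    field_mask = (((1 << length) - 1) << shift) & MASK32  # bits SHIFT .. SHIFT+LENGTH-1
    c_sl = c_s & field_mask                               # CSL
    # Bits outside the field are taken unchanged from D.
    return ((d & MASK32) & ~field_mask) | c_sl
```

For example, with C equal to 0xAB, D equal to 0xFFFFFFFF, SHIFT equal to eight, and LENGTH equal to eight, the sketch returns 0xFFFFABFF, so that bits from data D remain both above and below the merged field.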
FIGS. 4a and 4b illustrate the operands and operation of a prior art REPLACE instruction. FIG. 4a illustrates the two 32-bit data operands of the REPLACE instruction, and which are shown as data E and F. The third operand of the REPLACE instruction is a bit manipulation operand which provides two aspects and also is typically embodied as a 10-bit operand, where five of these bits define a POSITION argument and the remaining five of these bits define a LENGTH argument. The POSITION argument defines a bit position in data E, and the LENGTH argument defines a number of bits that are copied from data E starting at the POSITION bit. More particularly, these copied bits form a quantity shown in FIG. 4b as EL, and they are copied over the value of data F starting at the POSITION bit. Thus, FIG. 4b illustrates that EL is copied over the corresponding bit locations in data F, thereby creating a merged result R3 which includes a value EL starting at bit POSITION and continuing up to bit POSITION+LENGTH−1. The remaining bits in result R3 are identical to the corresponding bit locations from data F, and appear in both the upper and lower bit locations of result R3 (assuming POSITION is greater than zero and less than 31).
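The REPLACE operation differs from the preceding instructions in that no shift occurs; the field is copied between identical bit locations. A minimal software sketch, assuming 32-bit operands and using the hypothetical names `replace` and `MASK32`, is as follows.

```python
MASK32 = 0xFFFFFFFF  # hypothetical helper: keeps values within 32 bits


def replace(e, f, position, length):
    """Model of REPLACE: copy LENGTH bits of E, starting at bit
    POSITION, over the same bit locations of F."""
    field_mask = (((1 << length) - 1) << position) & MASK32
    e_l = (e & MASK32) & field_mask    # EL, the field copied from E
    # Bits outside the field are taken unchanged from F.
    return ((f & MASK32) & ~field_mask) | e_l
```

For example, with E equal to 0x12345678, F equal to 0xFFFFFFFF, POSITION equal to eight, and LENGTH equal to eight, the sketch returns 0xFFFF56FF.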
FIGS. 5a and 5b illustrate the operands and operation of a prior art FUNNEL-SHIFT instruction. FIG. 5a illustrates the two 32-bit data operands of the FUNNEL-SHIFT instruction, and which are shown as data G and H. For the FUNNEL-SHIFT instruction, the two 32-bit operands are concatenated, as also shown in FIG. 5a. The third operand of the FUNNEL-SHIFT instruction is a bit manipulation operand which provides only a single aspect and is typically embodied as a 5-bit operand, where the five bits define a SHIFT argument. The SHIFT argument defines the number of bits that both data G and H are right shifted (i.e., so that the most significant bits of each are shifted towards the original location of their respective least significant bits). Thus, FIG. 5a illustrates the right shifting in response to the SHIFT argument by way of a right-pointing arrow, with a result R4 shown in FIG. 5b. Result R4 is a 32-bit result which includes the values of data G and H after the right shift, and designated as GS and HS, respectively. Further, note that the 32-bit result R4 of the FUNNEL-SHIFT instruction starts at its least significant bit position with the bit position of data G that is equal to the shift amount. For example, if SHIFT equals five, then data G is right-shifted five positions and, thus, bits G0 through G4 are shifted out such that GS in result R4 begins, at its least significant bit location, with bit G5 and includes the remainder of the bits from data G up to G31. Further, since data H is also right shifted, then HS in result R4 includes the bits of H from H0 up to H(SHIFT−1); again by way of example if SHIFT equals five, then HS includes bits H0 through H4.
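The FUNNEL-SHIFT operation may likewise be sketched in software. The sketch below assumes 32-bit operands and assumes that data H occupies the more significant half of the concatenated value, which is consistent with the result described above; the names `funnel_shift` and `MASK32` are hypothetical.

```python
MASK32 = 0xFFFFFFFF  # hypothetical helper: keeps values within 32 bits


def funnel_shift(g, h, shift):
    """Model of FUNNEL-SHIFT: concatenate H (upper half) with G (lower
    half), right shift by SHIFT, and keep the low 32 bits as R4."""
    combined = ((h & MASK32) << 32) | (g & MASK32)  # 64-bit value H:G
    return (combined >> shift) & MASK32             # HS above GS
```

For example, with G equal to 0x12345678, H equal to 0xABCDEF01, and SHIFT equal to eight, the sketch returns 0x01123456: the low eight bits of H appear above the 24 more significant bits of G.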
The present inventor has made various observations given the operations and functionality provided by the preceding instructions, and these observations provide further introduction to the preferred embodiments described later. As a first observation, the 10-bit value for any of the INSERT, DEPOSIT, and REPLACE instructions, as well as the 5-bit value for the FUNNEL-SHIFT instruction, may be an immediate operand within each instruction. Alternatively, these values may be provided as read data, such as from a register or memory location. However, for either the approach of an immediate operand or the approach of a read value, there are drawbacks, as further detailed below.
When the 10-bit or 5-bit value for any of the INSERT, DEPOSIT, REPLACE, and FUNNEL-SHIFT instructions is provided by an external read (e.g., from a register file), this requires an additional read port on the device being read. More specifically, for the three operand instruction described above, the external read involves a first data operand, a second data operand, and the 10-bit (or 5-bit) value as a third operand, thereby requiring a total of three read ports. Such an additional port can be very expensive in terms of space and actual device cost. Typically, the cost of a register file tends to increase as the square of the number of read ports and, thus, an additional port for a third operand can be burdensome and potentially prohibitive in many processor implementations. Still further, a requirement for externally reading this third ported value requires an additional set of forwarding multiplexers between the register file and the circuits capable of reading the port. Finally, assuming that the external read is from a register file containing 32 registers (i.e., a common implementation), then the instruction must include a 5-bit field to address one of these 32 registers, thereby requiring five bit positions in the instruction to achieve this addressing functionality.
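The port-count observation above can be made concrete with a short calculation. The sketch below simply applies the stated rule of thumb that register file cost grows as the square of the number of read ports; the function name and the choice of a two-port baseline are assumptions made for illustration, and actual costs depend on the implementation.

```python
def relative_register_file_cost(read_ports, baseline_ports=2):
    """Cost relative to a baseline register file, assuming cost grows
    as the square of the number of read ports (a rule of thumb only)."""
    return (read_ports / baseline_ports) ** 2
```

Under that assumption, moving from two read ports to three gives `relative_register_file_cost(3)` equal to 2.25, that is, more than double the cost of the two-port register file.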
When the 10-bit or 5-bit value for any of the INSERT, DEPOSIT, REPLACE, and FUNNEL-SHIFT instructions is embedded in the instruction as an immediate value, then the instruction necessarily is increased in size, by either ten bits for the INSERT, DEPOSIT, and REPLACE instructions, or by five bits for the FUNNEL-SHIFT instruction. This number of bits can considerably increase the amount of opcode space required to accommodate the processor instruction set. Indeed, because of this potential additional opcode space, many processors do not include these bit-manipulation instructions.
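To illustrate the encoding pressure, the following sketch counts the bits remaining for the opcode once the operand fields of instruction 10 are accounted for. The 32-bit instruction width is an assumption made only for this example and is not stated above; the 5-bit register references for D1, D2, and DEST follow the 32-register example given earlier. The names used are hypothetical.

```python
INSTRUCTION_BITS = 32  # assumption: instruction width, for illustration only
REG_REF_BITS = 5       # 5-bit reference to one of 32 registers


def opcode_bits_left(argument_bits, instruction_bits=INSTRUCTION_BITS):
    """Bits remaining for the opcode after the D1, D2, and DEST
    references and the bit manipulation argument(s) are encoded."""
    operand_bits = 3 * REG_REF_BITS  # D1, D2, and DEST references
    return instruction_bits - operand_bits - argument_bits
```

Under these assumptions, a 10-bit immediate leaves `opcode_bits_left(10)` equal to 7 bits for the opcode, whereas a 5-bit field leaves `opcode_bits_left(5)` equal to 12 bits.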
In view of the above, there arises a need to address the drawbacks and limitations of the prior art bit manipulation instructions and their functionality, as is accomplished by the preferred embodiments described in the remainder of this document.
In the preferred embodiment, there is a method of operating a processor. The method comprises a first step of fetching an instruction. The instruction includes an instruction opcode, a first data operand bit group corresponding to a first data operand (D1′), and a second data operand bit group corresponding to a second data operand (D2′). At least one of the first data operand and the second data operand consists of an integer number N bits. The instruction also comprises at least one immediate bit manipulation operand consisting of an integer number M bits, wherein 2^M is less than the integer number N. The method further includes a second step of executing the instruction, comprising the step of manipulating a number of bits of one of the first data operand and the second data operand. Finally, the number of manipulated bits is in response to the at least one immediate bit manipulation operand, and the manipulating step is further in response to the instruction opcode. Other circuits, systems, and methods are also disclosed and claimed.