1. Field of the Invention
The present invention relates to microprocessor architecture, and in particular, to a method and structure for performing saturation operations using the arithmetic logic unit of a microprocessor.
2. Discussion of Related Art
In a microprocessor, the width of the data that can be handled is generally determined by the width of the data path in the arithmetic logic unit (ALU) of the microprocessor. For example, a 32-bit microprocessor with a 32-bit wide ALU usually performs operations like addition, comparison, etc. on 32-bit wide data. However, a microprocessor may also include a set of instructions that is designed to operate on data restricted to bit widths less than the full data path width. For example, the 32-bit microprocessor may include some instructions to operate on 16-bit data.
Such a microprocessor capable of handling different data widths usually provides some means to convert data between these different data widths. Extending a “reduced width” data value to “full width” (e.g. extending 16-bit data to 32-bit) is trivial, and does not require any special provision. However, reducing the data width (e.g. reduction from 32-bit to 16-bit) is nontrivial, and the microprocessor could provide a saturation circuit for this purpose.
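The asymmetry between extending and reducing data width can be illustrated with a minimal sketch. The function names below are hypothetical and do not come from any instruction set. The sketch assumes 16-bit and 32-bit values held as non-negative bit patterns. It shows that extension preserves the value, while naive truncation discards the upper bits and can silently change it.

```python
def zero_extend_16_to_32(pattern16: int) -> int:
    """Zero-extend a 16-bit pattern: the upper 16 bits of the result are 0."""
    return pattern16 & 0xFFFF


def sign_extend_16_to_32(pattern16: int) -> int:
    """Sign-extend a 16-bit pattern by copying bit 15 into bits 31..16."""
    pattern16 &= 0xFFFF
    if pattern16 & 0x8000:
        return pattern16 | 0xFFFF_0000
    return pattern16


def truncate_32_to_16(pattern32: int) -> int:
    """Naive width reduction: keep only the low 16 bits (hypothetical helper)."""
    return pattern32 & 0xFFFF
```

Under these assumptions, truncating 0x0001_0000 yields 0, which is far from the original value. This is the kind of out-of-range case that a saturation operation is meant to handle.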
In a saturation operation, a data value is compared with saturation threshold values. If the comparison indicates that the data value is outside the allowable data range for the reduced data width, the data value is replaced with the saturation threshold value. If the data value is within the allowable range, it is not altered.
In an unsigned saturation operation, the data value is checked against a single (upper) saturation threshold value. If the data value is greater than the upper saturation threshold value, the data value is replaced with the upper saturation threshold value. For example, consider a reduced data width of 16-bit in a 32-bit microprocessor. The maximum allowable unsigned data value for the 16-bit data width is 2^16−1 (i.e., a 16-bit number with each bit value equal to 1). For clarity and conciseness, the value 2^16−1 can be expressed in hexadecimal as 0x0000_FFFF. The value 2^16−1 therefore can be used as the upper saturation threshold value. A saturation operation would then replace any data value greater than 2^16−1 with 2^16−1, whereas data values not greater than 2^16−1 pass unchanged.
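The unsigned case described above can be sketched in a few lines. This is an illustrative software model only, not the circuit of any figure. The names `U16_MAX` and `saturate_unsigned_16` are hypothetical, and the input is assumed to be a non-negative 32-bit value.

```python
# Upper saturation threshold for a 16-bit unsigned result: 2^16 - 1.
U16_MAX = (1 << 16) - 1  # 0x0000_FFFF


def saturate_unsigned_16(value: int) -> int:
    """Clamp a non-negative 32-bit value to the 16-bit unsigned range."""
    if value > U16_MAX:
        return U16_MAX  # out of range: replace with the threshold
    return value        # in range: pass unchanged
```

For instance, under these assumptions 0x0001_0000 saturates to 0x0000_FFFF, while 0x0000_1234 passes through unchanged.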
In a signed saturation operation, the data value is checked against both a positive upper saturation threshold value and a negative lower saturation threshold value. If the data value is greater than the upper saturation threshold value or less than the lower saturation threshold value, the data value is replaced with the upper or lower saturation threshold value, respectively. For example, returning to the reduced data width of 16-bit, the maximum allowable positive data value would be 2^15−1 (0x0000_7FFF). The minimum allowable negative data value would be −2^15 (0xFFFF_8000). Any data value greater than 2^15−1 would then be replaced with 2^15−1, while any data value less than −2^15 would be replaced with −2^15. Any data value in between these two thresholds remains unchanged.
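The signed case can be sketched similarly. The names below are hypothetical, and the sketch assumes 32-bit two's complement encoding, which is consistent with the hexadecimal thresholds given above. The helper `to_pattern_32` is included only to show how a negative result maps to its 32-bit pattern.

```python
S16_MAX = (1 << 15) - 1   # upper threshold,  2^15 - 1 (0x0000_7FFF)
S16_MIN = -(1 << 15)      # lower threshold, -2^15     (0xFFFF_8000)


def to_signed_32(pattern: int) -> int:
    """Interpret a 32-bit pattern as a two's complement signed integer."""
    pattern &= 0xFFFF_FFFF
    if pattern & 0x8000_0000:
        return pattern - (1 << 32)
    return pattern


def to_pattern_32(value: int) -> int:
    """Encode a signed integer as a 32-bit two's complement pattern."""
    return value & 0xFFFF_FFFF


def saturate_signed_16(value: int) -> int:
    """Clamp a signed 32-bit value to the 16-bit signed range."""
    if value > S16_MAX:
        return S16_MAX
    if value < S16_MIN:
        return S16_MIN
    return value
```

Under these assumptions, a value of −40000 saturates to −2^15, whose 32-bit pattern is 0xFFFF_8000, and a value of 40000 saturates to 2^15−1.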
In a microprocessor, a standard arithmetic logic unit (ALU) typically includes adder logic for performing basic arithmetic operations and a single general-purpose min/max comparator for comparing data values and selecting a minimum or maximum. To implement a saturation instruction in a standard ALU, additional saturation-specific logic is typically required.
FIG. 1a shows a conventional saturation-capable ALU 100a, which comprises an n-bit min/max comparator 110 and an adder circuit 120a. ALU 100a is coupled to receive n-bit input words A[n−1:0] and B[n−1:0]. A min/max comparator circuit such as min/max comparator 110 drives an “equality” signal EQ, a “less than” signal LT, or a “greater than” signal GT to an active state if input word A[n−1:0] is equal to, less than, or greater than, respectively, input word B[n−1:0], and also determines the minimum or maximum of the two input words and provides the result as an n-bit output word Z[n−1:0]. Meanwhile, adder circuit 120a performs various arithmetic operations on n-bit input words A[n−1:0] and B[n−1:0], according to the controlling instruction set, and generates an n-bit output word Y[n−1:0].
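The input/output behavior of a min/max comparator of this kind can be modeled as follows. This is a behavioral sketch only and does not represent the gate-level structure of comparator 110. The `select_max` control input, the unsigned interpretation of the operands, and all Python names are assumptions made for illustration.

```python
from typing import NamedTuple


class ComparatorOutput(NamedTuple):
    eq: bool  # EQ active when A == B
    lt: bool  # LT active when A <  B
    gt: bool  # GT active when A >  B
    z: int    # selected minimum or maximum, as an n-bit word


def min_max_compare(a: int, b: int, select_max: bool, n: int = 32) -> ComparatorOutput:
    """Behavioral model of an n-bit min/max comparator (unsigned compare assumed)."""
    mask = (1 << n) - 1
    a &= mask  # restrict both operands to n bits
    b &= mask
    z = max(a, b) if select_max else min(a, b)
    return ComparatorOutput(eq=(a == b), lt=(a < b), gt=(a > b), z=z)
```

For example, with A = 3 and B = 7 the model drives LT active and returns 3 when the minimum is selected, or 7 when the maximum is selected.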
To perform saturation operations, adder circuit 120a includes a saturation module 121. If ALU 100a is working on reduced-width data, n-bit input words A[n−1:0] and B[n−1:0] would represent m-bit data values, where m is less than n. Saturation module 121 would then compare word A[n−1:0] or word B[n−1:0] to a saturation threshold value T_sat to determine whether or not the limits of the reduced data width have been exceeded.
Because adder circuit 120a may already include some saturation functionality (e.g., saturated addition, etc.), merging saturation module 121 into the adder data path may provide some degree of layout efficiency. However, the overall speed of a microprocessor is typically determined by the logic depth of its adder data path. Therefore, incorporation of saturation module 121 into adder circuit 120a can have a negative impact on microprocessor performance.
FIG. 1b shows another example of a conventional saturation-capable ALU 100b. ALU 100b comprises an n-bit min/max comparator circuit 110, an adder circuit 120b, and a dedicated saturation circuit 130. Comparator 110 is substantially the same as that described with respect to FIG. 1a. However, unlike adder circuit 120a shown in FIG. 1a, adder circuit 120b does not include a saturation module. Instead, dedicated saturation circuit 130 executes all saturation instructions.
Because the saturation logic in ALU 100b is removed from the adder data path, saturation operations can be performed without adding logic depth to that path. Furthermore, saturation circuit 130 would typically incorporate optimized logic that could carry out the saturation operations in an efficient manner. However, the addition of saturation circuit 130 as a separate functional block undesirably increases the area requirements of ALU 100b. This in turn can increase the manufacturing costs and power requirements for any microprocessor incorporating ALU 100b.
Hence there is a need for a method or system to provide saturation capability in a microprocessor that minimizes circuit area requirements without degrading overall performance.