The present invention relates generally to image processing, and more particularly, to a reciprocal approximation circuit for image processing.
Computing applications including image processing and computer vision applications involve execution of various arithmetic operations, such as addition, multiplication, division, and normalization. These applications often require multiple arithmetic operations to be performed in succession on pixel values of various real-time digital images, such as live video feeds received from image acquisition devices. Typically, these operations are performed directly by hardware accelerators, with addition and multiplication operations being performed by adders and multipliers, and division and normalization operations performed by determining reciprocal values.
The pixel values of digital images often are represented as fixed-width real numbers. The accuracy of the reciprocal values needs to match the fixed width. Various iterative methods, such as Newton-Raphson, typically are used to determine the reciprocal values. Mathematically, the Newton-Raphson method includes performing iterative approximations for determining the roots of equation (1),

$$f(x) = 0 \qquad (1)$$

where f(x) is a real-valued function.
The roots of equation (1) are determined by successively approximating a solution based on equation (2),

$$x_{i+1} = x_i - \frac{f(x_i)}{f'(x_i)} \qquad (2)$$

where,
    x_i is the solution obtained from a previous iteration;
    x_{i+1} is the solution obtained from a current iteration;
    f(x_i) is the value of the function f(x), when x = x_i;
    f′(x) is the derivative of the function f(x); and
    f′(x_i) is the value of the derivative of the function f(x), when x = x_i.

Accuracy of the solution is improved with every iteration. Therefore, the more iterations, the more accurate the solution.
Using the Newton-Raphson method, an approximate reciprocal value of an operand ‘A’ can be determined using equation (3):

$$x_{i+1} = x_i \times (2 - A x_i) \qquad (3)$$

where,
    A is the operand for which the approximate reciprocal value is to be determined;
    x is the reciprocal of the operand ‘A’;
    x_i is an approximate reciprocal value obtained from a previous iteration; and
    x_{i+1} is an approximate reciprocal value obtained from a current iteration.
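The step from equation (2) to equation (3) is not spelled out in the text. A short derivation, assuming the usual choice f(x) = 1/x − A (whose root is x = 1/A), is sketched below. The last line is an added observation, not part of the original description: the relative error 1 − A·x_i is squared at each step, so the iteration converges only when the initial value satisfies |1 − A·x_0| < 1, and the number of correct bits roughly doubles per iteration.

```latex
\begin{align*}
f(x) &= \frac{1}{x} - A, \qquad f'(x) = -\frac{1}{x^{2}} \\
x_{i+1} &= x_i - \frac{1/x_i - A}{-1/x_i^{2}}
         = x_i + x_i^{2}\left(\frac{1}{x_i} - A\right)
         = x_i \times (2 - A x_i) \\
1 - A x_{i+1} &= 1 - 2 A x_i + A^{2} x_i^{2} = (1 - A x_i)^{2}
\end{align*}
```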
The Newton-Raphson method includes selecting an initial value ‘x0’ for the first iteration. The accuracy of the method is based on the selection of the initial value ‘x0’. For example, a first selection of the initial value ‘x0’ may require only two iterations for attaining 11-bit accuracy for the approximate reciprocal value, whereas a second selection may require four iterations to attain 11-bit accuracy. Hence, the selection of the initial value ‘x0’ is a critical parameter that affects the accuracy and convergence of the Newton-Raphson method, which in turn may affect the number of iterations required for attaining a desired accuracy. Conventionally, the initial value ‘x0’ is selected using a look-up operation.
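The dependence of the iteration count on the initial value can be illustrated with a small floating-point sketch of the recurrence x_{i+1} = x_i × (2 − A·x_i). The function names, the operand A = 0.5, and the two initial values are hypothetical choices made for illustration; "11-bit accuracy" is interpreted here as a relative error |1 − A·x| below 2^−11, which is an assumption.

```python
def newton_reciprocal(a, x0, iterations):
    """Apply x <- x * (2 - a * x) a fixed number of times, starting from x0."""
    x = x0
    for _ in range(iterations):
        x = x * (2.0 - a * x)
    return x


def iterations_for_bits(a, x0, bits, max_iterations=32):
    """Count iterations until the relative error |1 - a*x| drops below 2**-bits.

    Returns None if the target is not reached within max_iterations
    (for example, when x0 is outside the convergence region).
    """
    target = 2.0 ** -bits
    x = x0
    for n in range(max_iterations + 1):
        if abs(1.0 - a * x) < target:
            return n
        x = x * (2.0 - a * x)
    return None


if __name__ == "__main__":
    a = 0.5  # hypothetical operand; its exact reciprocal is 2.0
    for x0 in (1.75, 1.0):  # hypothetical initial values
        print(x0, iterations_for_bits(a, x0, bits=11))
```

With these illustrative values, the initial value closer to the true reciprocal reaches the target in fewer iterations, because the relative error is squared at each step.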
FIG. 1 shows a conventional reciprocal approximation circuit 100 for determining an approximate reciprocal value of an operand ‘A’. The reciprocal approximation circuit 100 includes an initial value selection circuit 102 and an iteration circuit 104. The reciprocal approximation circuit 100 receives the operand ‘A’ from an external circuit (not shown). The initial value selection circuit 102 provides an initial value X_0 for determining the approximate reciprocal value of the operand ‘A’. The initial value selection circuit 102 includes a range selection circuit 106, a memory 108, and a multiplexer 110 (also referred to as a ‘mux’).
The range selection circuit 106 receives the operand ‘A’ and generates first and second select bits (collectively referred to as “select bits”) depending on the value of ‘A’. When ‘A’ is within a first range, the range selection circuit 106 generates the select bits ‘00’. When ‘A’ is within a second range, the select bits are ‘01’. When ‘A’ is within a third range, the select bits are ‘10’, and when ‘A’ is within a fourth range, the select bits are ‘11’.
The memory 108 stores four preset initial values for determining the approximate reciprocal value of ‘A’, each associated with one of the four ranges used by the range selection circuit 106. The select bits generated by the range selection circuit 106 are input to the mux 110 to select one of the four initial values stored in the memory 108. The mux 110 outputs the selected value as the initial value X_0.
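The look-up described above can be modeled in a few lines of software. This is only a behavioral sketch: the text does not specify the range boundaries or the stored values, so the normalized operand interval 0.5 ≤ A < 1.0, the four equal-width ranges, and the stored values (reciprocals of the range midpoints) are all assumptions, as are the function names.

```python
# Hypothetical setup: the operand is assumed normalized to 0.5 <= A < 1.0,
# and that interval is split into four equal ranges.
RANGE_LOW = 0.5
RANGE_WIDTH = 0.125

# Four preset initial values, one per range. Storing the reciprocal of each
# range midpoint is an illustrative choice, not taken from the description.
INITIAL_VALUES = [1.0 / (RANGE_LOW + (k + 0.5) * RANGE_WIDTH) for k in range(4)]


def select_bits(a):
    """Model of the range selection: return the range index 0..3 for operand a."""
    if not RANGE_LOW <= a < RANGE_LOW + 4 * RANGE_WIDTH:
        raise ValueError("operand outside the assumed normalized range")
    return int((a - RANGE_LOW) / RANGE_WIDTH)


def initial_value(a):
    """Model of the mux: use the select bits to pick one stored value."""
    return INITIAL_VALUES[select_bits(a)]


if __name__ == "__main__":
    for a in (0.5, 0.7, 0.9):
        # Print the operand, its two select bits, and the chosen initial value.
        print(a, format(select_bits(a), "02b"), initial_value(a))
```

Under these assumptions, every operand in the interval gets an initial value whose relative error |1 − A·X_0| is below 1/8, which keeps the subsequent iteration inside its convergence region.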
The iteration circuit 104 receives the operand ‘A’ and the initial value X_0, and executes the first iteration of the Newton-Raphson method to generate a first approximate reciprocal value of ‘A’. The iteration circuit 104 includes a first multiplier 112, a complement circuit 114, and a second multiplier 116.
The first multiplier 112 receives ‘A’ and X_0, and generates a first multiplication output Y_1. The complement circuit 114 receives Y_1 and generates a complement thereof as output Y_2 (e.g., Y_2 is a 1's or 2's complement of Y_1). The second multiplier 116 receives Y_2 and multiplies it with the initial value X_0 to generate a product X_1, which represents the first approximate reciprocal value of the operand ‘A’. To improve the accuracy of X_1, the iteration circuit 104 can be cascaded with other iteration circuits.
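One way to read this datapath is as a fixed-point evaluation of X_1 = X_0 × (2 − A × X_0). The sketch below models it with integers. The word format (one integer bit plus 16 fraction bits), the truncating shifts, and all names are assumptions for illustration. In that format, the two's complement of Y_1 equals 2 − Y_1 exactly, and the one's complement equals 2 − Y_1 less one least-significant bit, which would explain why a complement circuit can stand in for the subtraction.

```python
FRAC_BITS = 16                           # hypothetical fraction width
WORD_MASK = (1 << (FRAC_BITS + 1)) - 1   # 1 integer bit + FRAC_BITS fraction bits


def to_fixed(value):
    """Convert a real number to the assumed fixed-point format."""
    return int(round(value * (1 << FRAC_BITS)))


def to_float(word):
    """Convert a fixed-point word back to a real number."""
    return word / (1 << FRAC_BITS)


def iteration_stage(a_fx, x0_fx, ones_complement=False):
    """Model one iteration: two multiplies and a complement.

    Assumes 0 < A * X_0 < 2 so that Y_1 fits in the (FRAC_BITS + 1)-bit word.
    """
    y1 = (a_fx * x0_fx) >> FRAC_BITS     # first multiplier: Y_1 = A * X_0
    if ones_complement:
        y2 = ~y1 & WORD_MASK             # 2 - Y_1, minus one LSB
    else:
        y2 = -y1 & WORD_MASK             # 2 - Y_1 (two's complement)
    return (x0_fx * y2) >> FRAC_BITS     # second multiplier: X_1 = X_0 * Y_2


if __name__ == "__main__":
    a_fx, x_fx = to_fixed(0.75), to_fixed(1.5)  # hypothetical operand and X_0
    for stage in range(2):                      # two cascaded stages
        x_fx = iteration_stage(a_fx, x_fx)
        print(stage + 1, to_float(x_fx))
```

Feeding the output of one stage into the next models the cascading mentioned above; each additional stage costs two more multipliers and one more complement circuit.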
The initial value selection circuit 102 and the iteration circuit 104, which includes the two multipliers 112 and 116 and the complement circuit 114, make the reciprocal approximation circuit 100 very bulky. Therefore, when implemented on an integrated circuit, the reciprocal approximation circuit 100 consumes a large area, which is undesirable. Thus, it would be advantageous to have a reciprocal approximation circuit that consumes less area.