Conventional computer graphics systems require fast and accurate power calculations when performing certain computational operations. For example, computer-generated image rendering may depict an image attribute, such as illumination and attenuation of a 3-D light source or spotlight, by converting a power value into an approximation of a mathematical function of that value. A power value of an attribute D may have a range from zero (no light) to one (full light). A feature of the attribute D, which may be attenuation over a rendered object to account for effects like specular reflection, for example, may be represented as a mathematical function of D, such as a logarithm, base 2, of D, or any other mathematical function that has a naturally smooth curve.
One way of doing a power calculation that is easily implemented in hardware or software is to calculate the antilog of:
P*log2(D)
where D is a multiple-bit digital representation of an attribute within a range between zero and one, and P is a multiplicative power factor of the logarithm of D. The attribute D is quantized and approximated by an n-bit integer, producing a digital binary signal. Each additional bit used for quantizing D doubles the number of quantization levels; however, each additional bit also increases the memory area required for storing log2(D). By converting an attribute into a logarithmic function, computational processes performed on the attribute are simplified, since raising D to the power P reduces to a multiplication by P.
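As a rough illustration of the calculation just described, the following Python sketch quantizes D and evaluates the power as the antilog of P*log2(D). The function names and the quantization rule (floor of D times 2^n, clamped to n bits) are assumptions made for this example only; floating-point math stands in for the look-up hardware.

```python
import math

def quantize(d, n_bits):
    """Map d in [0, 1] to an n-bit integer code.

    Assumed scheme for illustration: floor(d * 2**n), clamped to n bits.
    """
    if not 0.0 <= d <= 1.0:
        raise ValueError("d must lie in [0, 1]")
    return min(int(d * (1 << n_bits)), (1 << n_bits) - 1)

def power_via_log(d, p):
    """Return d**p computed as antilog2(p * log2(d)) for 0 < d <= 1."""
    if not 0.0 < d <= 1.0:
        raise ValueError("d must lie in (0, 1]; log2(0) is undefined")
    return 2.0 ** (p * math.log2(d))
```

For instance, power_via_log(0.5, 3) multiplies log2(0.5) = -1 by 3 and takes the antilog, giving 0.125, the same value as 0.5 cubed.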
In order for the result to be as accurate as possible, the value of log2(D) must be calculated as accurately as possible using all of the input bits of D. The most efficient method of providing a logarithm calculation of D is with a memory look-up table (LUT) converter that accepts a binary multiple-bit value of D as an input data signal, and in response outputs a binary multiple-bit value of the logarithm of D.
Implementing the calculation in this way sacrifices accuracy for efficiency. The output of a LUT will yield a log2(D) value that is a step-wise approximation of a naturally smooth logarithmic curve, especially in the range of 0.5≤D≤1. For example, when D=0.5, the magnitude of log2(D) is 1. As D approaches 1, the magnitude of log2(D) gradually decreases towards zero, and the curve increasingly flattens. By representing log2(D) with a 10-bit word, the maximum absolute error between the actual log curve and a digital approximation is negligible when 0.0≤D≤0.5, but becomes visibly more pronounced in the ultimately rendered image when 0.5≤D, and is largest when 0.708≤D≤1.0. Thus, even with a large number of bits, the output curve of the LUT is unacceptably coarse. Hence, more than 10 bits of accuracy for log2(D) is required.
Another source of error depends on the number of bits employed to represent D. The maximum absolute error between using a 12-bit D and the full 13-bit D to calculate log2(D) is only 2 least significant bits (LSB) for the entire range of D, with 1 LSB being the additional bit throughout the range. Although this absolute error is small, the relative percentage error grows as D approaches 1 and log2(D) approaches 0, and after multiplication by P and anti-log ROM look-up, the absolute error is magnified. Thus, where a 12-bit D value may “break” out from a smooth value for log2(D), the full 13-bit D value will improve the absolute error by several LSB values. But using just one extra bit for D doubles the required memory space in the LUT for approximating the mathematical function of D.
Schemes for smoothing the step-wise gradient transitions of an approximation for log2(D) exist; however, they carry a large overhead cost, primarily in the amount of memory area required for each incrementally small improvement. Naturally, increasing the number of addressable entries in the LUT memory would yield a smoother output curve, but the required memory space grows exponentially with the number of address bits.
A one-to-one mapping of a 13-bit D to a 12-bit log2(D), assuming an implicit “1” in the first significant bit of D in the range of 0.5≤D≤1.0, would require a memory with 2^(13-1), or 2^12 = 4096 entries, each entry being 12 bits wide. This is roughly a 50 kbit ROM, which is undesirably large.
Another implementation for smoothing rendered image attributes uses a 1-stage correction in a LUT system. The input value of D is represented as a 13-bit word. To decrease the size of the ROM by half, the last of the 13 bits is unused in the address to the LUT converter. Of the remaining 12 bits, the first bit is an implicit “1” in the range of interest of 0.5≤D≤1.0, and 10 bits are provided as an address to the LUT memory having 2^10 = 1024 entries, each being 14 bits wide. The twelfth bit, b12, is used as a correction bit. A 14-bit output is divided into a 12-bit approximation of a mathematical function and a 2-bit delta signal. If correction bit b12 is a “1,” then the 12-bit approximation is neither incremented nor decremented. If correction bit b12 is a “0,” the 12-bit approximation will be incremented by the value of the 2-bit delta. In this method, however, because of the problems outlined above, the unused 13th bit is a large source of error that contributes to coarsely rendered scenes that are still undesirable.
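The 1-stage correction described above can be sketched in Python as follows. This is only an illustrative model under stated assumptions: the table stores the magnitude of log2(D) scaled so that one LSB equals 2^-12, each 14-bit word packs the approximation in the upper 12 bits and the delta in the lower 2 bits, and the result is clamped to 12 bits. None of these details, nor the function names, are given in the description itself.

```python
import math

FRAC_BITS = 12  # assumed scaling: 1 LSB of the stored magnitude = 2**-12

def _mag(code12):
    """Rounded |log2(D)| in LSBs for a 12-bit code, with D = code12 / 4096."""
    return round(-math.log2(code12 / 4096) * (1 << FRAC_BITS))

def build_table():
    """Build 1024 words of 14 bits: 12-bit approximation plus 2-bit delta."""
    table = []
    for addr in range(1024):
        hi = (1 << 11) | (addr << 1) | 1  # implicit 1, address bits, b12 = 1
        lo = hi - 1                       # same address, b12 = 0
        approx = _mag(hi)                 # value returned when b12 = 1
        delta = _mag(lo) - approx         # increment applied when b12 = 0
        assert 0 <= delta <= 3            # must fit in the 2-bit delta field
        table.append((approx << 2) | delta)
    return table

def lookup_one_stage(code13, table):
    """Approximate |log2(D)| for a 13-bit code in [4096, 8191] (0.5 <= D < 1)."""
    if not 4096 <= code13 <= 8191:
        raise ValueError("code13 must have its leading bit set")
    addr = (code13 >> 2) & 0x3FF          # 10 address bits after the implicit 1
    b12 = (code13 >> 1) & 1               # correction bit
    # The last bit (code13 & 1) is deliberately ignored, as in the text.
    word = table[addr]
    approx, delta = word >> 2, word & 0b11
    result = approx if b12 == 1 else approx + delta
    return min(result, 4095)              # assumed clamp to a 12-bit output
```

In this model, the codes 8190 and 8191 return the same value because they differ only in the unused 13th bit, which is the source of error noted above.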
A variation of the above discussed implementation is to use the full 13-bit address. The 13-bit address includes an implicit 1 for the first bit because of the limitation of D as being 0.5≤D<1.0, and 11 bits as an address to the ROM lookup table, with the last bit b13 as the correction bit. The ROM would have to provide 2^11, or 2048, entries of 14 bits, or 29 kbits. This is also undesirably large.
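The ROM sizes quoted for the look-up schemes above follow from multiplying the number of entries by the word width. A small Python check of that arithmetic is given below; the helper name rom_bits is hypothetical and used only for this illustration.

```python
def rom_bits(address_bits, width_bits):
    """Total ROM size in bits for 2**address_bits entries of width_bits each."""
    return (1 << address_bits) * width_bits

# One-to-one mapping: 12 address bits, 12-bit words.
ONE_TO_ONE = rom_bits(12, 12)
# 1-stage correction with the 13th bit unused: 10 address bits, 14-bit words.
ONE_STAGE_12BIT = rom_bits(10, 14)
# 1-stage correction using the full 13-bit input: 11 address bits, 14-bit words.
ONE_STAGE_13BIT = rom_bits(11, 14)
```

These evaluate to 49152 bits (roughly 50 kbit), 14336 bits, and 28672 bits (roughly 29 kbit), respectively.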
The present invention provides near-simultaneous error correction for memory-efficient approximations of mathematical functions of multi-bit input data, such as may be representative of an image attribute. One embodiment of the present invention provides a circuit with a memory that receives a first portion of an input data signal via an address input and in response provides an output data signal corresponding to one of a plurality of memory locations. The output data signal includes an approximation of a mathematical function of the input data signal, at least one first stage delta signal, and at least one second stage delta signal. Random logic is coupled to receive the first stage delta signal and the second stage delta signal and receives a second portion of the input data signal on an input. Based on values of the second portion and the first and second stage delta signals, the random logic calculates an incrementing or decrementing signal. An incrementor, coupled to the random logic, increments or decrements the approximation of the input data signal according to the incrementing or decrementing signal.
In another embodiment, the present invention provides a circuit wherein the random logic provides an incrementing or decrementing signal having: (A) a value of the first stage delta signal when the first bit of the second portion of the input data signal is a “0”; (B) a value of (1) when the first bit and the second bit of the second portion of the input data signal are “0,1” and the first stage delta is “1,0”; and (C) a value of (−1) when the first bit and the second bit of the second portion of the input data signal are “1,1” and the second stage delta is “1”.
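The three cases (A), (B), and (C) can be sketched as a small Python function. Several points are assumptions made only for this illustration, since the summary does not state them: the first stage delta “1,0” is read as the 2-bit binary value 10 (decimal 2), the more specific case (B) takes precedence over case (A) when both apply, and any combination not listed yields a correction of 0. The function name correction is likewise hypothetical.

```python
def correction(first_bit, second_bit, delta1, delta2):
    """Return the signed amount to add to the stored approximation.

    first_bit, second_bit: first two bits of the second portion of the input.
    delta1: first stage delta as an integer 0..3 (assumed 2-bit field).
    delta2: second stage delta as a single bit.
    """
    # Case (B): bits "0,1" and first stage delta "1,0" (assumed binary 10).
    # Assumed to override case (A), which it overlaps.
    if first_bit == 0 and second_bit == 1 and delta1 == 0b10:
        return 1
    # Case (A): first bit is 0, so use the first stage delta value.
    if first_bit == 0:
        return delta1
    # Case (C): bits "1,1" and second stage delta "1".
    if second_bit == 1 and delta2 == 1:
        return -1
    # Assumed default for combinations the summary does not list.
    return 0
```

Under these assumptions, correction(0, 0, 3, 0) returns 3, correction(0, 1, 2, 1) returns 1, and correction(1, 1, 0, 1) returns -1.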
In yet another embodiment, the present invention provides a method for memory-efficient correction in processing digital representations of image attributes. The method includes dividing an input data signal into a first portion and a second portion. The first portion is converted into an output signal with a memory look-up table, where the output signal is one of a plurality of output signals corresponding to one of a plurality of memory locations addressable by the first portion. The method further includes the step of dividing the output signal into a digital approximation of a mathematical function of the first portion, a first stage delta signal, and a second stage delta signal. An incrementing or decrementing signal, based on a logical value of the second portion of the input data signal, the first stage delta signal, and the second stage delta signal, is calculated with random logic. The approximation signal is incremented or decremented with the incrementing or decrementing signal, respectively.
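The sequence of steps in this method can be outlined as a short Python sketch. The packing of the memory word (approximation in the upper bits, then the first stage delta, then the second stage delta), the parameter names, and the callable random_logic are all assumptions chosen for illustration; the method as summarized does not fix any of them.

```python
def approximate(code, low_bits, table, d1_bits, d2_bits, random_logic):
    """Two-stage corrected look-up, following the summarized method steps.

    code:         integer input data signal
    low_bits:     number of low-order bits forming the second portion
    table:        list of packed memory words indexed by the first portion
    d1_bits:      width of the first stage delta field (assumed layout)
    d2_bits:      width of the second stage delta field (assumed layout)
    random_logic: callable (second_portion, delta1, delta2) -> signed int
    """
    # Step 1: divide the input into a first (address) and second portion.
    first = code >> low_bits
    second = code & ((1 << low_bits) - 1)
    # Step 2: convert the first portion into an output word via the table.
    word = table[first]
    # Step 3: divide the word into approximation and the two delta signals.
    delta2 = word & ((1 << d2_bits) - 1)
    delta1 = (word >> d2_bits) & ((1 << d1_bits) - 1)
    approx = word >> (d1_bits + d2_bits)
    # Steps 4 and 5: compute the signed correction and apply it.
    return approx + random_logic(second, delta1, delta2)
```

As a toy usage example, with a two-entry table whose first word packs an approximation of 10, a first stage delta of 2, and a second stage delta of 1, and with a stand-in rule that adds the first stage delta only when the second portion is zero, an input of 0 yields 12.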
In still yet another embodiment, the invention provides an image processing computer system comprising a processor unit and a memory, coupled to the processor unit, for storing a plurality of instructions for execution in said processor unit. The instructions include instructions for dividing an input data signal into a first portion and a second portion. The instructions include converting the first portion into an output signal, the output signal being one of a plurality of output signals corresponding to one of a plurality of memory locations addressable by the first portion, and dividing the output signal into an approximation signal, a first stage delta signal, and a second stage delta signal. The instructions further instruct the processor unit to calculate, with random logic, an incrementing or decrementing signal based on a logical value of the second portion of the input data signal, the first stage delta signal, and the second stage delta signal, and increment or decrement the approximation signal with the incrementing or decrementing signal, respectively.