1. Field of the Invention
The invention relates to computers and more specifically to the processing of multimedia data parameters.
2. Description of Related Art
Computer multimedia applications typically involve the processing of high volumes of small data values representing audio signals and video images. Processing the data often includes transform coding, a method of converting the data values into a series of transform coefficients for more efficient transmission, computation, encoding, compression, or other processing.
More specifically, the data values often represent a signal as a function, for example, of time. The transform coefficients represent the same signal as a function, for example, of frequency. There are numerous transform algorithms, including, for example, the fast Fourier transform (FFT), the discrete cosine transform (DCT), and the Z transform. Corresponding inverse transform algorithms, such as an inverse DCT, convert transform coefficients to sample data values. Many of these algorithms include multiple mathematical steps. A DCT, for example, includes a step identical to a two-dimensional rotation of the data values.
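The relationship between sample values and transform coefficients can be sketched with a naive DCT and its inverse. This is only an illustration: the function names `dct2` and `idct2` and the orthonormal scaling are assumptions chosen for the example, not details taken from this description, and practical implementations use fast factorizations rather than this direct summation.

```python
import math

def dct2(samples):
    """Naive orthonormal DCT-II: sample values -> transform coefficients."""
    n = len(samples)
    coeffs = []
    for k in range(n):
        total = sum(samples[i] * math.cos(math.pi * (i + 0.5) * k / n)
                    for i in range(n))
        # Orthonormal scaling (an assumption of this sketch).
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        coeffs.append(scale * total)
    return coeffs

def idct2(coeffs):
    """Inverse of dct2: transform coefficients -> sample values."""
    n = len(coeffs)
    samples = []
    for i in range(n):
        total = coeffs[0] * math.sqrt(1.0 / n)
        for k in range(1, n):
            total += (coeffs[k] * math.sqrt(2.0 / n)
                      * math.cos(math.pi * (i + 0.5) * k / n))
        samples.append(total)
    return samples
```

Applying `idct2(dct2(samples))` should return the original samples up to floating-point rounding, and a constant input produces a single nonzero coefficient at index 0.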
The rotation treats two separate data values as the X and Y positions, respectively, of a point on a two-dimensional graph. The two-dimensional rotation is performed by modifying the values to represent new positions on the X and Y axes.
The equations typically used by the DCT to rotate the positions are shown in Table 1 below. In the equations, the coefficient C represents a scaling factor that determines the distance of the movement.
TABLE 1
$$X' = C \cdot ((\cos\theta \cdot X) \mp (\sin\theta \cdot Y))$$
$$Y' = C \cdot ((\sin\theta \cdot X) \mp (\cos\theta \cdot Y))$$
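As a worked illustration of the Table 1 equations, the sketch below applies one particular sign choice: minus in the X' equation and plus in the Y' equation, which gives a counterclockwise rotation. That sign choice, the function name `rotate`, and the default C of 1.0 are assumptions made for the example only.

```python
import math

def rotate(x, y, theta, c=1.0):
    """Rotate the point (x, y) by theta radians and scale the result by c.

    Assumed sign choice: X' uses minus, Y' uses plus.
    """
    cos_t = math.cos(theta)
    sin_t = math.sin(theta)
    x_new = c * ((cos_t * x) - (sin_t * y))
    y_new = c * ((sin_t * x) + (cos_t * y))
    return x_new, y_new
```

For instance, `rotate(1.0, 0.0, math.pi / 2)` moves a point on the X axis onto the Y axis, and passing `c=2.0` doubles its distance from the origin.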
A prior method of implementing the above equations on conventional computers is slow and inefficient: it uses at least three instructions to generate one resulting value representing a modified X or Y component. For example, one or more instructions generate the product of a first data value and a cosine or sine value, one or more further instructions generate a second product in the same way, and a third instruction generates the sum or difference of the first and second products.
However, more recent computers are able to process the small data values more efficiently. More specifically, multiple data elements are joined together as packed data sequences. The packed data sequences enable the transfer of up to sixty-four bits of integer data. To take advantage of the packed data sequences, a multimedia (MM) register file is provided in addition to the conventional thirty-two bit integer register file. The MM register file typically has extended registers providing storage for sixty-four data bits.
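The idea of joining several small values into one sixty-four bit quantity can be sketched as follows. The helper names `pack_words` and `unpack_words`, and the convention that element 0 occupies the lowest sixteen bits, are assumptions made for illustration; the description above does not fix an element order.

```python
def pack_words(words):
    """Pack four signed 16-bit values into one 64-bit integer.

    Element 0 is placed in the lowest sixteen bits (assumed ordering).
    """
    if len(words) != 4:
        raise ValueError("expected exactly four elements")
    packed = 0
    for index, word in enumerate(words):
        if not -32768 <= word <= 32767:
            raise ValueError("element does not fit in sixteen bits")
        # Mask to the two's-complement bit pattern, then shift into place.
        packed |= (word & 0xFFFF) << (16 * index)
    return packed

def unpack_words(packed):
    """Split a 64-bit integer back into four signed 16-bit values."""
    words = []
    for index in range(4):
        raw = (packed >> (16 * index)) & 0xFFFF
        # Reinterpret the sixteen-bit pattern as a signed value.
        words.append(raw - 0x10000 if raw & 0x8000 else raw)
    return words
```

Packing and then unpacking returns the original four values, which is the property a packed register relies on when one instruction operates on all elements at once.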
With the availability of the packed data and the MM registers, a second method is available to generate the two-dimensional rotation of the data values. The second method, however, uses additional instructions per final result in order to generate a more accurate result. For example, as previously stated, the DCT is typically used to encode data values that may represent pixels of a video image. Thereafter, the encoded data is processed to regenerate the original image. However, when regenerating an image that is not frequently refreshed (i.e., stationary images), there is a higher demand placed on the accuracy of regenerating the image. In such cases, cumulative errors are very noticeable. On the other hand, when regenerating an image for motion video, such as MPEG, the referenced image is typically refreshed more frequently. Therefore, the accumulated errors in the encoded pixel data are not as noticeable, and as a result, there is not as great a need for accuracy, thereby allowing more effort to be placed on increasing the speed of processing the data.
Even though the second method was developed to provide greater accuracy, it is nevertheless discussed below to provide a background for further illustrating the need for a faster method of generating the two dimensional rotation of packed data, and therefore the advantages and novelty of the method of the present invention.
As illustrated in Table 2 below, the second method, using packed data, typically involves four to five instructions to generate two resulting values. More specifically, a first instruction assumes two adjacent data values in memory represent X and Y components, respectively. The two data values are loaded into a first register in the non-planar format as packed words, each filling one of the four sixteen bit elements available in the MM register (Table 2a). A second instruction copies the data to take advantage of the unused data space in the register (Table 2b). A third instruction performs two micro-operations. In the third instruction, a second packed data is provided as a memory operand containing either sine or cosine values (Table 2c). The elements of the first packed data are multiplied with the corresponding elements of the packed data memory operand, thereby providing a set of intermediate results in packed data (Table 2d). Next, adjacent elements in the packed set of intermediate results are added (Table 2e). As a result, two resultant values are provided in thirty-two bit formats, which is a larger format than desired for subsequent processing steps in typical transform algorithms. Thus, a fourth instruction is used to perform a right addition shift to truncate the resulting values to sixteen bit values (Table 2e). Next, a fifth instruction can be used to copy the data values into adjacent positions so that they can be stored back in memory as the modified X and Y components representing the two-dimensional rotation (Table 2f).
TABLE 2 (drawing not reproduced; panels 2a through 2f illustrate the steps described above)
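The sequence of steps in the second method can be approximated in software as shown below. This is a rough emulation under several assumptions that do not come from this description: coefficients are stored as fixed-point integers with a hypothetical `FRAC_BITS` of 14, the coefficient operand is arranged as (cos, -sin, sin, cos), the scaling factor C is taken as 1, and Python lists stand in for the register elements.

```python
import math

FRAC_BITS = 14  # hypothetical fixed-point scale, not specified in the source

def to_fixed(value):
    """Convert a value in [-1, 1] to a fixed-point integer that fits in 16 bits."""
    return int(round(value * (1 << FRAC_BITS)))

def rotate_packed(x, y, theta):
    """Emulate the four-to-five instruction packed rotation for one (x, y) pair."""
    # 2a: X and Y occupy two of the four sixteen-bit elements.
    reg = [x, y, 0, 0]
    # 2b: copy the pair into the unused elements.
    reg = [reg[0], reg[1], reg[0], reg[1]]
    # 2c: packed coefficient operand (assumed arrangement).
    coeffs = [to_fixed(math.cos(theta)), to_fixed(-math.sin(theta)),
              to_fixed(math.sin(theta)), to_fixed(math.cos(theta))]
    # 2d: multiply corresponding elements.
    products = [a * b for a, b in zip(reg, coeffs)]
    # 2e: add adjacent products, giving two wider (thirty-two bit) sums.
    sums = [products[0] + products[1], products[2] + products[3]]
    # Shift right to bring each sum back into a sixteen-bit range.
    shifted = [s >> FRAC_BITS for s in sums]
    # 2f: the two results sit side by side as the new X and Y.
    return shifted[0], shifted[1]
```

Calling `rotate_packed(100, 0, math.pi / 2)` walks one pair through every step and yields only two results, which is the inefficiency the surrounding text describes.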
The second method, however, inefficiently includes instructions that duplicate the data, truncate and normalize the resultant data into the desired lengths, and copy the data values back into the original adjacent positions. These extra instructions are costly and time consuming, and they impede the optimum processing speed of the two-dimensional rotation: four to five instructions are used to generate only two results.
In cases where accuracy is not the highest concern, what is needed is a faster and more efficient method that is able to generate a greater number of resulting values representing the two-dimensional rotation of the data values through the use of fewer instructions.