1. Field of the Invention
The present invention relates to a stored-program processor, and more particularly to a configuration well suited to processing multimedia data such as moving picture data.
2. Description of the Related Art
One known system for compressing moving picture data is ISO/MPEG. To compensate for motion, this system searches the plurality of frames constituting a moving picture for portions whose images closely approximate one another. The moving picture data is compressed by encoding the change in position of such a portion as a motion vector indicating the motion of the moving picture.
This search is executed by finding the portion of the search window in the reference frame 81 shown in FIG. 10 that most closely approximates the image in a region of 16 pixels by 16 pixels, called the current macroblock, in the current frame 80 shown in FIG. 10.
To evaluate how closely two images approximate each other, the evaluation formula given as Formula 1 is widely used. ##EQU1##
When this evaluation formula is used, the aforementioned search is executed by searching for the combination of (u, v) at which the value of the evaluation formula is minimized.
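The search described above can be sketched as follows. Formula 1 is not reproduced in this text, so the sketch assumes it is the sum of absolute differences commonly used for block matching. The function names `sad` and `full_search`, the list-of-rows image layout, and the `span` parameter are illustrative assumptions, not part of the described processor.

```python
def sad(cur, ref, u, v, n=16):
    """Sum of absolute differences between the n x n block `cur` and the
    n x n region of `ref` whose top-left corner is at column u, row v.
    ASSUMPTION: this stands in for Formula 1, which is not shown here."""
    return sum(
        abs(cur[j][i] - ref[j + v][i + u])
        for j in range(n)
        for i in range(n)
    )


def full_search(cur, ref, n=16, span=16):
    """Try every (u, v) with 0 <= u, v < span and return ((u, v), cost)
    for the first candidate with the smallest cost."""
    best = None
    for v in range(span):
        for u in range(span):
            cost = sad(cur, ref, u, v, n)
            if best is None or cost < best[1]:
                best = ((u, v), cost)
    return best
```

Here `ref` must have at least n + span - 1 rows and columns so that every candidate region lies inside it.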
One conventional processor that calculates the evaluation formula of Formula 1 at high speed is the Ultra SPARC of SUN Microsystems, described on page 16 of "MICROPROCESSOR REPORT, DEC. 5, 1994". ##EQU2##
An outline of the constitution of this processor is shown in FIG. 11.
As shown in the drawing, this processor comprises a load store processor 9130, a pixel calculation processor 9133 for executing the calculation shown in Formula 2, a plurality of calculation processors 9131 and 9132 for executing the other calculations, a register file including a plurality of 64-bit-wide registers, instruction registers 30 to 33 provided in correspondence with the respective processors, an instruction supplying unit 912 for supplying instructions to the instruction registers 30 to 33, and a system bus interface for controlling input and output with the system bus, to which the main storage storing an instruction string is connected.
The pixel calculation processor 9133 handles data read from the register file as a set of eight 8-bit data items, as shown in FIG. 12. The pixel calculation processor 9133 executes the calculation shown in Formula 2 on two such sets of eight 8-bit data items read from the register file.
In this processor, the process of finding the combination of (u, v) that minimizes Formula 1 is realized according to the procedure shown in FIG. 13, using a calculation instruction that causes the pixel calculation processor 9133 to execute the calculation of Formula 2.
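The handling of a 64-bit register as eight 8-bit data items can be illustrated with a minimal sketch. Formula 2 is not reproduced in this text, so the sketch assumes it is the sum of the absolute differences of the eight corresponding 8-bit items. The function name `packed_abs_diff_sum` is hypothetical and does not correspond to any actual instruction mnemonic.

```python
def packed_abs_diff_sum(a, b):
    """Treat two 64-bit integers as eight unsigned 8-bit lanes each and
    return the sum of the lane-wise absolute differences.
    ASSUMPTION: this stands in for Formula 2, which is not shown here."""
    total = 0
    for lane in range(8):
        x = (a >> (8 * lane)) & 0xFF  # extract lane `lane` of operand a
        y = (b >> (8 * lane)) & 0xFF  # extract lane `lane` of operand b
        total += abs(x - y)
    return total
```

The hardware would evaluate all eight lanes at once; the loop here only models the arithmetic result.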
Namely, Step 404 shown in FIG. 13 is executed repeatedly for each v from 0 to 15 by the functions of Steps 402, 409, and 410. Step 407 is also executed for each v from 0 to 15 and, for each v value, is further executed repeatedly for each j value from 1 to 15 by the functions of Steps 403, 405, and 406.
Expressed in terms of r (j, u, v) defined by Formula 3, Steps 404 and 407 shown in FIG. 13 obtain r (j, u, v), during the aforementioned repetitive process, for every combination (j, u, v) in which j, u, and v each range from 0 to 15. At Step 404, r (j, u, v) is obtained for j=0, and at Step 407, r (j, u, v) is obtained for j from 1 to 15. ##EQU3##
At these steps, the sum of the values of r (j, u, v) obtained for each j from 0 to 15 for the same combination of (u, v) is obtained as Ruv. This is realized as follows: with v fixed, r (0, u, v) is obtained at Step 404 for each u from 0 to 15, and r (j, u, v) is obtained at Step 407 for each j from 1 to 15 and each u from 0 to 15. Each value is added to a parameter Ruv provided for each combination of u and v, and the process is executed for each v from 0 to 15.
An Ruv is retained only when it is smaller than the Ruv obtained previously (Step 408), and the combination of (u, v) corresponding to the Ruv retained last is taken as the (u, v) that minimizes Formula 1. In this case, to calculate r (j, u, v) for a specific combination of (j, u, v), it is necessary to execute the calculation indicated by Formula 2 twice, once for i from 0 to 7 and once for i from 8 to 15. With j, u, and v each ranging from 0 to 15, this amounts to 16 x 16 x 16 x 2 = 8,192 executions per current macroblock. As a result, this processor must execute the calculation indicated by Formula 2, the data reading that precedes it, and the generation of the data used for the calculation an enormous number of times.
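The cost of this procedure can be made concrete with a sketch that counts the 8-pixel operations. It assumes, since Formulas 2 and 3 are not reproduced in this text, that Formula 2 is an 8-item sum of absolute differences and that r (j, u, v) is the corresponding sum over the 16 pixels of row j. The loop order is simplified relative to FIG. 13, and all names are hypothetical.

```python
def abs_diff_sum8(a, b):
    """Stand-in for the 8-pixel calculation: sum of |a[k] - b[k]|, k = 0..7.
    ASSUMPTION: models Formula 2, which is not shown here."""
    assert len(a) == 8 and len(b) == 8
    return sum(abs(x - y) for x, y in zip(a, b))


def search_with_8wide_op(cur, ref):
    """Full search over 0 <= u, v <= 15 for a 16 x 16 block `cur`.
    Returns ((u, v), R_uv, op_count) for the smallest R_uv found."""
    ops = 0
    best_uv, best_r = None, None
    for v in range(16):
        for u in range(16):
            r_uv = 0
            for j in range(16):
                row_c = cur[j]
                row_r = ref[j + v][u:u + 16]
                # One row value r(j, u, v) needs two 8-pixel operations:
                # one for i = 0..7 and one for i = 8..15.
                r_uv += abs_diff_sum8(row_c[0:8], row_r[0:8])
                r_uv += abs_diff_sum8(row_c[8:16], row_r[8:16])
                ops += 2
            # Keep only an R_uv smaller than the one kept so far.
            if best_r is None or r_uv < best_r:
                best_uv, best_r = (u, v), r_uv
    return best_uv, best_r, ops
```

The returned operation count depends only on the loop bounds, not on the image data, which is why narrowing the search range is the only way to reduce it without changing the hardware.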
Needless to say, narrowing the aforementioned search window decreases the number of calculations and speeds up the processing. However, doing so lowers the compression efficiency and degrades the quality of the moving pictures.
One conceivable approach is therefore to speed up the processing by extending the calculation indicated by Formula 1 so that it can be executed by a single instruction.
However, for that purpose, it is necessary to increase the input bit width of the pixel calculation processor 9133 and also to increase the bit width of the register file so that a large amount of pixel data can be handled at the same time. Doing so increases the scale of the register file. Furthermore, the calculation processors 9131 and 9132, other than the pixel calculation processor 9133, do not require data of such a bit width, so this cannot be regarded as an efficient method as a whole.
Another conceivable approach is to speed up the processing by increasing the number of data items that can be read from the register file and causing a plurality of pixel calculation processors to execute the calculation of Formula 2 in parallel. However, even in such a case, the scale of the hardware increases remarkably and the cost of the hardware rises sharply.