The invention relates to a motion picture coder and a system for controlling the same, and more particularly to an architecture of a motion picture coder for coding motion picture data based on a predetermined standardizing system. A motion picture coding system being capable of compressing a great amount of the motion picture data is desired to realize video signal processing systems such as video tele-conferencing systems and video telephones. The coding of the motion picture should be implemented according to the standardizing system recommended as the CCITT recommendation H.261.
In such a standardizing system, a single picture frame of motion picture data is divided into a plurality of macro blocks. Each of the macro blocks includes 256 pixels of luminance signals and 128 pixels of chrominance signals so as to serve as image data.
Further, the implementation of the standardizing system of the CCITT recommendation H.261 requires each of the macro blocks to be subjected to the following eight processes. First, a motion vector detection for each divided macro block is implemented. Second, a loop filtering for each divided macro block is implemented. The third and fourth processes are an inter-frame difference for each divided macro block and a discrete cosine transform (DCT) of each divided macro block respectively. A quantization for each divided macro block is accomplished as a fifth process, after which an inverse quantization for each divided macro block is accomplished as a sixth process. Subsequently, an inverse discrete cosine transform (inverse DCT) for each divided macro block is accomplished as a seventh process. Finally, an inter-frame addition for each divided macro block is accomplished as an eighth process. Thus, the implementation of such a standardizing system requires the above complicated processes.
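The pixel counts above can be checked with a few lines of arithmetic. The sketch below assumes the usual H.261 macro block layout of four 8x8 luminance blocks and two 8x8 chrominance blocks; that layout is not stated in the text above, and all names are illustrative.

```python
# Assumed layout (not stated in the text): 8x8 blocks, four luminance
# blocks and two chrominance blocks per macro block.
BLOCK_SIDE = 8
PIXELS_PER_BLOCK = BLOCK_SIDE * BLOCK_SIDE

LUMA_BLOCKS = 4    # together they cover a 16x16 luminance area
CHROMA_BLOCKS = 2  # one block per colour-difference component

luma_pixels = LUMA_BLOCKS * PIXELS_PER_BLOCK
chroma_pixels = CHROMA_BLOCKS * PIXELS_PER_BLOCK
macro_block_pixels = luma_pixels + chroma_pixels

print(luma_pixels, chroma_pixels, macro_block_pixels)
```

Under this assumption one macro block carries 384 pixels in total.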
Generally, a motion picture coder must handle a great quantity of motion picture data to be processed by a video signal processor. Thus, the implementation of the motion picture coding requires the motion picture coder to perform an extensive amount of arithmetic to accomplish the coding operation based on the above standardizing system. It is important that the motion picture coder as a processor is able to perform a large amount of arithmetic within a predetermined time. To realize this, it is required to keep each part or each unit of the motion picture coder as a processor from taking on an idle state. If any part or any unit of the motion picture coder takes on an idle state, the motion picture coder is no longer able to sufficiently exhibit its potential ability. In sequential architectures, a major part of a processor tends to take on an idle state at each arithmetic step. One of the architectures for improving the processing speed of a single processor is the multi-stage pipe-lined architecture. The pipe-line system divides a great amount of arithmetic into a plurality of small and independent arithmetic blocks so that a plurality of the divided arithmetic blocks are concurrently operated on at the respective stages of the pipe-lined processor. Such a pipe-lined system is applicable to the above motion picture coding. An example of the conventional motion picture coders for coding the motion picture data according to the above standardizing system is disclosed in IEEE Journal of Solid-State Circuits, Vol. 24, No. 6, December 1989, pp. 1662-1667. The conventional motion picture coder utilizes the pipe-line system to improve the processing speed.
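The benefit of pipe-lining described above can be illustrated with a simple step count. The following is a minimal sketch, assuming an idealized pipe-line in which every stage takes exactly one step and no stage ever stalls; the function names are hypothetical.

```python
def sequential_steps(n_items, n_stages):
    """Steps needed when each item passes all stages before the next starts."""
    return n_items * n_stages


def pipelined_steps(n_items, n_stages):
    """Steps needed when a new item enters the first stage on every step."""
    if n_items == 0:
        return 0
    # The first item needs n_stages steps; each later item adds one step.
    return n_stages + (n_items - 1)


print(sequential_steps(100, 3), pipelined_steps(100, 3))
```

For 100 data items on a three-stage pipe-line, the idealized count falls from 300 steps to 102 steps, provided that every stage has useful work on every step.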
The structure of the conventional motion picture coder will be described with reference to FIG. 1. FIG. 1 omits arithmetic units such as a shifter which do not pertain to the above coding processes according to the above standardizing system. The structure of the conventional motion picture coder is so pipe-lined as to have three pipe-lined stages. Namely, the conventional motion picture coder comprises first, second and third pipe-lined stages. The conventional motion picture coder comprises a plurality of arithmetic units which accomplish the above eight processes for coding the motion picture data based upon the above standardizing system.
The first pipe-lined stage of the motion picture coder includes an arithmetic logic unit 410 and a multiplier 420. The arithmetic logic unit 410 is connected at its input side to data input terminals 401 and 402 respectively. Input data such as motion picture data are transmitted to the arithmetic logic unit 410 through the data input terminals 401 and 402. The arithmetic logic unit 410 executes the addition between two input data inputted through the data input terminals 401 and 402. The arithmetic logic unit 410 also executes a subtraction between the two input data inputted through the data input terminals 401 and 402. The arithmetic logic unit 410 also executes an absolute value subtraction and a variety of logical operations of the two input data. The arithmetic logic unit 410 is connected at its output side to the input side of a selector 460. The result of the logic arithmetic provided by the arithmetic logic unit 410 is transmitted to the selector 460.
The multiplier 420 is connected at its input side, in parallel to the arithmetic logic unit 410, to the data input terminals 401 and 402 through selectors 450-1 and 450-2. The selectors 450-1 and 450-2 are connected at their input sides to the data input terminals 401 and 402 respectively. Each of the selectors 450-1 and 450-2 is also connected at its input side to the output side of the arithmetic logic unit 410. The selectors 450-1 and 450-2 are further connected at their output sides to the input side of the multiplier 420. The multiplier 420 is connected at its output side to the input side of the selector 460. Input data such as the motion picture data are also transmitted from the data input terminals 401 and 402 to the selectors 450-1 and 450-2 respectively. The result of the logic arithmetic of the input data by the arithmetic logic unit 410 is also transmitted to each of the selectors 450-1 and 450-2. The selector 450-1 selects either the input data transmitted from the data input terminal 401 or the result of the arithmetic of the input data transmitted from the arithmetic logic unit 410. The selector 450-1 transmits the selected data to the multiplier 420. Similarly, the selector 450-2 selects either the input data transmitted from the data input terminal 402 or the result of the arithmetic of the input data transmitted from the arithmetic logic unit 410. The selector 450-2 transmits the selected data to the multiplier 420. The multiplier 420 executes the multiplication between the selected data transmitted from the selectors 450-1 and 450-2 respectively. The multiplier 420 transmits the result of the multiplication of the data to the selector 460. The selector 460 selects either the result of the logic arithmetic of the data transmitted from the arithmetic logic unit 410 or the result of the multiplication of the data transmitted from the multiplier 420.
The second pipe-lined stage of the motion picture coder includes an accumulator 430. The third pipe-lined stage of the motion picture coder includes a maximum and minimum value detector 440. The accumulator 430 existing on the second pipe-lined stage is connected at its input side to the output side of the selector 460. The accumulator 430 is connected at its output side to the input side of the maximum and minimum value detector 440 existing on the third pipe-lined stage of the motion picture coder. The output side of the accumulator 430 is further connected to the input side thereof, and thus both the input and output sides of the accumulator 430 are looped. The accumulator 430 is further connected at its output side to the input side of a selector 470. The selector 460 is also connected at its output side to the input side of the selector 470. The selector 460 transmits the selected data to the accumulator 430, and thus either the result of the logic arithmetic provided by the arithmetic logic unit 410 or the result of the multiplication provided by the multiplier 420 is selected by the selector 460 and transmitted to the accumulator 430. The previous result of the accumulation provided by the accumulator 430 is transmitted from the output side of the accumulator 430 to the input side of the accumulator 430. The accumulator 430 executes the accumulation of the selected data transmitted from the selector 460 and the previous result of the accumulation provided by the accumulator 430. The accumulator 430 transmits the result of the accumulation to the maximum and minimum value detector 440 existing on the third pipe-lined stage of the motion picture coder.
The maximum and minimum value detector 440 detects the maximum and minimum values among the results of the accumulation transmitted from the accumulator 430. The maximum and minimum value detector 440 outputs the result of the detection for the maximum and minimum values and then transmits it to the selector 470.
The selector 470 fetches the result of the selection by the selector 460, and thus either the result of the logic arithmetic provided by the arithmetic logic unit 410 or the result of the multiplication provided by the multiplier 420. The selector 470 also fetches the result of the accumulation of the data from the accumulator 430. The selector 470 also fetches the result of the detection for the maximum and minimum values of the data from the maximum and minimum value detector 440. The selector 470 selects any one of the above fetched data. The selector 470 is further connected at its output side to an arithmetic result output terminal 403. The selector 470 transmits the selected data to the arithmetic result output terminal 403. Thus, any one of the data fetched by the selector 470 is outputted as the arithmetic result of the motion picture coder through the arithmetic result output terminal 403. Namely, the arithmetic result of the motion picture coder is any one of the following three results: the result of the maximum and minimum values detection provided by the maximum and minimum value detector 440; the result of the accumulation provided by the accumulator 430; and either the result of the logic arithmetic provided by the arithmetic logic unit 410 or the result of the multiplication provided by the multiplier 420.
The operation of the conventional motion picture coder will subsequently be described in detail with reference to FIGS. 2 and 3. FIG. 2 indicates each of the processes of the motion picture coding based on the above predetermined standardizing system. FIG. 2 also indicates the arithmetic units to be used in each process of the motion picture coding based on the above predetermined standardizing system.
As described above, the motion picture coding according to the above predetermined standardizing system requires the following eight processes. Namely, the first process is the motion vector detection. The second process is the loop filtering. The third process is the inter-frame difference. The fourth process is the discrete cosine transform (DCT). The fifth process is the quantization. The sixth process is the inverse quantization. The seventh process is the inverse discrete cosine transform (inverse DCT). The final process is the inter-frame addition.
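The signal routing of FIG. 1 described above can be summarized as a small behavioural model. The sketch below is an illustration only: the class, method and parameter names are hypothetical, pipe-line latency is ignored (all three stages are evaluated within one call), and the accumulator is assumed to be updated only when its output, or the detector output, is selected.

```python
ALU_OPS = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "abs_sub": lambda a, b: abs(a - b),
}


class ConventionalDatapath:
    """Behavioural model of the datapath of FIG. 1 (no timing)."""

    def __init__(self):
        self.acc = 0           # accumulator 430
        self.min_value = None  # state of detector 440
        self.max_value = None

    def step(self, in401, in402, alu_op="add",
             sel450=("in", "in"), sel460="alu", sel470="stage1"):
        # First stage: ALU 410 and multiplier 420 sit side by side.
        alu_out = ALU_OPS[alu_op](in401, in402)
        mul_a = in401 if sel450[0] == "in" else alu_out  # selector 450-1
        mul_b = in402 if sel450[1] == "in" else alu_out  # selector 450-2
        mul_out = mul_a * mul_b
        stage1 = alu_out if sel460 == "alu" else mul_out  # selector 460
        if sel470 == "stage1":
            return stage1
        # Second stage: accumulator 430 adds onto its previous result.
        self.acc += stage1
        if sel470 == "acc":
            return self.acc
        # Third stage: detector 440 tracks the extreme accumulation results.
        if self.min_value is None or self.acc < self.min_value:
            self.min_value = self.acc
        if self.max_value is None or self.acc > self.max_value:
            self.max_value = self.acc
        return (self.min_value, self.max_value)


dp = ConventionalDatapath()
print(dp.step(7, 2, alu_op="sub"))                    # ALU result only
print(dp.step(3, 4, sel460="mul", sel470="acc"))      # multiply-accumulate
```

Whichever setting the selector 460 takes, only one of the two first-stage results is passed on, which is the property the later discussion of idle units relies on.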
Subsequently, the above processes of the motion picture coding based on the predetermined standardizing system will be described in detail with reference to FIGS. 1 and 2. The motion vector detection as a first process of the motion picture coding is performed by the following steps. Image data are inputted through the data input terminals 401 and 402 into the arithmetic logic unit 410 existing on the first pipe-lined stage. The arithmetic logic unit 410 executes the absolute value subtraction between the image data inputted from the data input terminals 401 and 402 respectively. After that, the arithmetic logic unit 410 transmits the result of the absolute value subtraction between the input image data to the selector 460. The selector 460 selects the result of the absolute value subtraction between the input image data. Then, the selector 460 transmits the result of the absolute value subtraction to the accumulator 430 existing on the second pipe-lined stage. The accumulator 430 accumulates the result of the absolute value subtraction on the previous result of the accumulation executed by itself. The accumulator 430 subsequently transmits the result of the accumulation to the maximum and minimum value detector 440 which exists on the third pipe-lined stage. The maximum and minimum value detector 440 fetches the result of the accumulation provided by the accumulator 430 and executes the minimum value detection, and thus detects what has a minimum value from the fetched result of the accumulation. The maximum and minimum value detector 440 transmits the result of the maximum and minimum value detection to the selector 470. The selector 470 selects the result of the maximum and minimum value detection transmitted from the maximum and minimum value detector 440, and then transmits the result of the maximum and minimum value detection to the arithmetic result output terminal 403. 
Namely, the result of the maximum and minimum value detection is outputted through the arithmetic result output terminal 403 as an arithmetic result of the process of the motion vector detection. Therefore, the accomplishment of the motion vector detection process requires the above three units to be operated, and thus the operations of the arithmetic logic unit 410, the accumulator 430 and the maximum and minimum value detector 440.
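The three-unit flow above (absolute value subtraction, accumulation, minimum value detection) corresponds to a search for the candidate block with the smallest sum of absolute differences. The following is a minimal sketch; the function names, the flat pixel lists and the dictionary of candidates keyed by displacement are illustrative assumptions, not details taken from the coder described here.

```python
def sum_of_absolute_differences(block, candidate):
    """ALU 410 (absolute value subtraction) feeding accumulator 430."""
    acc = 0
    for a, b in zip(block, candidate):
        acc += abs(a - b)
    return acc


def detect_motion_vector(block, candidates):
    """Role of detector 440: keep the displacement with the minimum sum."""
    best_vector, best_sum = None, None
    for vector, candidate in candidates.items():
        total = sum_of_absolute_differences(block, candidate)
        if best_sum is None or total < best_sum:
            best_vector, best_sum = vector, total
    return best_vector, best_sum


current = [1, 2, 3, 4]
search_area = {
    (0, 0): [1, 2, 3, 5],
    (1, 0): [1, 2, 3, 4],
    (0, 1): [0, 0, 0, 0],
}
print(detect_motion_vector(current, search_area))
```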
The loop filtering as a second process of the motion picture coding is performed by the following steps. Image data is transmitted through the data input terminal 401 to the selector 450-1. Filter coefficient data is transmitted through the data input terminal 402 to the selector 450-2. The selector 450-1 selects the image data inputted through the data input terminal 401. The selector 450-1 transmits the image data to the multiplier 420 which exists on the first pipe-lined stage. The selector 450-2 also selects the filter coefficient data transmitted through the data input terminal 402. The selector 450-2 transmits the filter coefficient data to the multiplier 420. The multiplier 420 executes the multiplication of the image data and the filter coefficient data. The multiplier 420 transmits the result of the multiplication of the image data and the filter coefficient data to the selector 460. The selector 460 selects the result of the multiplication of the image data and the filter coefficient data and transmits it to the accumulator 430 which exists on the second pipe-lined stage. The accumulator 430 accumulates the result of the multiplication of the image data and the filter coefficient data on the previous result of the accumulation executed by itself. The accumulator 430 transmits the result of the accumulation to the selector 470. The selector 470 selects the result of the accumulation executed by the accumulator 430 and transmits it to the arithmetic result output terminal 403. Namely, the result of the accumulation is outputted through the arithmetic result output terminal 403 as an arithmetic result of the process of the loop filtering, thereby permitting the digital filtering to be realized. Therefore, the accomplishment of the loop filtering process requires the above two units to be operated, and thus the operations of the multiplier 420 and the accumulator 430.
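The multiply-accumulate flow of the loop filtering can be sketched as a one-dimensional digital filter. The code below is an illustration under stated assumptions: the three-tap coefficients (1, 2, 1) with a right shift by two are chosen as an example, edge pixels are passed through unfiltered, and the result is truncated rather than rounded. None of these details is taken from the text above, and the function names are hypothetical.

```python
def multiply_accumulate(samples, coefficients):
    """Multiplier 420 feeding accumulator 430 for one output pixel."""
    acc = 0
    for sample, coefficient in zip(samples, coefficients):
        acc += sample * coefficient
    return acc


def filter_row(row, coefficients=(1, 2, 1), shift=2):
    """Filter one row of pixels; the two edge pixels are copied as they are."""
    out = [row[0]]
    for i in range(1, len(row) - 1):
        window = row[i - 1:i + 2]
        out.append(multiply_accumulate(window, coefficients) >> shift)
    out.append(row[-1])
    return out


print(filter_row([4, 8, 4, 0]))
```

A two-dimensional filter could be obtained by applying the same routine to the rows and then to the columns of a block.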
The inter-frame difference as a third process of the motion picture coding is performed by the following steps. Image data are inputted through the data input terminals 401 and 402 to the arithmetic logic unit 410 existing on the first pipe-lined stage. The arithmetic logic unit 410 executes the subtraction between the image data, both of which are transmitted through the data input terminals 401 and 402. The arithmetic logic unit 410 transmits the result of the subtraction between the image data to the selector 460. The selector 460 selects the result of the subtraction between the image data provided by the arithmetic logic unit 410 and transmits it to the selector 470. The selector 470 selects the result of the subtraction executed by the arithmetic logic unit 410 and transmits it to the arithmetic result output terminal 403. Namely, the result of the subtraction is outputted through the arithmetic result output terminal 403 as an arithmetic result of the process of the inter-frame difference. Therefore, the accomplishment of the inter-frame difference process requires the above single unit to be operated, and thus the subtraction operation of the arithmetic logic unit 410.
The discrete cosine transform (DCT) as a fourth process of the motion picture coding is performed by the following steps. Image data is transmitted through the data input terminal 401 to the selector 450-1. Discrete cosine transform coefficient (DCT coefficient) data is transmitted through the data input terminal 402 to the selector 450-2. The selector 450-1 selects the image data inputted through the data input terminal 401. The selector 450-1 transmits the image data to the multiplier 420 which exists on the first pipe-lined stage. The selector 450-2 also selects the discrete cosine transform coefficient (DCT coefficient) data transmitted through the data input terminal 402. The selector 450-2 transmits the discrete cosine transform coefficient (DCT coefficient) data to the multiplier 420. The multiplier 420 executes the multiplication of the image data and the discrete cosine transform coefficient (DCT coefficient) data. The multiplier 420 transmits the result of the multiplication of the image data and the discrete cosine transform coefficient (DCT coefficient) data to the selector 460. The selector 460 selects the result of the multiplication of the image data and the discrete cosine transform coefficient (DCT coefficient) data and transmits it to the accumulator 430 which exists on the second pipe-lined stage. The accumulator 430 accumulates the result of the multiplication of the image data and the discrete cosine transform coefficient (DCT coefficient) data on the previous result of the accumulation executed by itself. The accumulator 430 transmits the result of the accumulation to the selector 470. The selector 470 selects the result of the accumulation executed by the accumulator 430 and transmits it to the arithmetic result output terminal 403. Namely, the result of the accumulation is outputted through the arithmetic result output terminal 403 as an arithmetic result of the process of the discrete cosine transform.
Therefore, the accomplishment of the discrete cosine transform process requires the above two units to be operated, and thus the operations of the multiplier 420 and the accumulator 430.
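The discrete cosine transform maps naturally onto the multiplier and the accumulator, because each output coefficient is a sum of products of input samples and fixed coefficients. The sketch below computes a one-dimensional 8-point DCT in that multiply-accumulate form. The orthonormal scaling, the floating-point arithmetic and the function names are assumptions made for illustration; the coder described here would use stored coefficient data supplied through the data input terminal 402.

```python
import math

N = 8  # transform length, assuming 8-point rows and columns


def dct_coefficient(k, n):
    """Coefficient multiplying sample n when computing output k."""
    scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
    return scale * math.cos((2 * n + 1) * k * math.pi / (2 * N))


def dct_1d(samples):
    """Each output is one run of N multiply-accumulate operations."""
    outputs = []
    for k in range(N):
        acc = 0.0  # accumulator 430 is cleared for each output
        for n in range(N):
            acc += samples[n] * dct_coefficient(k, n)  # multiplier 420
        outputs.append(acc)
    return outputs


flat = dct_1d([1.0] * N)
print([round(value, 6) for value in flat])
```

A flat input produces a single non-zero output at k = 0, which is the expected behaviour of the transform. The inverse DCT follows the same multiply-accumulate pattern with the roles of k and n exchanged.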
The quantization as a fifth process of the motion picture coding is performed by the following steps. The arithmetic result of the discrete cosine transform (DCT) process is inputted through the data input terminal 401 to the selector 450-1. The selector 450-1 selects the arithmetic result of the discrete cosine transform (DCT) process and transmits it to the multiplier 420 existing on the first pipe-lined stage. The reciprocal of the quantization coefficient is inputted through the data input terminal 402 to the selector 450-2. The selector 450-2 selects the reciprocal of the quantization coefficient and transmits it to the multiplier 420. The multiplier 420 executes the multiplication between the arithmetic result of the discrete cosine transform (DCT) and the reciprocal of the quantization coefficient, both of which are transmitted through the data input terminals 401 and 402 respectively. The multiplier 420 transmits the result of the multiplication between the arithmetic result of the discrete cosine transform (DCT) and the reciprocal of the quantization coefficient to the selector 460. The selector 460 selects the result of the multiplication between the arithmetic result of the discrete cosine transform (DCT) and the reciprocal of the quantization coefficient, and then transmits it to the selector 470. The selector 470 selects the result of the multiplication executed by the multiplier 420 and transmits it to the arithmetic result output terminal 403. Namely, the result of the multiplication is outputted through the arithmetic result output terminal 403 as an arithmetic result of the process of the quantization. Therefore, the accomplishment of the quantization process requires the above single unit to be operated, and thus the multiplication operation of the multiplier 420.
The inverse quantization as a sixth process of the motion picture coding is performed by the following steps. The arithmetic result of the quantization process is transmitted through the data input terminal 401 to the selector 450-1. Quantization coefficient data is transmitted through the data input terminal 402 to the selector 450-2. The selector 450-1 selects the arithmetic result of the quantization inputted through the data input terminal 401. The selector 450-1 transmits the arithmetic result of the quantization to the multiplier 420 which exists on the first pipe-lined stage. The selector 450-2 also selects the quantization coefficient data transmitted through the data input terminal 402. The selector 450-2 transmits the quantization coefficient data to the multiplier 420. The multiplier 420 executes the multiplication of the arithmetic result of the quantization and the quantization coefficient data. The multiplier 420 transmits the result of the multiplication of the arithmetic result of the quantization and the quantization coefficient data to the selector 460. The selector 460 selects the result of the multiplication of the arithmetic result of the quantization and the quantization coefficient data and transmits it to the accumulator 430 which exists on the second pipe-lined stage. The accumulator 430 accumulates the result of the multiplication of the arithmetic result of the quantization and the quantization coefficient data on the previous result of the accumulation executed by itself. The accumulator 430 transmits the result of the accumulation to the selector 470. The selector 470 selects the result of the accumulation executed by the accumulator 430 and transmits it to the arithmetic result output terminal 403. Namely, the result of the accumulation is outputted through the arithmetic result output terminal 403 as an arithmetic result of the process of the inverse quantization. 
Therefore, the accomplishment of the inverse quantization process requires the above two units to be operated, and thus the operations of the multiplier 420 and the accumulator 430.
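The quantization and inverse quantization described above can be sketched as a multiplication by the reciprocal of the quantization coefficient followed, on the return path, by a multiplication by the coefficient itself. This is a simplified uniform quantizer written for illustration; it truncates toward zero and does not reproduce the exact reconstruction rule of the recommendation, and the function names are hypothetical.

```python
def quantize(dct_results, quantization_coefficient):
    """Multiply by the reciprocal, as done with multiplier 420."""
    reciprocal = 1.0 / quantization_coefficient
    # int() truncates toward zero; a real coder may round differently.
    return [int(value * reciprocal) for value in dct_results]


def inverse_quantize(levels, quantization_coefficient):
    """Multiply each quantized level by the quantization coefficient."""
    return [level * quantization_coefficient for level in levels]


levels = quantize([100, -37, 5], 8)
restored = inverse_quantize(levels, 8)
print(levels, restored)
```

The restored values differ from the inputs by less than one quantization step, which is the information discarded by the quantization.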
The inverse discrete cosine transform (inverse DCT) as a seventh process of the motion picture coding is performed by the following steps. The arithmetic result of the inverse quantization process is transmitted through the data input terminal 401 to the selector 450-1. Inverse discrete cosine transform (inverse DCT) coefficient data is transmitted through the data input terminal 402 to the selector 450-2. The selector 450-1 selects the arithmetic result of the inverse quantization inputted through the data input terminal 401. The selector 450-1 transmits the arithmetic result of the inverse quantization to the multiplier 420 which exists on the first pipe-lined stage. The selector 450-2 also selects the inverse discrete cosine transform (inverse DCT) coefficient data transmitted through the data input terminal 402. The selector 450-2 transmits the inverse discrete cosine transform (inverse DCT) coefficient data to the multiplier 420. The multiplier 420 executes the multiplication of the arithmetic result of the inverse quantization and the inverse discrete cosine transform (inverse DCT) coefficient data. The multiplier 420 transmits the result of the multiplication of the arithmetic result of the inverse quantization and the inverse discrete cosine transform (inverse DCT) coefficient data to the selector 460. The selector 460 selects the result of the multiplication of the arithmetic result of the inverse quantization and the inverse discrete cosine transform (inverse DCT) coefficient data and transmits it to the accumulator 430 which exists on the second pipe-lined stage. The accumulator 430 accumulates the result of the multiplication of the arithmetic result of the inverse quantization and the inverse discrete cosine transform (inverse DCT) coefficient data on the previous result of the accumulation executed by itself. The accumulator 430 transmits the result of the accumulation to the selector 470. 
The selector 470 selects the result of the accumulation executed by the accumulator 430 and transmits it to the arithmetic result output terminal 403. Namely, the result of the accumulation is outputted through the arithmetic result output terminal 403 as an arithmetic result of the process of the inverse discrete cosine transform (inverse DCT). Therefore, the accomplishment of the inverse discrete cosine transform (inverse DCT) process requires the above two units to be operated, and thus the operations of the multiplier 420 and the accumulator 430. The inter-frame addition as a final process of the motion picture coding is performed by the following steps. The arithmetic result of the inverse discrete cosine transform (inverse DCT) process is inputted through the data input terminal 401 to the arithmetic logic unit 410 existing on the first pipe-lined stage. Image data is inputted through the data input terminal 402 to the arithmetic logic unit 410. The arithmetic logic unit 410 executes the addition between the image data and the arithmetic result of the inverse discrete cosine transform (inverse DCT), both of which are transmitted through the data input terminals 402 and 401 respectively. The arithmetic logic unit 410 transmits the result of the addition between the image data and the arithmetic result of the inverse discrete cosine transform (inverse DCT) to the selector 460. The selector 460 selects the result of the addition provided by the arithmetic logic unit 410 and transmits it to the selector 470. The selector 470 selects the result of the addition executed by the arithmetic logic unit 410 and transmits it to the arithmetic result output terminal 403. Namely, the result of the addition is outputted through the arithmetic result output terminal 403 as an arithmetic result of the process of the inter-frame addition. 
Therefore, the accomplishment of the inter-frame addition process requires the above single unit to be operated, and thus the addition operation of the arithmetic logic unit 410.
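The inter-frame difference and the inter-frame addition are the two ALU-only processes, and they are inverse operations of each other. The following minimal sketch shows the pair on flat pixel lists; the function names are hypothetical, and the lossy steps that lie between the two processes in the actual coder are left out.

```python
def inter_frame_difference(current, predicted):
    """Third process: subtraction executed by ALU 410."""
    return [c - p for c, p in zip(current, predicted)]


def inter_frame_addition(residual, predicted):
    """Eighth process: addition executed by ALU 410."""
    return [r + p for r, p in zip(residual, predicted)]


current_pixels = [120, 130, 125, 90]
predicted_pixels = [118, 131, 120, 95]
residual = inter_frame_difference(current_pixels, predicted_pixels)
rebuilt = inter_frame_addition(residual, predicted_pixels)
print(residual, rebuilt)
```

Without the transform and quantization in between, the rebuilt pixels equal the current pixels exactly.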
FIG. 3 is a timing chart which indicates the sequence of the above motion picture coding processes and the number of steps required to accomplish each of the above processes for each macro block. A step means a performance time of one pipe-lined stage of the motion picture coder. The number of steps of each motion picture coding process for a single macro block will be described. The motion vector detection as a first motion picture coding process requires 2880 steps. The loop filtering as a second motion picture coding process requires 3114 steps. The inter-frame difference as a third motion picture coding process requires 384 steps. The discrete cosine transform (DCT) as a fourth motion picture coding process requires 6144 steps. The quantization as a fifth motion picture coding process requires 384 steps. The inverse quantization as a sixth motion picture coding process requires 786 steps. The inverse discrete cosine transform (inverse DCT) as a seventh motion picture coding process requires 6144 steps. The inter-frame addition as a final motion picture coding process requires 384 steps. The above eight processes for the motion picture coding are sequentially performed. Thus, the total number of steps of the above motion picture coding processes for one macro block is 20220 steps.
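The total stated above can be reproduced by summing the per-process step counts. The figures in the sketch below are the ones given in the description of FIG. 3; only the dictionary and variable names are illustrative.

```python
# Steps per macro block for each process, as given for FIG. 3.
STEPS_PER_PROCESS = {
    "motion vector detection": 2880,
    "loop filtering": 3114,
    "inter-frame difference": 384,
    "DCT": 6144,
    "quantization": 384,
    "inverse quantization": 786,
    "inverse DCT": 6144,
    "inter-frame addition": 384,
}

# The eight processes run one after another, so their steps simply add up.
total_steps = sum(STEPS_PER_PROCESS.values())
print(total_steps)
```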
The conventional architecture of the motion picture coder based upon the standardizing system will be investigated. In the prior art, the above processes of the motion picture coding for a single macro block are sequentially repeated until motion picture coding processes for all macro blocks are finished. Namely, each macro block is sequentially processed by the motion picture coder.
The conventional motion picture coder involves, on its first pipe-lined stage, the arithmetic logic unit 410 and the multiplier 420, both of which are connected, in parallel to one another, to the second pipe-lined stage of the motion picture coder. The selector 460 selects either the operation of the arithmetic logic unit 410 or the operation of the multiplier 420. This means that at least one of the arithmetic logic unit 410 and the multiplier 420 takes on an idle state in each motion picture coding process. As described above, when a part of the arithmetic units involved in the motion picture coder is in an idle state, the motion picture coder is unable to exhibit its maximum potential ability in the processing speed of the motion picture coding process. Although the motion picture coder utilizes the pipe-line system to improve processing speed, its architecture prevents the high speed processing advantage of the pipe-line system from being sufficiently exhibited.
Such a considerable disadvantage, which prevents the processing speed from being improved, is caused by the architecture. The conventional architecture arranges a plurality of arithmetic units, namely the arithmetic logic unit 410 and the multiplier 420, on a single pipe-lined stage, namely the first pipe-lined stage. Thus, it is impossible for the arithmetic logic unit 410 and the multiplier 420 to be operated concurrently. Such an architecture forces at least one of the arithmetic logic unit 410 and the multiplier 420 to always take on an idle state. This prevents the high speed processing advantage of the pipe-line system from being exhibited.
To combat such a problem, it is required to provide a novel architecture of a pipe-lined motion picture coder which is able to perform each of the motion picture coding processes without idling of the arithmetic units involved in the motion picture coder. Thus, it is desirable to provide a novel architecture of a pipe-lined motion picture coder which permits improving the processing speed of the motion picture coding for all of the plural macro blocks, and thus permits the pipe-lined arithmetic units to exhibit their maximum potential ability in the processing speed of the motion picture coding for the plural macro blocks.
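The extent of the idling can be estimated from the per-process step counts and the units each process uses, both taken from the description above. The sketch below simply counts, for each unit, the steps of the processes that use it; treating a unit as busy for the whole duration of such a process is a simplifying assumption, and the names are illustrative.

```python
# (steps per macro block, units used) for each process, from the text.
PROCESSES = {
    "motion vector detection": (2880, {"alu", "accumulator", "minmax"}),
    "loop filtering": (3114, {"multiplier", "accumulator"}),
    "inter-frame difference": (384, {"alu"}),
    "DCT": (6144, {"multiplier", "accumulator"}),
    "quantization": (384, {"multiplier"}),
    "inverse quantization": (786, {"multiplier", "accumulator"}),
    "inverse DCT": (6144, {"multiplier", "accumulator"}),
    "inter-frame addition": (384, {"alu"}),
}

total = sum(steps for steps, _ in PROCESSES.values())
busy = {"alu": 0, "multiplier": 0, "accumulator": 0, "minmax": 0}
for steps, units in PROCESSES.values():
    for unit in units:
        busy[unit] += steps

for unit, steps in busy.items():
    print(f"{unit}: busy {steps} of {total} steps ({steps / total:.1%})")
```

Under this estimate the busy steps of the arithmetic logic unit 410 and of the multiplier 420 add up exactly to the total, which reflects that the two first-stage units are never in use during the same process.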