The present invention relates to a data processing system and a data processing method adopted in the data processing system, and particularly to a data processing system and a data processing method which are capable of coping with an increase in the piece count of pixel data to be processed by delaying the operating timing of input and output units employed in the data processing system.
As a related art data processing system, there is known a processor called an SVP (Serial Video Processor) described in Section 3.1 on Page 17 of the IEEE 1990 Custom Integrated Circuits Conference. Composed of 1,024 processors integrated in a single chip, the SVP is a processor for carrying out real time digital processing on a video signal. The SVP has an SIMD (Single Instruction stream/Multiple Data stream) structure which allows pixel data on a horizontal scanning line to be processed concurrently. SIMD is the name of a data processing method adopted by a computer whereby a single instruction is applied concurrently to a plurality of pieces of data.
FIG. 1 is a block diagram showing a typical configuration of an SIMD control parallel processor. As shown in the figure, the SIMD control parallel processor includes a program control apparatus 17, an input SAM (Serial Access Memory) unit 11, a data memory unit 12, a processing circuit unit 13 and an output SAM unit 14.
The input SAM unit 11, the data memory unit 12, the processing circuit unit 13 and the output SAM unit 14 constitute a group of parallel processor elements 15 arranged in a linear array. The processor elements 15 are controlled in lockstep with each other in accordance with a program of the program control apparatus 17, that is, they are subjected to SIMD control. The program control apparatus 17 includes a program memory for storing the program in advance and a sequence control circuit for executing the program. The program control apparatus 17 generates a variety of control signals in accordance with the program in order to control a variety of circuits.
It should be noted that the input SAM unit 11, the data memory unit 12, and the output SAM unit 14 are each implemented as a memory, detailed explanation of which is omitted. In an apparatus shown in FIG. 1, row address decoders for these memories are included in the program control apparatus 17.
One processor element 15 is represented by a hatched block in FIG. 1. A plurality of processor elements 15 are arranged in parallel, that is, side by side in the horizontal direction of the figure. That is to say, the processor element 15 indicated by the hatched block includes the components of one processor.
Next, the operation of the linear array parallel processor for carrying out video processing shown in FIG. 1 will be described.
Input data, strictly speaking, video data of one pixel, is supplied to the input SAM unit 11 in accordance with a control signal output by the program control apparatus 17. The processor elements 15 from the leftmost one to the rightmost one shown in the figure sequentially process the data. That is to say, pieces of input data are supplied sequentially to input SAM cells of the input SAM unit 11 from the leftmost one to the rightmost one shown in the figure.
Since the number of the processor elements 15 is at least equal to the pixel count H in one horizontal scanning period of a video signal, pixel data of one horizontal scanning period of a video signal can be accommodated in the input SAM unit 11. The operation to supply input data is repeated for each horizontal scanning period.
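The input operation described above can be illustrated by the following minimal sketch. It assumes one input SAM cell per processor element and models a pixel as a plain integer; the function name load_input_sam and the use of None for an unused cell are hypothetical and are introduced only for illustration.

```python
def load_input_sam(line_pixels, num_elements):
    """Write one horizontal scanning line into the input SAM cells, left to right."""
    if len(line_pixels) > num_elements:
        raise ValueError("line has more pixels than processor elements")
    sam = [None] * num_elements          # one input SAM cell per processor element
    for index, pixel in enumerate(line_pixels):
        sam[index] = pixel               # pixel i lands in the cell of element i
    return sam


print(load_input_sam([10, 20, 30], num_elements=4))  # [10, 20, 30, None]
```

The length check mirrors the condition stated above that the number of processor elements is at least equal to the pixel count H.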
Each time data of one horizontal scanning period of a video signal is accumulated in the input SAM unit 11 as described above, the program control apparatus 17 carries out processing by executing SIMD control on the input SAM unit 11, the data memory unit 12, the processing circuit unit 13 and the output SAM unit 14 in accordance with the program as described below. In addition, the SIMD control causes the following operations to be executed in all the processor elements 15 concurrently in the same way.
The input data of one horizontal scanning period of a video signal accumulated in the input SAM unit 11 is, if necessary, transferred from the input SAM unit 11 to the data memory unit 12 during the next horizontal scanning fly-back line period to be used in the subsequent processing.
In a transfer of data from the input SAM unit 11 to the data memory unit 12, the program control apparatus 17 makes an access to data of a predetermined bit count in the input SAM unit 11 selected by an input SAM read signal, and then outputs a memory access signal to write the data into a predetermined memory cell of the data memory unit 12.
Next, the program control apparatus 17 supplies data stored in the data memory unit 12 of each processor element 15 to the processing circuit unit 13 of the processor element 15 in accordance with the program and lets the processing circuit unit 13 carry out arithmetic and logic processing on the data supplied thereto. Results of processing are then written at a predetermined address of the data memory unit 12.
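The sequence of transfer, processing and write-back described above may be sketched as follows. This is a simplified model rather than the actual circuit: values are handled as whole integers instead of in bit units, the loop over elements stands in for concurrent hardware operation, and the names ProcessorElement, simd_step and run_line as well as the addresses "in" and "out" are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ProcessorElement:
    input_sam: int = 0
    memory: dict = field(default_factory=dict)   # data memory: address -> value


def simd_step(elements, operation, src, dst):
    """Apply the same operation in every element: read src, write result to dst."""
    for pe in elements:                  # concurrent in hardware, a loop here
        pe.memory[dst] = operation(pe.memory[src])


def run_line(pixels, operation):
    elements = [ProcessorElement(input_sam=p) for p in pixels]
    for pe in elements:                  # transfer: input SAM -> data memory
        pe.memory["in"] = pe.input_sam
    simd_step(elements, operation, "in", "out")
    return [pe.memory["out"] for pe in elements]


print(run_line([0, 100, 255], lambda v: 255 - v))  # [255, 155, 0]
```

Every element executes the identical operation on its own pixel, which is the essential property of the SIMD control.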
FIG. 2 is a block diagram showing a typical configuration of the processing circuit unit 13. Pieces of data from the data memory unit 12 are supplied to a register 84 by way of a selector 80, a register 85 by way of a selector 81 and a register 86 by way of a selector 82. The selector 80 selects the value 1 set in advance, the piece of data output by the data memory unit 12 or data stored in the register 84 and outputs the selected one to the register 84. The selector 80 selects one of the three inputs in accordance with a signal generated by the program control apparatus 17. A register 87 is used for storing data representing a carry-over generated by a full adder 91.
A logical product circuit 88 computes a logical product of the data stored in the register 84 and data stored in the register 85. An exclusive logical sum circuit 89 computes an exclusive logical sum of data output by the logical product circuit 88 and data supplied by the program control apparatus 17 and supplies the exclusive logical sum to the full adder 91. The full adder 91 also receives data stored in the register 86 and data stored in the register 87. The full adder 91 computes the sum of these three inputs, outputting the sum and its carry-over to a selector 92. The carry-over is also supplied to the register 87 by way of a selector 83.
A selector 90 selects either the data output by the register 85 or data output by the register 86 and outputs the selected one to the selector 92. The selector 92 selects one of three inputs thereof, that is, the data output by the selector 90, the sum output by the full adder 91 or the carry-over also output by the full adder 91, and outputs the selected one to the data memory unit 12. Signals generated by the program control apparatus 17 control how the selectors 90 and 92 select one of their inputs.
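The combinational path through the processing circuit unit 13 may be modeled, under stated assumptions, as a function of 1-bit values. The parameter names and the string codes used to drive the selectors 90 and 92 are hypothetical, since the actual encoding of the control signals is not given here.

```python
def datapath(r84, r85, r86, r87, invert, sel90, sel92):
    """Evaluate the 1-bit datapath once; every data value is 0 or 1.

    Returns (bit output to the data memory unit, carry-over of the full adder).
    """
    and_out = r84 & r85                          # logical product circuit 88
    xor_out = and_out ^ invert                   # exclusive logical sum circuit 89
    total = xor_out + r86 + r87                  # full adder 91, three 1-bit inputs
    sum_bit, carry = total & 1, total >> 1
    sel90_out = r85 if sel90 == "r85" else r86   # selector 90
    choices = {"sel90": sel90_out, "sum": sum_bit, "carry": carry}
    return choices[sel92], carry                 # selector 92


print(datapath(1, 1, 1, 0, 0, "r85", "sum"))    # (0, 1)
print(datapath(1, 1, 1, 1, 0, "r85", "carry"))  # (1, 1)
```

In the first call the adder receives 1 + 1 + 0, so the sum bit is 0 and the carry-over is 1; in the second call it receives 1 + 1 + 1 and the selector 92 passes the carry-over.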
Assume that, for example, a signal generated by the program control apparatus 17 controls the selector 80 to let the selector 80 select the value 1 to be stored in the register 84. In this case, since the logic value 1 is stored in the register 84, data stored in the register 85 from the data memory unit 12 passes through the logical product circuit 88 as it is, entering the full adder 91 by way of the exclusive logical sum circuit 89. The full adder 91 computes the sum of the data supplied from the register 85 by way of the exclusive logical sum circuit 89, data stored in the register 86 from the data memory unit 12 and data representing a carry-over generated in previous processing and stored in the register 87. The sum and a newly generated carry-over are output to the selector 92. The carry-over is also supplied to the register 87 by way of the selector 83 to be stored therein.
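Repeating the operation above once per bit yields a multi-bit addition. The following sketch assumes that operands are fed least significant bit first, that the register 87 is cleared before the first bit, and that the control input of the exclusive logical sum circuit 89 is held at 0; none of these details is stated above, and the function name bit_serial_add is hypothetical.

```python
def bit_serial_add(a, b, width):
    """Add two unsigned integers one bit per step, least significant bit first."""
    r84 = 1                              # selector 80 selects the constant 1
    r87 = 0                              # carry register, assumed cleared at start
    result = 0
    for i in range(width):
        r85 = (a >> i) & 1               # bit i of the first operand
        r86 = (b >> i) & 1               # bit i of the second operand
        x = (r84 & r85) ^ 0              # AND passes r85; XOR control held at 0
        total = x + r86 + r87            # full adder 91
        result |= (total & 1) << i       # sum bit written back to data memory
        r87 = total >> 1                 # carry-over stored via selector 83
    return result, r87


print(bit_serial_add(5, 3, 4))   # (8, 0)
print(bit_serial_add(15, 1, 4))  # (0, 1)
```

The second call shows a carry-over left in the register 87 when the 4-bit result overflows.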
The program control apparatus 17 is also capable of controlling the selector 92 to select the carry-over generated by the full adder 91 to be output to the data memory unit 12. In addition, the program control apparatus 17 is capable of controlling the selector 90 to select either the data output by the register 85 or the data output by the register 86, and of controlling the selector 92 to select the data selected by the selector 90, so that the selected data is output to the data memory unit 12 by way of the selector 92.
When it is desired to supply data output by the logical product circuit 88 to the full adder 91 by logically inverting the data, the program control apparatus 17 outputs the logic value 1 to the exclusive logical sum circuit 89 as one of the inputs thereof. With the logic value 1 supplied to the exclusive logical sum circuit 89 as one of the inputs thereof, the exclusive logical sum circuit 89 will pass on a logic value 1 received from the logical product circuit 88 as a logic value 0 and pass on a logic value 0 received from the logical product circuit 88 as a logic value 1.
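The inverting behavior follows directly from the truth table of the exclusive logical sum, as the short sketch below shows. The function name xor_stage and the parameter name control are hypothetical.

```python
def xor_stage(and_out, control):
    """Exclusive logical sum circuit 89: control = 1 inverts, control = 0 passes."""
    return and_out ^ control


for bit in (0, 1):
    print(bit, xor_stage(bit, 0), xor_stage(bit, 1))
# 0 0 1
# 1 1 0
```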
When it is desired to compute a logical product of newly input data and immediately previous data, the program control apparatus 17 controls the selector 80 to again select data stored in the register 84. With the selector 80 again selecting the data stored in the register 84, the logical product circuit 88 receives the current data and the immediately previous data and computes their logical product because the current data is stored in the register 85. By controlling the selector 80 to select the output of the register 84 repeatedly, processing can be carried out on new input data and previous input data.
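One possible reading of the operation above is that the register 84 keeps an earlier bit for as long as the selector 80 selects the output of the register 84, while the register 85 receives each new bit. The sketch below follows that reading only; the string codes for the selector 80 and the function names are hypothetical.

```python
def select_80(choice, memory_bit, r84):
    """Selector 80: the constant 1, the data memory output, or register 84 itself."""
    if choice == "one":
        return 1
    if choice == "memory":
        return memory_bit
    if choice == "hold":
        return r84
    raise ValueError(f"unknown selection: {choice}")


def and_with_held(bits):
    """AND each later bit with the first bit, which register 84 keeps holding."""
    r84 = select_80("memory", bits[0], 0)    # load the earlier bit once
    products = []
    for bit in bits[1:]:
        r84 = select_80("hold", bit, r84)    # register 84 recirculates its value
        r85 = bit                            # register 85 takes the new bit
        products.append(r84 & r85)           # logical product circuit 88
    return products


print(and_with_held([1, 1, 0, 1]))  # [1, 0, 1]
```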
When the processing allocated to one horizontal scanning period as described above is finished, the data processed in that horizontal scanning period is transferred to the output SAM unit 14 by the end of the period.
As described above, transfers of input data stored in the input SAM unit 11 to the data memory unit 12, processing of the data carried out by the processing circuit unit 13 and transfers of processing results to the output SAM unit 14 during the one horizontal scanning period are executed in accordance with the SIMD control program in bit units. These pieces of arithmetic/logic processing are carried out repeatedly with one horizontal scanning period of the video signal taken as a unit.
The data transferred to the output SAM unit 14 is further output from the output SAM unit 14 in the next horizontal scanning period.
As described above, three pieces of processing are carried out on each piece of input data. The three pieces of processing are the input processing to write input data into the input SAM unit 11, the arithmetic/logic processing controlled by the program control apparatus 17 and the output processing to output results of processing from the output SAM unit 14. The arithmetic/logic processing controlled by the program control apparatus 17 includes transfers of input data stored in the input SAM unit 11 to the data memory unit 12, processing of the data carried out by the processing circuit unit 13 and transfers of processing results to the output SAM unit 14. It should be noted that the three pieces of processing are executed as pipeline processing with one horizontal scanning period of the video signal taken as a unit.
Consider data input in one horizontal scanning period. Typically, each of the three pieces of processing takes about one horizontal scanning period to complete. Thus, about three horizontal scanning periods elapse before all three pieces of processing are completed for the data. The three pieces of processing are, however, executed concurrently as pipeline processing, that is, the 2nd piece of processing for data of the current horizontal scanning period is carried out concurrently with the 1st piece of processing for data of the following horizontal scanning period. On the average, therefore, it takes only about one horizontal scanning period to complete the three pieces of processing for data of one horizontal scanning period.
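The relation between latency and average throughput can be checked with a small scheduling sketch. It assumes that each piece of processing occupies exactly one horizontal scanning period, which is an idealization of the approximate figures given above; the function names and stage labels are hypothetical.

```python
def pipeline_schedule(num_lines, stages=("input", "process", "output")):
    """Return {period: [(stage, line), ...]} with one stage per scanning period."""
    schedule = {}
    for line in range(num_lines):
        for offset, stage in enumerate(stages):
            schedule.setdefault(line + offset, []).append((stage, line))
    return schedule


def total_periods(num_lines, num_stages=3):
    """Periods needed to pass num_lines lines through all stages."""
    return num_lines + num_stages - 1


print(pipeline_schedule(3)[2])   # [('output', 0), ('process', 1), ('input', 2)]
print(total_periods(1))          # 3
print(total_periods(100) / 100)  # 1.02
```

A single line needs three periods, whereas 100 lines need 102 periods, that is, about one period per line on the average.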
In the related art data processing apparatus, pixels of one horizontal scanning period of a video signal are distributed among processor elements each for processing pixel data. However, there are a variety of video signal formats, with the number of pixels included in one horizontal scanning line ranging from several hundreds to several thousands. Therefore, a data processing apparatus has to include a sufficient number of processor elements for handling the maximum possible piece count of pixel data. When such a data processing apparatus handles a video signal with few pixels, however, a problem arises in that much of the consumed electric power is wasted.
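As a rough illustration of the problem, the fraction of processor elements left without pixel data can be computed as follows. The line width of 720 pixels is a hypothetical figure chosen for the example, and the idle fraction is only an indication of waste, since the actual power consumed by an idle element is not specified here.

```python
def idle_fraction(pixels_per_line, num_elements):
    """Fraction of processor elements that receive no pixel for a given line width."""
    if pixels_per_line > num_elements:
        raise ValueError("line does not fit in the processor array")
    return (num_elements - pixels_per_line) / num_elements


print(idle_fraction(720, 1024))   # 0.296875
print(idle_fraction(1024, 1024))  # 0.0
```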
In addition, when the number of processor elements is too small for handling pixels included in one horizontal scanning line, the horizontal scanning line is split and distributed among a plurality of data processor elements. In this case, however, a processor element allocated to data on one side of a pixel split boundary may have to exchange data with a processor element in another data processing apparatus allocated to data on the other side of the boundary.
If the data processing apparatus is implemented as a semiconductor chip, a problem arises in that such exchanges of data lead to a reduced processing speed.
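The boundary exchanges can be made concrete with the following sketch. It assumes, purely for illustration, that each pixel is processed together with its neighbors within a given reach, as in a horizontal filter; the function names and the parameter reach are hypothetical, and the sketch only counts the exchanges without modeling their cost.

```python
def split_line(pixels, elements_per_device):
    """Split one scanning line into consecutive segments, one per device."""
    return [pixels[i:i + elements_per_device]
            for i in range(0, len(pixels), elements_per_device)]


def cross_device_reads(line_length, elements_per_device, reach=1):
    """Count neighbor reads within `reach` pixels that cross a device boundary."""
    crossings = 0
    for i in range(line_length):
        low = max(0, i - reach)
        high = min(line_length, i + reach + 1)
        for j in range(low, high):
            if j != i and i // elements_per_device != j // elements_per_device:
                crossings += 1
    return crossings


print(split_line(list(range(8)), 4))  # [[0, 1, 2, 3], [4, 5, 6, 7]]
print(cross_device_reads(8, 4))       # 2
print(cross_device_reads(8, 8))       # 0
```

With the line split across two devices, the two elements adjacent to the boundary each need one value held by the other device, whereas no such exchange arises when the whole line fits in one device.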