1. Field of the Invention
The present invention relates to a digital signal processor capable of performing arithmetic processing mainly on a signal series.
2. Description of the Prior Art
FIG. 1 is a schematic block diagram of an arrangement of a first conventional digital signal processor, which is described in "A 50nS FLOATING-POINT SIGNAL PROCESSOR VLSI", p. 401, ICASSP 86, 1986. It should be noted that, for the sake of simplicity, only the required blocks are illustrated in FIG. 1.
In FIG. 1, reference numeral 1 indicates an instruction memory for storing an instruction word; 2 denotes a program counter for outputting an address of the instruction memory 1 to an output path 51; 3 represents an instruction execution control unit for decoding the instruction word supplied from the instruction memory 1 via an output path 52, and for outputting a control signal via an output path 53 to the program counter 2, a calculation unit and the like; 4 is an internal data memory for storing calculation data; 5 represents a data bus for transferring data read out from the internal data memory 4 via an output path 54; 6a denotes a multiplier unit for multiplying input data supplied from the data bus 5 via an output path 55; 7 indicates an accumulator for performing an accumulating operation; 8 represents an accumulating register for holding an accumulation result; and reference numeral 9 indicates a repeat counter for repeating the same instruction a plurality of times.
Furthermore, reference numeral 63 indicates an input/output path for connecting the repeat counter 9 and the data bus 5; 64 represents a selector which receives the data supplied from the multiplier unit 6a via an output path 56 and the data supplied from the data bus 5 via an output path 57, and which supplies output data via an output path 58 to the accumulator 7; 65 denotes a selector which receives the output data supplied from the data bus 5 and the output data supplied from the accumulating register 8, and which supplies output data via an output path 61 to the accumulator 7; and reference numeral 66 is an output path for transmitting a control signal of the repeat counter 9.
An operation of the above-described digital signal processor will now be described. In response to the address output from the program counter 2 via the output path 51, the instruction word read from the instruction memory 1 is inputted via the output path 52 to the instruction execution control unit 3. Based upon the decoded instruction, the instruction execution control unit 3 controls the operations by sending the control signal via the output path 53 to various sections.
The internal data memory 4 reads out at most two pieces of data to the data bus 5 via the output path 54, and the multiplier unit 6a outputs the multiplication result for the two pieces of input data which have been supplied from the data bus 5 via the output path 55. The selector 64 selects either the output data which has been supplied from the multiplier unit 6a via the output path 56, or the output data which has been supplied from the data bus 5 via the output path 57. The selector 65 selects either the output data which has been supplied from the data bus 5 via the output path 59, or the output data which has been supplied from the accumulating register 8 via the output path 60.
The accumulator 7 adds the output data which has been supplied from the above-described selector 64 via the output path 58, to the output data which has been supplied from the selector 65 via the output path 61. The calculation result of the accumulator 7 is written via the output path 62 into the accumulating register 8.
It should be noted that the same instruction, such as the above-described accumulation, can be repeated the number of times preset in the repeat counter 9, this number being set in accordance with the output data which has been supplied from the data bus 5 via the input/output path 63.
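The multiply-accumulate data path described above may be illustrated by the following minimal sketch. It is a behavioral model only, not the circuit of FIG. 1: the function names, the boolean selector arguments, and the choice of which bus operand passes through each selector when the bus input is selected are assumptions made for illustration.

```python
def mac_step(bus_a, bus_b, acc_reg, sel64_use_multiplier=True, sel65_use_acc_reg=True):
    """One accumulate step of a FIG. 1 style data path (illustrative model only)."""
    product = bus_a * bus_b  # multiplier unit 6a: product of the two bus operands
    # Selector 64: multiplier output, or data taken directly from the data bus
    # (assumed here to be the first operand).
    left = product if sel64_use_multiplier else bus_a
    # Selector 65: accumulating register, or data taken directly from the data bus
    # (assumed here to be the second operand).
    right = acc_reg if sel65_use_acc_reg else bus_b
    # Accumulator 7: the sum is what would be written back to accumulating register 8.
    return left + right


def repeat_mac(xs, ys, repeat_count):
    """Repeat the same accumulate instruction repeat_count times (role of repeat counter 9)."""
    acc_reg = 0
    for i in range(repeat_count):
        acc_reg = mac_step(xs[i], ys[i], acc_reg)
    return acc_reg
```

With both selectors in their default positions, repeating the step over two data series yields their sum of products, which is the accumulation that the repeat counter is described as repeating.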
FIG. 3 is a flowchart for explaining an operation in which, with the above-described arrangement, a block having a minimum distortion with respect to a block "A" of a certain data series is detected among "M" search blocks, as shown in the data relationship diagram of FIG. 2.
An amount of distortion is calculated by equation 1:
[Equation 1]
where the block A is: x = (x_1, x_2, ..., x_W)
the search blocks are: y_k = (y_k1, y_k2, ..., y_kW), k = 1 to M
a). Since no direct data transfer is carried out between the internal data memory and the external data memory, the processing efficiency of the internal calculation is lowered.
b). When the external data memory is accessed by way of the direct data transfer, the address of the external data memory is a simple increasing sequence and the transfer word number cannot be arbitrarily designated, so that it is difficult to directly transfer two-dimensional block data.
c). Since the internal calculation of the processor is interrupted when the direct data transfer is carried out, the processing efficiency of the internal calculation is extremely lowered.
d). Since the external address output is fixed at 12 bits, the accessing region of the external data memory is narrow.
y_0 = y_01, ..., y_0,
y_1 = y_11, ..., y_1.
a minimum distortion register for holding a minimum distortion;
a minimum distortion position register for holding a number of a block having said minimum distortion;
a block counter for holding a number of a block performing a present distortion calculation;
a comparator for comparing an accumulator output with a value of said minimum distortion register at every cycle while in order to detect the minimum distortion among "M" blocks in number (M being a positive integer) of data train, the distortion calculation is performed with respect to a k-th block (1 ≤ k ≤ M, "k" being an integer) of the "M" blocks in number; and,
an instruction execution control unit for controlling operations such as decoding and calculating of an instruction word which is read from an instruction memory in a predetermined order;
a calculation unit for performing various calculations on two input data which have been transferred from a data bus;
an internal data memory for storing a calculation result which has been transferred via a data output bus;
an external data memory connecting unit for reading data from an external data memory to said data bus and for writing the data on said data output bus into said external data memory by using values outputted from an address generating unit which generates one output address value and two input address values in parallel for said calculation unit;
a direct memory transfer bus for connecting one port of said internal data memory to said external data memory connecting unit; and,
a direct data memory transfer control unit for inputting and outputting the data in units of blocks between said external data memory connecting unit and said internal data memory via said direct memory transfer bus, independent of the internal operation controlled by said instruction execution control unit.
a control circuit including a program counter for address-controlling a fetched instruction;
a data memory for inputting/outputting data; and,
a data decision unit for selecting one of an output from an arithmetic calculator within a calculating unit, an output from a logical shifter, and an output from a multiplier in parallel with an operation of the calculating unit; for simultaneously comparing the selected output data with threshold values of "n" in number (n being an integer not less than 1); for judging in which region said output data is present among data regions that are subdivided into (n+1) regions by said threshold values of "n" in number based upon comparison results of "n" in number; for sequentially comparing said comparison result with region limiting conditions of "m" in number (m being an integer less than 1) for designating a preset data region and for outputting branch address information corresponding to a consistent region limiting condition among preset branch addresses of "m" in number corresponding to said region limiting conditions of "m" in number in case of one of said conditions is consistent, or for outputting a signal which indicates discrepancy in all of said conditions in case all of said conditions of "m" in number are discrepant.
a plurality of register preserving memories for preserving each of the register data when the interruption is performed;
an interruption controlling unit for correctly transferring data to each of said registers at returning from the interrupting operation, and for controlling the complete recovery from the interrupting operation by restarting executions by remaining repeat numbers even after returning from an interruption which has occurred on the way to repeat processing; and,
an interruption enable controlling unit for forming an interruption inhibiting period to inhibit a H/W interruption other than the interrupting process.
setting as a search small-region, a first motion vector search range having a predetermined size and having, as a center thereof, a position of an input data block to be encoded which is a motion vector search range in the previous frame data;
equally subdividing this first search range into a plurality of regions to obtain motion vectors to be calculated;
allocating first search motion vector groups of "n" in number (n being an integer not less than 1) to the respective regions at a low density;
calculating a distortion of each of the motion vectors, which represents a pattern similarity degree between the block data of the position indicated by this motion vector and the input data block functioning as a present input block, and for summing results corresponding to the motion vectors of "n" in number to obtain the distortion amount within the region;
detecting a region where the distortion amount becomes minimum within the first search region;
setting as a minimum distortion region, a region where a distortion amount within this region becomes minimum;
setting as a limited search range, a second motion vector search range having a size smaller than that of the first search range with respect to the minimum distortion range as a center thereof;
allocating second search motion vector groups at a higher density within the second search range; and
detecting a block which is most similar to the input data block based upon a minimum distortion amount with respect to the second motion vector groups, whereby both the block providing this minimum distortion and the motion vector thereof are a final prediction signal and a motion vector.
"M" and "W" are fixed integers.
That is to say, with respect to the data x_h and y_kh which have been read from the data memory 4 for the respective blocks, the accumulating calculations are performed as many times as the number of data (steps ST 11, ST 12), the distortion comparison is performed after the distortions of all "M" blocks have been obtained, and thereafter a minimum distortion and the block number thereof are obtained (step ST 13).
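The conventional procedure described above, in which every distortion is accumulated first and the comparison follows, may be sketched as below. This is an illustrative model under stated assumptions: the function names are hypothetical, the per-element distortion term (a squared difference) is assumed because equation 1 is not reproduced in this text, and the mapping of the loops onto steps ST 11 to ST 13 is approximate.

```python
def squared_difference(a, b):
    # Assumed per-element distortion term; the actual term is defined by equation 1.
    return (a - b) ** 2


def conventional_min_distortion(x, search_blocks, term=squared_difference):
    """Return (minimum distortion, 1-based block number) among the search blocks."""
    # Roughly steps ST 11 and ST 12: accumulate over the W elements of every block.
    distortions = []
    for y_k in search_blocks:
        acc = 0
        for x_h, y_kh in zip(x, y_k):
            acc += term(x_h, y_kh)
        distortions.append(acc)

    # Roughly step ST 13: compare only after all M distortions are available.
    min_distortion = distortions[0]
    min_block = 1
    for k, d in enumerate(distortions[1:], start=2):
        if d < min_distortion:
            min_distortion, min_block = d, k
    return min_distortion, min_block
```

For example, with x = [1, 2, 3] and search blocks [[3, 2, 1], [1, 2, 4], [0, 0, 0]], the accumulated distortions under the assumed term are 8, 1 and 14, so the sketch returns (1, 2). Every block is fully accumulated before any comparison takes place, which is the source of the M additional comparison and update operations.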
In this case, the digital signal processor having the arrangement shown in FIG. 1 performs one sum-of-product calculation within one machine cycle, so that the sum-of-product process requires (W×M) calculations, and furthermore the comparison and update process for obtaining both the minimum distortion and the block number thereof is required "M" times. As a result, the processing time required for the calculations becomes t×(M×W+M), where t is one machine cycle.
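The processing time t×(M×W+M) may be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and the values M = 16, W = 64 and a machine cycle of 50 ns are assumed example figures that are not stated in this description.

```python
def conventional_cycles(m_blocks, w_elements):
    """Machine cycles: W x M sum-of-product steps plus M compare/update steps."""
    return m_blocks * w_elements + m_blocks


def conventional_time_ns(m_blocks, w_elements, cycle_ns):
    """Processing time t x (M x W + M), with the machine cycle t given in nanoseconds."""
    return cycle_ns * conventional_cycles(m_blocks, w_elements)


# Assumed example values: M = 16 blocks, W = 64 elements, t = 50 ns.
example_cycles = conventional_cycles(16, 64)        # 16 * 64 + 16 = 1040 cycles
example_time_ns = conventional_time_ns(16, 64, 50)  # 1040 * 50 = 52000 ns
```

Under these assumed values, the compare and update steps contribute 16 of the 1040 cycles, and the total grows in proportion to M for a fixed W.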
Since the conventional digital signal processor is arranged as described above, when, for instance, a block having a minimum distortion with respect to a block of a certain data series is detected among "M" search blocks, the distortions of all "M" blocks are calculated, these distortions are compared with each other, and then the block number (position) of the minimum distortion is detected. As a result, there are drawbacks that the number of calculations becomes very large and the required processing time is considerably long.
FIG. 5 is a schematic block diagram of a second conventional digital signal processor disclosed in "A 50nS FLOATING-POINT SIGNAL PROCESSOR VLSI", p. 401, Proceedings of ICASSP 86, 1986. It should be noted that, for the sake of simplicity, only the necessary blocks are shown in FIG. 5.
In the block diagram of FIG. 5, reference numeral 1 denotes an instruction memory for storing an instruction word; 3 indicates an instruction execution control unit for controlling various operations such as decoding of the instruction word and calculations; 5 is a data bus for mutually connecting the following sections with each other and for mainly performing data transmission; 4 is an internal data memory for storing the calculation data; 6 represents a calculation unit for performing various calculations with respect to two pieces of data which have been transferred from the data bus 5; 8 denotes an address generating unit capable of generating at most 3 addresses at the same time; 10 represents an external data memory connecting unit for controlling the read/write operations to an external data memory (not shown); 78 is an external address bus; 79 denotes an external data bus; 80 indicates an external device control signal bus; 81 is a serial port (referred to as an "SIO" hereinafter) for performing serial data transmission to and from external devices (not shown in detail); and reference numeral 82 denotes a direct data memory transfer control unit (referred to as a "DMAC" hereinafter) for controlling a direct data memory transfer (referred to as a "DMA" hereinafter) between the SIO 81 and the external data memory connecting unit 10.
FIG. 6 illustrates a timing chart of external data memory accessing operations of the digital signal processor shown in FIG. 5. FIG. 6a is a read timing chart and FIG. 6b is a write timing chart. In FIGS. 6a and 6b, reference numeral 291 is an external address terminal; 292 represents a strobe signal for controlling the read timing supplied from the external data memory; 293 is an external data terminal; and, 294 represents a strobe signal for controlling write timing to the external data memory.
An operation of the digital signal processor will now be described. In FIG. 5, the instruction word of the designated address is read out from the instruction memory 1, and input via an input/output path 201 to the instruction execution control unit 3. The control signal and data which have been decoded by the instruction execution control unit 3 are transferred via an output path 202 to the data bus 5.
In response to this control signal, the following operations are controlled: calculation data is read from the internal data memory 4 to the data bus 5 via an output path 203; the data is input from the data bus 5 to the calculation unit 6 via an output path 204; the calculating process is performed in the calculation unit 6 and the calculation result is output via an output path 205 to the data bus 5; the data sent from the data bus 5 is written into the internal data memory 4 via an output path 206; and various other operations, such as the external data memory access, are performed.
Both the address of the input data from the internal data memory 4 to the calculation unit 6 and the writing address of the output data from the calculation unit 6 to the internal data memory 4 are controlled by the address generating unit 8, which has three systems of address generators. This address generating unit 8 generates the addresses from the readable/writable data input from the data bus 5 via an input/output path 210, controls the internal data memory 4 and the external data memory connecting unit 10 by the data which is output via output paths 208 and 209, and thereby determines the source of the input data to the calculation unit 6 and the write destination of its output data.
When, on the other hand, data is set to a specific register of DMAC 82 via the data bus 5 and a path (not shown), DMA is initialized.
Once DMA is initialized, all operations other than the DMA transfer are interrupted, and the data transfer is carried out from the SIO 81 to the external data memory connecting unit 10 via the output path 208 and the data bus 5. The transfer word number is set into the specific register of the DMAC 82 in response to the instruction which has been previously output via the output path 201. The transfer word number can be selected only from 64, 128, 256 and 512 words.
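The restriction on the transfer word number may be illustrated by the following minimal sketch. The constant and helper names are hypothetical, and rounding a request up to the next settable size is an assumption made only to show the effect of the restriction; it is not a behavior described for the DMAC 82.

```python
# The only transfer word numbers described as settable.
ALLOWED_DMA_WORD_COUNTS = (64, 128, 256, 512)


def smallest_settable_word_count(words_needed):
    """Return the smallest settable transfer size covering words_needed, or None."""
    for count in ALLOWED_DMA_WORD_COUNTS:
        if count >= words_needed:
            return count
    return None  # more than 512 words cannot be covered by a single setting
```

For instance, a request for 100 words would have to be set as 128 words under this assumption, which illustrates why a block of arbitrary size is awkward to transfer with such a DMAC.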
A description will now be made to FIG. 6. When the readout operation of the external data memory is carried out as shown in FIG. 6a, an RE terminal of the external device control signal bus 80 becomes active for 1 machine cycle, the strobe signal 292 informs the external device of the data readout, and the address data is output from the external address bus 78 for 1 machine cycle. Furthermore, the data read from the external device is fetched at the trailing edge of the same cycle.
When the writing operation of the external data memory as shown in FIG. 6b is carried out, a WE terminal of the external device control signal bus 80 becomes active for 1 machine cycle, the data writing operation is announced to the external device, the address data is output from the external address bus 78 and the write data is output from the external data bus 79 for one machine cycle.
Since the second conventional digital signal processor is arranged as described above, the following problems exist:
FIG. 7 is a schematic block diagram of the conventional digital signal processor (referred to as a "DSP" hereinafter) chip employed in the digital signal processor disclosed in IEEE, ICASSP 86, publications on page 401, "A 50nS FLOATING-POINT SIGNAL PROCESSOR VLSI". It should be noted that, for the sake of simplicity, only the necessary blocks are illustrated in FIG. 7. In FIG. 7, reference numeral 1 indicates a program memory for storing a microprogram by which all of the processes of the DSP are performed; 3 indicates a control circuit for controlling the execution of various processes such as fetching and decoding of the microprogram of the program memory 1, reading of data, calculation, and writing of calculation results; 4 represents a 2-port data memory capable of storing data having a data size of 2 n bits (n being a positive integer), of simultaneously reading two pieces of data, and of writing one piece of data; 8 indicates an address generating unit for generating an address for the data memory 4; reference numerals 301 and 302 represent selectors; reference numeral 303 denotes a multiplier circuit for performing a multiplication process and an adding/subtracting process with respect to two pieces of data X and Y which are simultaneously read from the data memory 4 and supplied via the respective selectors 301 and 302; reference numeral 6 is a calculation unit for performing an arithmetic operation and accumulation with respect to the above-described two pieces of data or the resultant data of the multiplier circuit 303; and reference numeral 5 indicates a data bus for transferring both the above-described two pieces of data X and Y and the resultant data of the calculation unit 6 between the calculation unit 6 and the data memory 4.
An operation of the digital signal processor will now be described. First of all, an overall operation of the DSP shown in FIG. 7 will be described. The address generating unit 8 generates an address for the data memory 4 and supplies it to the data memory 4. Thereafter, when data is read out, two pieces of data are simultaneously read out from the data memory 4, and then supplied via the respective selectors 301 and 302 to the multiplier circuit 303 or the calculation unit 6 as the data X and Y. At this time, the multiplier circuit 303 performs the multiplication process on these data X and Y, also performs sum-of-product processes on the multiplication result, and finally supplies the resultant data to the calculation unit 6. Then, the calculation unit 6 performs an arithmetic calculating process, such as summation, subtraction, and bit manipulation, on this resultant data or on the above-described two pieces of data X and Y, and supplies the resultant data to the data memory 4 via the data bus 5 for writing. The above-described series of processing operations is performed by a pipeline process in which the control circuit 3 reads the microprogram which has been stored in the program memory 1, the instruction is decoded by the control circuit 3, and the control signal 31 is output to the respective circuits.
Next, the number of machine cycles required when a sum-of-product calculation, a complex number calculation, and a binary tree search vector quantizing calculation are executed in the DSP will be described.