1. Field of the Invention
The present invention relates to a memory interface apparatus responsive to received data for accessing a data memory and outputting the result of the access, and more particularly, to a memory interface apparatus responsive to reception of data to which number information is allocated in an input time series order for accessing a data memory by addressing using the number information and the like, and for processing and outputting the result of the access. It is noted that a data memory herein is a memory for storing data to be referred to or updated at the time of execution of processing in an information processor.
2. Description of the Background Art
Parallel processing is effective when high speed processing of a large amount of data, such as video signal processing, is desired. A so-called data driven type architecture has particularly attracted attention among parallel processing architectures.
In a data driven information processor, processing is performed according to a rule "when all the data needed for an operation are available and resources such as an operation unit required for the operation are assigned thereto, the operation is carried out".
In processing time series digital signals such as video signals, the same processing is often applied to each time series signal. Therefore, a digital signal processing data driven information processor employs a dynamic data driven method, in which the same processing flow can be carried out with each time series data kept separate from the others.
FIG. 9 is a block diagram showing a conventional data driven information processor for digital image signal processing. FIGS. 10A and 10B are diagrams showing formats of data packets applied to a conventional example and an embodiment of the present invention.
FIG. 11 is a diagram showing a field configuration of a generation number in a data packet. FIG. 12 is a diagram showing an example of a logical arrangement of a data memory based on the field configuration of the generation number shown in FIG. 11.
The data packet shown in FIG. 10A includes an operation code C, a node number ND, a generation number GN corresponding to a time series order as described above, and data D. The node number ND is a number which is allocated to each node indicating each processing step (each operation code C) carried out in a data flow graph representing a data flow program. The data packet shown in FIG. 10B is the same as that shown in FIG. 10A except that the data packet of FIG. 10B includes first and second data D1 and D2 instead of data D in the data packet of FIG. 10A.
The data driven information processor of FIG. 9 includes a junction unit J for receiving and sequentially outputting data packets; a waiting control unit FC for receiving a data packet, waiting for data, and producing paired data or fetching a constant; a data memory MEM; an operation unit FP for performing operation processing and accessing memory MEM; a storage unit PS for pre-storing a data flow program of each type of processing such as video signal processing; and a branch unit B for receiving and outputting a data packet. These units are connected to each other through an internal pipeline shown by a thick solid line in the figure.
Referring to FIG. 11, generation number GN is data of a fixed length, and consists of an m-bit field address FD#, an n-bit line address LN# and an l-bit pixel address PX#.
The content of generation number GN of FIG. 11 corresponds to the logical arrangement of data memory MEM shown in FIG. 12. Memory MEM serves as an image memory if the information processor performs video signal processing, and the logical arrangement thereof includes 2^m fields each specified by an m-bit field address FD#; each field includes 2^n lines in a vertical direction each corresponding to an n-bit line address LN#, and each line includes 2^l pixels each corresponding to an l-bit pixel address PX#.
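The relation between generation number GN and the logical arrangement of memory MEM can be illustrated with a minimal sketch. It assumes that field address FD# occupies the most significant bits, followed by line address LN# and then pixel address PX#; the text does not state the bit order, and the function names and the widths used in the usage example are hypothetical.

```python
from __future__ import annotations


def split_generation_number(gn: int, m: int, n: int, l: int) -> tuple[int, int, int]:
    """Split GN into (field, line, pixel).

    Assumption: FD# is in the top m bits, LN# in the next n bits,
    and PX# in the lowest l bits.
    """
    if not 0 <= gn < (1 << (m + n + l)):
        raise ValueError("generation number out of range")
    pixel = gn & ((1 << l) - 1)
    line = (gn >> l) & ((1 << n) - 1)
    field = gn >> (l + n)
    return field, line, pixel


def join_generation_number(field: int, line: int, pixel: int, n: int, l: int) -> int:
    """Inverse of split_generation_number under the same assumed bit order."""
    return (field << (n + l)) | (line << l) | pixel
```

For example, with the hypothetical widths m = 2, n = 3 and l = 4, the logical arrangement has 4 fields of 8 lines of 16 pixels, and field 1, line 5, pixel 9 corresponds to generation number 217.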
If processing according to a data flow program for image signal processing stored in storage unit PS in FIG. 9 is to be carried out, a data packet from the internal pipeline and a data packet of FIG. 10A externally applied to an information processor are first supplied to packet junction unit J, and sequentially output to waiting control unit FC. Waiting control unit FC receives a data packet, and waits for a data packet or fetches a pre-stored constant in order to produce paired data in view of a generation number GN of the received data packet, and thereafter, produces such a data packet as shown in FIG. 10B and outputs the data packet to operation unit FP.
The operation code C, node number ND, generation number GN and data D of the packet input to unit FC are set, respectively, as the operation code C, node number ND, generation number GN and first data D1 of the data packet of FIG. 10B produced in waiting control unit FC. If paired data is produced, the data paired with data D is set as second data D2 of the produced data packet of FIG. 10B; if a constant is fetched, the fetched constant is set there instead.
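The behavior of waiting control unit FC described above can be sketched as follows. This is an illustration only: it assumes that two packets are paired when their node number ND and generation number GN both match, and that constants are looked up by node number. Neither detail is given in the text, and the class and field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class PacketA:
    """Packet in the format of FIG. 10A: C, ND, GN and one data word D."""
    c: str
    nd: int
    gn: int
    d: int


@dataclass(frozen=True)
class PacketB:
    """Packet in the format of FIG. 10B: C, ND, GN and data D1, D2."""
    c: str
    nd: int
    gn: int
    d1: int
    d2: int


class WaitingControl:
    def __init__(self, constants: Optional[Dict[int, int]] = None) -> None:
        # Assumed matching key: (node number, generation number).
        self._waiting: Dict[Tuple[int, int], PacketA] = {}
        # Assumed constant table: node number -> pre-stored constant.
        self._constants = dict(constants or {})

    def receive(self, p: PacketA) -> Optional[PacketB]:
        """Return a FIG. 10B packet once a pair exists, else None (still waiting)."""
        if p.nd in self._constants:
            return PacketB(p.c, p.nd, p.gn, p.d, self._constants[p.nd])
        partner = self._waiting.pop((p.nd, p.gn), None)
        if partner is None:
            self._waiting[(p.nd, p.gn)] = p
            return None
        # Data D of the input packet becomes D1; the paired data becomes D2.
        return PacketB(p.c, p.nd, p.gn, p.d, partner.d)
```

Because the generation number is part of the matching key, packets belonging to different time series positions never pair with each other, which is what allows the same processing flow to run for many generations at once.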
Operation unit FP receives an applied data packet, and performs operation processing of first or second data D1 or D2 of the received data packet or accesses memory MEM according to the result of decoding an operation code C of the received data packet, and thereafter, outputs a data packet having a format of FIG. 10A to program storage unit PS. Normally, the result of operation processing or access to a memory is stored as data D in a data packet output from operation unit FP.
FIG. 13 is a block diagram illustrating access to a memory in a conventional operation unit FP, and FIG. 14 is a diagram illustrating an address modification step for access to a memory in FIG. 13.
In FIG. 13, a memory MEM is located in the center of the figure, for convenience. Blocks in FIG. 13 include an address modification unit amd, a memory access unit i/f, and a control unit Cn for controlling these units. In the case of performing access to a memory, a generation number GN in a received data packet is processed in address modification unit amd based on the step of FIG. 14, so that a logical address of two dimensions (line, pixel) by n planes (fields) indicated by the generation number GN is converted into a physical address of the memory MEM by address modification using an offset indicated by second data D2 of the received data packet.
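The address modification of unit amd can be pictured with the sketch below. The step of FIG. 14 is not reproduced in the text, so the details here are assumptions: the offset carried in second data D2 is taken to be a (line, pixel) pair, offsets wrap around within a field, the physical address is the linearized (field, line, pixel) triple plus a base, and the function name is hypothetical.

```python
def modify_address(gn: int, line_off: int, pixel_off: int,
                   n: int, l: int, base: int = 0) -> int:
    """Convert generation number GN plus an offset into a physical address.

    Assumptions (not taken from FIG. 14): FD# is in the high bits of GN,
    offsets wrap within the 2**n lines and 2**l pixels of a field, and the
    memory is laid out field by field, line by line.
    """
    pixel = gn & ((1 << l) - 1)
    line = (gn >> l) & ((1 << n) - 1)
    field = gn >> (l + n)
    line = (line + line_off) % (1 << n)
    pixel = (pixel + pixel_off) % (1 << l)
    return base + ((field << (n + l)) | (line << l) | pixel)
```

With n = 3 and l = 4, a packet whose generation number denotes field 1, line 5, pixel 9 and whose offset is (-1, +2) would address line 4, pixel 11 of the same field.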
In program storage unit PS, the subsequent operation code and a node number corresponding to that operation code are fetched from a pre-stored program based on a node number ND of a received data packet to be respectively set in the received data packet as an operation code C and a node number ND, and then, the received data packet is output. A data packet output from program storage unit PS is output externally or is output to the internal pipeline from branch unit B.
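The fetch in program storage unit PS amounts to a table lookup keyed by node number. The sketch below assumes one successor per node and a plain dictionary as the program store; both are simplifications, and the names are hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PacketA:
    """Packet in the format of FIG. 10A."""
    c: str
    nd: int
    gn: int
    d: int


def fetch_next(program: dict[int, tuple[str, int]], p: PacketA) -> PacketA:
    """Replace C and ND with the subsequent operation code and node number.

    Assumption: program maps the current node number to a single
    (next operation code, next node number) pair.
    """
    next_c, next_nd = program[p.nd]
    # Generation number GN and data D pass through unchanged.
    return replace(p, c=next_c, nd=next_nd)
```

Generation number GN is left untouched by this step, so the packet keeps its association with the same time series position as it moves from node to node.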
The above described data driven information processor has been considered to have an architecture suitable for digital image signal processing, since it can access two dimensions by n planes logically arranged at the time of performing access to a memory MEM, based on a generation number GN corresponding to time series data, that is, it can access, for each data packet, an address of a memory MEM corresponding to a scan position of an image signal.
In the arrangement of the above described operation unit FP, however, access to a memory MEM (see FIG. 13) and operation processing are carried out independently of each other in function, and therefore, filter operation, correlation operation and the like using the result of reference to a memory MEM, which are frequently used in image processing, must be performed in a plurality of nodes. Inclusion of a number of nodes in a data flow graph may cause an internal pipeline in an information processor to be crowded with data packets at the time of execution of processing according to the data flow program. In addition, overhead would be produced for waiting for data for operation processing of the resultant data of access to a data memory MEM. Consequently, improvement in throughput of processing has been difficult.
A technique has been proposed in U.S. application Ser. No. 08/215,564, now U.S. Pat. No. 5,502,834, in order to solve these problems. An operation unit FP is partially improved in this technique.
FIG. 15 is a block diagram illustrating access to a memory in an operation unit FP disclosed in U.S. application Ser. No. 08/215,564, now U.S. Pat. No. 5,502,834. A memory MEM is located in the center of the figure, for convenience. Blocks in FIG. 15 are different from those in FIG. 13 in that blocks in FIG. 15 additionally include an arithmetic and logic unit alu for receiving and processing the result of access to a data memory MEM and include a control unit Cn1 instead of control unit Cn, in that an output of the arithmetic and logic unit alu is fed back to a path for access to a memory MEM, and in that each unit is controlled by control unit Cn1. With the structure of FIG. 15, simple access to a memory MEM, operation processing of resultant data of the access and data applied to operation unit FP, and update of the content of memory MEM using the result of the operation processing can be realized with a single received operation code C.
Use of an operation instruction which compounds access to memory MEM as described above allows operation using the result of access to a memory, which is frequently used in image processing, to be carried out with fewer nodes and without waiting for data in waiting control unit FC. Accordingly, reduction in the number of nodes causes reduction in the amount of data packets flowing in the information processor upon processing for each generation, and no waiting for data permits data packets of more generations to be supplied to the information processor, so that throughput of processing could be improved.
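The compound operation made possible by the structure of FIG. 15 (read memory MEM, combine the fetched value with packet data in the arithmetic and logic unit alu, and optionally write the result back) can be sketched as a single function. The set of ALU operations, the update flag and the function name are assumptions for illustration and are not taken from the cited patent.

```python
import operator
from typing import Callable, Dict, List

# Hypothetical ALU operations selectable by one operation code.
ALU_OPS: Dict[str, Callable[[int, int], int]] = {
    "add": operator.add,
    "sub": operator.sub,
    "max": max,
}


def compound_access(mem: List[int], addr: int, operand: int,
                    alu_op: str, update: bool) -> int:
    """Read mem[addr], combine it with operand, optionally write back.

    All three steps happen for one received packet, so no separate
    read, operate and write nodes are needed in the data flow graph.
    """
    fetched = mem[addr]
    result = ALU_OPS[alu_op](fetched, operand)
    if update:
        # Feedback path from the ALU output to the memory access path.
        mem[addr] = result
    return result
```

In the arrangement of FIG. 13 the same work would take one node for the read, one for the operation and one for the write, with a wait in unit FC between the read and the operation.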
In the arrangement of FIG. 15, however, since a data packet which stores an intermediate result of correlation operation circulates through a pipeline or an intermediate result is temporarily stored in a data memory MEM, improvement in operation efficiency has not been easy. In addition, since processing is performed in a time direction in order to avoid waiting for data, it has been difficult to utilize parallelism in processing only by a function as shown in FIG. 15. It is noted that parallelism in processing herein means that processing includes a plurality of basic processings (access to a memory or operation) which can be carried out in parallel.
A further improved operation unit FP has been proposed to solve the above described problems.
FIG. 16 is a block diagram illustrating access to a memory in a proposed operation unit FP. A memory MEM is located in the center of the figure, for convenience.
FIGS. 17 and 18 are diagrams illustrating address modification steps for access to a memory in FIG. 16.
Blocks in FIG. 16 include two memories MEMs, memory access units i/fs and address modification units amds corresponding to respective memories MEMs, a selector sl, an arithmetic and logic unit alu, and a control unit Cn2 for controlling each unit.
This arrangement enables not only two memories MEM to be accessed simultaneously (in parallel) in the case of FIG. 16, but also access to each memory MEM, operation using the result of the access, and update of the content of each memory MEM using the result of the operation to be carried out with a single operation code C. In addition, when memories MEMs are to be accessed, a single generation number GN corresponding to two dimensions (pixel, line) by n planes (fields) is converted into two different physical addresses by address modification using different offsets indicated by second data D2 by means of the step shown in FIG. 17 or 18.
Use of an operation instruction which compounds parallel access to two memories MEMs as described above allows operation processing using the result of reference to a memory, which is frequently used in image processing, to be carried out with fewer nodes, so that data packets of more generations can be supplied to an information processor, resulting in improvement in throughput of processing.
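The conversion of one generation number GN into two physical addresses, one per memory MEM, can be sketched as follows. The steps of FIGS. 17 and 18 are not reproduced in the text, so the (line, pixel) offset pairs, the wrap-around behavior, the addition performed on the two fetched values and the function names are all assumptions. The two reads are sequential in this sketch, whereas the hardware of FIG. 16 performs them in parallel.

```python
from typing import List, Tuple

Offset = Tuple[int, int]  # assumed form of an offset: (line offset, pixel offset)


def two_addresses(gn: int, off_a: Offset, off_b: Offset,
                  n: int, l: int) -> Tuple[int, int]:
    """Derive two physical addresses from one generation number GN."""
    pixel = gn & ((1 << l) - 1)
    line = (gn >> l) & ((1 << n) - 1)
    field = gn >> (l + n)

    def physical(off: Offset) -> int:
        line_off, pixel_off = off
        ln = (line + line_off) % (1 << n)    # assumed wrap within a field
        px = (pixel + pixel_off) % (1 << l)
        return (field << (n + l)) | (ln << l) | px

    return physical(off_a), physical(off_b)


def dual_read_and_add(mem_a: List[int], mem_b: List[int], gn: int,
                      off_a: Offset, off_b: Offset, n: int, l: int) -> int:
    """Read both memories at their modified addresses and add the results."""
    addr_a, addr_b = two_addresses(gn, off_a, off_b, n, l)
    return mem_a[addr_a] + mem_b[addr_b]
```

A two-tap operation such as this fits in one instruction; a filter or correlation over three or more values still needs several such instructions in sequence.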
It can be said that the apparatus of FIG. 16 has an architecture suitable for digital image signal processing since scanning of the content of a memory MEM corresponding to a scan position of an image signal and operation which compounds parallel access to two memories MEM can be carried out in an order of received data packets. However, parallel access to data memories MEMs is limited to at most two memories MEMs for a single operation code C, and therefore, parallelism in processing is not used sufficiently. Accordingly, since filter operation or correlation operation using at least three data must be processed serially using a plurality of instructions, it has been difficult to improve throughput of processing.
In addition, as shown in FIGS. 14, 17 and 18, a part (second data D2) of a field of a data packet is directly used as an offset of an address for access to a data memory MEM. Therefore, in particular when two different data memories MEMs are to be accessed in parallel, the range of an offset value must be reduced, or either a pixel offset or a line offset must be common to both data memories MEMs. With this restriction, even if two different data memories MEMs could be accessed in parallel, access to addresses, which have different pixels and different lines, of the memories MEMs to be accessed or access to a portion other than the region of the memories MEMs which can be subject to address modification using an offset must be processed serially with two instructions, so that parallelism in processing has not been used sufficiently.