The present invention relates to a vector processor of a kind associated with an address translation therein, and in particular, to a buffer storage control of a scalar processor in the vector processor of said kind.
Conventionally, the vector processor is provided with a scalar processor therein and with a buffer storage to increase the processing speed of the scalar processor in many cases. The scalar arithmetic section is connected via the buffer storage to the main storage as described in "Super Computer SX System with Maximum 1.3 GFLOPS and Machine Cycle of 6 ns", Furukatsu, Watanabe, and Kondo; Nikkei Electronics Nov. 19, 1984; pp. 237-272.
With the logical configuration above, the scalar arithmetic section can read operands of an arithmetic operation from, and write processing results into, the buffer storage, and hence the scalar processing speed is expected to increase. The vector processor is provided with a vector processing function such that the portions of a program for which vector processing is possible are processed in the vector processor and the remaining portions are processed in the scalar processor. Consequently, it is considered that the processing speed of the program can be improved by increasing the processing performance of the respective processors. This expectation is appropriate if there is no interaction between the vector processing and the scalar processing. In general, however, such an interaction exists, and hence the expectation above is not correct in some cases, depending on the algorithm.
Next, description will be given of the interaction between the vector processing and the scalar processing.
In a DO loop, if no dependence relationship exists between the control variable of the DO loop and each processing therein, the DO loop can be subjected to vector processing. On the other hand, if such a dependence relationship is found in the processing of the DO loop, vector processing is impossible, and the DO loop is called a scalar-type DO loop. In some scalar-type DO loops, vector processing is possible for part of the processing of the DO loop. A DO loop of this type is called a partially vectorizable loop. The following DO loop of a FORTRAN program can be partially vectorized; namely, this DO loop is a partially vectorizable loop.
______________________________________
      DO 100 I = 1, N                      ①
      IX = R1(I)                           ②
      IY = R2(I)                           ③
      A(IX, IY) = A(IX, IY) + S(I)         ④
  100 CONTINUE                             ⑤
______________________________________
The loop above is rewritten by use of a machine language as follows.
______________________________________
              Set Vector length `N`                            (a)
              Vector Load      VR (vector register) 0 ← R1     (b)
              Convert real to integer   VR1 ← VR0
              Vector Load      VR2 ← R2                        (c)
              Convert real to integer   VR3 ← VR2
              Vector Subtract  VR4 ← VR3 - `1`                 (d)
              Vector Multiply  VR5 ← VR4 * M
                               (M indicates the dimension of the array A.)
              Vector Add       VR6 ← VR5 + VR1
              Vector Store     VR6 → ADRS (work address vector)
              Post                                             (e)
              Load      GR (general register) 10 ← base (ADRS)
              Load      GR11 ← base (A)                        (f)
              Load      GR12 ← base (S)
              Subtract  GR9 ← GR9 - GR9
              Wait                                             (g)
LBL (Label):  Load      GR8 ← ADRS       base = GR10, index = GR9    (h)
              Load      GR7 ← A          base = GR11, index = GR8    (i)
              Add       GR7 ← GR7 + S    base = GR12, index = GR9    (j)
              Store     GR7 → A          base = GR11, index = GR8    (k)
              Add       GR9 ← GR9 + `1`                        (l)
              BCT (branch), LBL                                (m)
______________________________________
In the example above, (a) indicates an operation to set the number of vector processing elements in the vector processor, and (b) designates a type conversion processing on the vector data R1; (b) corresponds to statement ② of the FORTRAN program. Similarly, (c) denotes a type conversion processing on the vector R2, (d) stands for a processing to compute an indirect address of the array A and to store the computed address in the work vector ADRS, and (g) indicates a wait for the completion of the processing whose execution was initiated prior to the instruction (e). When compared with a scalar instruction, a vector instruction requires a processing time corresponding to the number of processing elements. Consequently, in many cases, a control procedure that executes a vector instruction only after the completion of the preceding instruction is not adopted; instead, the processing is controlled such that a plurality of vector instructions which can logically be executed are processed concurrently. The description here assumes a vector processor capable of the latter control. The group of instructions (f) between the instructions (e) and (g) is executed so as to overlap with the execution stage of the vector instruction (d). The scalar processing from (h) to (m) adds the vector S to the array A by use of indirect addressing. Since the value of the index varies irregularly due to the indirect addressing, a data dependence relationship is established with respect to the DO control variable, and hence vector processing is impossible for this part of the processing. Incidentally, the instruction (c) corresponds to statement ③ of the FORTRAN program, and the instructions (d) and (h) to (l) correspond to statement ④. Namely, the FORTRAN loop is a partially vectorizable loop.
In this loop, the vector processing and the scalar processing have an interaction in which address data is passed from the vector processor to the scalar processor via the work vector ADRS.
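The division of work described above can be illustrated with a minimal Python sketch. It is not the machine code of the listing; the function name `partially_vectorized_update`, the flat column-major layout of A, and the use of Python lists are assumptions made only for illustration. The first phase mirrors steps (b) to (d), which compute the work vector ADRS as (IY - 1) * M + IX, and the second phase mirrors the scalar loop (h) to (m).

```python
def partially_vectorized_update(a, r1, r2, s, m):
    """Illustrative model of the partially vectorizable loop.

    a  : flat list holding the array A in column-major order (assumption)
    r1 : real-valued row indices (1-based), as in R1
    r2 : real-valued column indices (1-based), as in R2
    s  : values to be added, as in S
    m  : leading dimension of A, as in M
    Returns the work vector ADRS (1-based element offsets into A).
    """
    n = len(s)

    # Vector phase, steps (b) to (d): every element is independent,
    # so a vector processor could compute all of ADRS at once.
    ix = [int(x) for x in r1]          # convert real to integer
    iy = [int(y) for y in r2]          # convert real to integer
    adrs = [(iy[i] - 1) * m + ix[i] for i in range(n)]

    # Scalar phase, steps (h) to (m): the same offset may occur more than
    # once in ADRS, so the updates are performed one element at a time.
    for i in range(n):
        k = adrs[i] - 1                # convert 1-based offset to list index
        a[k] = a[k] + s[i]
    return adrs
```

For example, with `m = 2`, `r1 = [1.0, 2.0, 1.0]`, `r2 = [1.0, 1.0, 1.0]`, and `s = [1.0, 2.0, 3.0]`, the first and third iterations both update the same element of A, which is the kind of dependence that keeps the second phase in the scalar processor.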
FIG. 2 shows the processing concept of the vector processor associated with a data transfer between the vector processor and the scalar processor. In FIG. 2, a vector arithmetic or logic operation is accomplished between a vector arithmetic unit 501 and a vector register 502. In the case where a result of a vector arithmetic operation is stored in a main storage 700, a store address is processed by a logical (L)-real (R) address translation unit 503 in a vector processor 500 so as to be translated from a logical address into a real address. The data to be stored is transferred from the vector register 502 into the main storage (MS) at the real address generated by the address translation unit 503. At the same time, the store real address is transmitted from the address translation unit 503 to an address array section 605 controlling read address information of data in a buffer storage (BS) 602 in a scalar processor 600. The address array section 605 compares the address where the store data has been written in the main storage 700 with the addresses in the buffer storage 602. When a match is found as a result of the comparison, a buffer cancel request is transmitted to an address translation table 603. A scalar arithmetic operation is accomplished between a scalar arithmetic unit 601 and the buffer storage 602. If data necessary for the arithmetic operation is missing in the address translation table 603, a data read request is issued via an address translation logic 604 to the main storage. This is because the (L) read request passes through the address translation table 603 so as to be issued to the address translation logic 604 if the logical address is not registered as an entry in the address translation table 603. Data read out from the main storage 700 is stored in the buffer storage; furthermore, the address array and the address translation table 603 are updated.
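The buffer cancel behavior described for FIG. 2 can be sketched in Python as follows. This is a greatly simplified model under stated assumptions: each buffer line holds a single word, logical-to-real address translation is omitted, and the names `ScalarBuffer`, `cancel`, and `vector_store` are hypothetical and do not appear in the description. It only shows that a vector store to the main storage invalidates matching buffer entries, so that the next scalar read of that data must go to the main storage.

```python
class ScalarBuffer:
    """Toy model of the buffer storage 602 with its address array (assumption:
    one word per line, real addresses only, no address translation)."""

    def __init__(self, main_storage):
        self.ms = main_storage   # dict: real address -> value
        self.lines = {}          # registered addresses and their buffered data
        self.misses = 0          # number of reads that had to access main storage

    def read(self, addr):
        # Scalar read: served from the buffer when the address is registered,
        # otherwise fetched from main storage and registered.
        if addr not in self.lines:
            self.misses += 1
            self.lines[addr] = self.ms[addr]
        return self.lines[addr]

    def cancel(self, addr):
        # Buffer cancel: drop the entry if the address is registered.
        self.lines.pop(addr, None)


def vector_store(ms, buffer, base, values):
    """Vector store: write each element to main storage and send the store
    address to the scalar buffer for comparison and cancellation."""
    for i, v in enumerate(values):
        ms[base + i] = v
        buffer.cancel(base + i)
```

In this model, a scalar loop that repeatedly reads four words incurs four misses on the first pass and none on the second; after `vector_store` rewrites the same four words, the next pass incurs four further misses, which corresponds to the loss of buffer benefit for data produced by the vector processor.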
As described above, when the scalar processor 600 issues a read request to access an area of the main storage 700 where data has been written by the vector processor 500, the buffer storage 602 cannot be used to increase the processing speed of the scalar arithmetic operation that accesses the data.
In contrast with the processing above, consider the case where the vector processor 500 reads data from an area of the main storage 700 in which the data has been written by the scalar processor 600. If a store-through method is adopted, in which the data write operation is effected on the buffer storage 602 and the main storage 700 at the same time, the operation to read the vector data is not hindered because the information of the buffer storage 602 is reflected in the main storage 700. However, in the case of a store-in method, in which the store data is not written in the main storage 700 and is only written in the buffer storage 602, a considerable period of time is required to read the vector data.
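The difference between the two write methods can be illustrated with the following Python sketch. The class names, the `flush` method, and in particular the assumption that a store-in buffer writes its modified words back to the main storage before a vector read are hypothetical; the description above states only that the store-in method requires a considerable period of time, not how the data is made visible.

```python
class StoreThroughBuffer:
    """Scalar writes update the buffer and main storage at the same time."""

    def __init__(self, main_storage):
        self.ms = main_storage
        self.lines = {}

    def write(self, addr, value):
        self.lines[addr] = value
        self.ms[addr] = value

    def flush(self):
        # Main storage is already up to date; nothing to write back.
        return 0


class StoreInBuffer:
    """Scalar writes update only the buffer; main storage is updated later."""

    def __init__(self, main_storage):
        self.ms = main_storage
        self.lines = {}
        self.dirty = set()       # addresses modified but not yet written back

    def write(self, addr, value):
        self.lines[addr] = value
        self.dirty.add(addr)

    def flush(self):
        # Assumed mechanism: write every modified word back to main storage.
        count = len(self.dirty)
        for addr in self.dirty:
            self.ms[addr] = self.lines[addr]
        self.dirty.clear()
        return count


def vector_load(ms, buffer, base, n):
    """Vector read of n words; returns the data and the number of words that
    had to be written back first (a stand-in for the extra delay)."""
    written_back = buffer.flush()
    return [ms[base + i] for i in range(n)], written_back
```

With three scalar writes followed by a vector load of the same three words, both buffers deliver the same data, but the store-through model reports zero words written back while the store-in model reports three, which stands in for the additional time noted above.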