Among processor architectures that pursue high computational performance, there is the array processor. In a typical configuration, the array processor includes a plurality of processing units (termed PEs) arranged in an array and a plurality of memory banks arranged at the periphery of the array. The PEs execute computation in parallel while receiving data from neighboring units. Because the plurality of PEs execute the computation simultaneously, the array processor has a high degree of parallelism, and a higher computational performance may be achieved than with the von Neumann type architecture.
Since the plurality of memory banks can be accessed simultaneously, the memory access bottleneck of the von Neumann type architecture can be eliminated.
In addition, the array processor is able to access data in succession, and hence is suited to processing stream data. An example is the DSA (Decoupled Systolic Architecture), a systolic array proposed by Volker Strumpen et al. (see Non-Patent Document 1).
If high-speed computation is to be achieved with the systolic array mentioned above, two capabilities are required. The first is efficient memory access that allows a plurality of memory banks to be accessed simultaneously. The second is memory access timing control with clock-cycle accuracy, so that the memories are accessed in synchronization with the timing of the data propagated through the PEs.
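The timing requirement can be illustrated with a minimal sketch. It assumes, purely for illustration, a one-dimensional chain of PEs in which a partial sum advances exactly one PE per clock cycle and each PE is fed by its own memory bank; the function names `skewed_read_cycles` and `run_chain` are hypothetical and do not appear in the cited documents.

```python
from typing import List, Sequence


def skewed_read_cycles(num_pes: int, start_cycle: int = 0) -> List[int]:
    """Cycle at which the bank feeding each PE must deliver its operand.

    Assumption: data advances one PE per clock cycle, so PE i needs its
    operand exactly i cycles after the partial sum enters PE 0.
    """
    return [start_cycle + i for i in range(num_pes)]


def run_chain(weights: Sequence[int],
              bank_data: Sequence[int],
              read_cycles: Sequence[int],
              start_cycle: int = 0) -> int:
    """Accumulate weights[i] * bank_data[i] as the partial sum moves down the chain.

    Raises RuntimeError if a bank read is not scheduled for the cycle in
    which the partial sum is present at the corresponding PE.
    """
    acc = 0
    for pe, (w, x) in enumerate(zip(weights, bank_data)):
        needed = start_cycle + pe  # cycle in which the partial sum is at this PE
        if read_cycles[pe] != needed:
            raise RuntimeError(
                f"PE {pe}: operand scheduled for cycle {read_cycles[pe]}, "
                f"but needed at cycle {needed}"
            )
        acc += w * x
    return acc
```

With `skewed_read_cycles(4)` the reads are issued at cycles 0, 1, 2 and 3, and `run_chain([1, 2, 3, 4], [10, 20, 30, 40], skewed_read_cycles(4))` returns 300. If all banks are read at cycle 0 instead, the check fails at PE 1, which is the kind of mismatch that clock-cycle-accurate timing control is meant to prevent.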
For example, Patent Document 1 discloses a configuration, shown herein in FIG. 6, as a computing system that is capable of reducing superfluous processing cycles to improve the memory accessing performance. Referring to FIG. 6, this computing system includes an address conversion circuit (apparatus) (ACNV) 22 in an address interconnection network (ACNCT) 21. The address conversion circuit (apparatus) converts a base address Adr, generated by a processor (PRC) 24, and generates a bank selection signal sel. This bank selection signal sel is generated on the basis of a parameter which is based on the address obtained by the conversion and on the number of addresses of the memory bank. The address conversion circuit (apparatus) outputs the bank selection signal sel thus generated to a data interconnection network (DCNCT) 25. This data interconnection network 25 selectively sets a data path between the processor 24 and memory banks 23a to 23f.
The memory banks 23a to 23f store data used in computing operations. For example, if the address space has 48 addresses, from adr0 to adr47, each memory bank has 8 addresses, obtained by dividing the number of addresses equally by six.
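The division of the address space among the banks can be sketched as follows. This is only an illustration under stated assumptions: it assumes a contiguous block mapping (adr0 to adr7 in bank 23a, adr8 to adr15 in bank 23b, and so on), which Patent Document 1 as summarized here does not specify, and the names `select_bank` and `BANK_LABELS` are hypothetical.

```python
from typing import Tuple

NUM_ADDRESSES = 48   # adr0 .. adr47, as in the example above
NUM_BANKS = 6        # memory banks 23a .. 23f
ADDRESSES_PER_BANK = NUM_ADDRESSES // NUM_BANKS  # 48 / 6 = 8

# Labels for the six banks, indexed by the bank selection value.
BANK_LABELS = ["23a", "23b", "23c", "23d", "23e", "23f"]


def select_bank(adr: int) -> Tuple[int, int]:
    """Return (bank index, address within the bank) for an address adr.

    Assumption: banks hold contiguous blocks of ADDRESSES_PER_BANK
    addresses, so the bank index is the quotient and the in-bank
    address is the remainder of adr divided by ADDRESSES_PER_BANK.
    """
    if not 0 <= adr < NUM_ADDRESSES:
        raise ValueError(f"address {adr} is outside adr0..adr{NUM_ADDRESSES - 1}")
    return divmod(adr, ADDRESSES_PER_BANK)
```

Under this assumed mapping, `select_bank(19)` returns `(2, 3)`, that is, the fourth address of bank `BANK_LABELS[2]`, which is 23c. An interleaved mapping (bank index taken from the remainder instead of the quotient) would be an equally valid reading of the example.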
A plurality of address generation apparatuses 20a to 20f generate addresses corresponding to memory addresses of the memory banks 23a to 23f and output the addresses thus generated to the address interconnection network 21.
A plurality of computing apparatuses 26a to 26f execute computation on data transmitted thereto over the data interconnection network 25 and output the computation results to the data interconnection network 25.
The processor 24 generates a succession of addresses, for example, addresses Adr corresponding to the address space, and outputs the addresses thus generated to the address interconnection network 21. The processor 24 receives data read from a desired one of the memory banks over the data interconnection network 25, or receives the results of computation by the computing apparatuses 26a to 26f, and performs preset processing thereon. The processor 24 also transmits preset data to the desired memory bank via the data interconnection network 25, or transmits desired data to the computing apparatuses 26a to 26f.
In the computing system of FIG. 6, the address interconnection network 21 allows the plurality of memory banks 23a to 23f to be accessed simultaneously. The processing of address generation is shared between the processor and dedicated hardware (HW) to improve the efficiency of the computation required for address generation.
Non-Patent Document 1:
    Volker Strumpen and two others: 'Stream Algorithms and Architecture', Journal of Instruction-Level Parallelism 6, Sep. 4, 2004, pp. 1-31
Patent Document 1:
    JP Patent Kokai Publication No. JP-P2004-102633A