The present invention relates to a data processing method and system. More particularly, it concerns a parallel processing method and system suitable for processing a very large data array in parallel with use of a plurality of processors.
Some supercomputers and very large-scale computers have not only a first memory called a main memory or main storage, but also a second random access memory. The second random access memory is called an extended storage, system memory, or paging storage in connection with its primary use, and will hereinafter be referred to as an extended memory. The extended memory has a far larger capacity than the main memory, although it cannot operate at as high a speed as the main memory. It is primarily provided for high-speed input/output processing. More particularly, it is intended to speed up input/output processing by a factor of hundreds to thousands by placing input/output files on the extended memory rather than on a magnetic disk. For instance, U.S. Pat. No. 4,476,524 entitled "Page Storage Control Method and Means" by David T. Brown et al., issued on Oct. 9, 1984, discloses a page storage as an example of the extended memory. An example of a computer having the extended memory is found in the Supercomputer S-820 of Hitachi Seisakusho Co. The extended memory of the S-820 has a capacity of up to 12 gigabytes and a data transfer rate of 2 gigabytes per second, while its main memory has a capacity of up to 512 megabytes and a data transfer rate of 16 gigabytes per second.
The extended memory forms an address space different from that of the main memory. Data on the extended memory cannot be transferred directly to a processing unit, nor can the processing unit write data directly into the extended memory. However, there are provided instructions for directing data transfer between the extended memory and the main memory. The processing unit can use these instructions to transfer the required data from the extended memory to the main memory before processing it and, after storing the processed results in the main memory, to transfer them from there to the extended memory.
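The staging sequence described above may be sketched as follows. This is only an illustrative toy model, not the instruction set of any actual machine: the class `ExtendedMemory`, its methods `read_block` and `write_block`, the function `increment_block`, and the 4096-byte block size are all names and values assumed for the purpose of illustration.

```python
BLOCK = 4096  # assumed size in bytes of one transferable block


class ExtendedMemory:
    """Toy model of a storage reachable only through whole-block transfers."""

    def __init__(self, n_blocks):
        self._blocks = [bytes(BLOCK) for _ in range(n_blocks)]

    def read_block(self, index):
        """Transfer one block from the extended memory to the main memory."""
        return bytearray(self._blocks[index])

    def write_block(self, index, data):
        """Transfer one block from the main memory to the extended memory."""
        if len(data) != BLOCK:
            raise ValueError("only whole blocks can be transferred")
        self._blocks[index] = bytes(data)


def increment_block(ext, index):
    """Hypothetical computation: add 1 (modulo 256) to every byte of a block."""
    buf = ext.read_block(index)      # 1. stage the data into the main memory
    for i in range(len(buf)):        # 2. process it in the main memory
        buf[i] = (buf[i] + 1) % 256
    ext.write_block(index, buf)      # 3. transfer the results back


ext = ExtendedMemory(n_blocks=2)
increment_block(ext, 1)
```

In this model the processing step never touches the extended memory directly; every access passes through a buffer standing in for the main memory.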
The extended memory is characterized in that it provides a memory having a very large capacity and a relatively high data transfer rate at a lower cost than the main memory. For this purpose, it is combined with an access control mechanism different from that of the main memory. More particularly, the extended memory has an addressable data unit size restricted to a large block, for example, 4 kbytes, while the main memory is structured so that individual bytes can be accessed at a high rate. To compensate for this restriction, the extended memory has a mechanism for transferring such a large block of data at a high rate. In order to transfer the large block of data efficiently, it is common to subdivide the large block into small blocks of a few to tens of bytes and to transfer these small blocks successively. In the Supercomputer S-820 of Hitachi Seisakusho Co. mentioned above, too, the addressable data unit size of the extended memory is 4 kbytes, and the instructions for data transfer between the main memory and the extended memory can specify an integer multiple of 4 kbytes as the amount of data to be transferred. Such a large amount of data is subdivided into small blocks of 32 to 64 bytes, and these small blocks are successively transferred. This accomplishes a data transfer rate as high as 2 gigabytes per second.
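The subdivision of a transfer into small blocks may be illustrated by the following sketch. The function name `split_transfer` is hypothetical, and the small-block size of 64 bytes is merely one value assumed from the 32-to-64-byte range stated above; the sketch shows only the arithmetic, not the hardware mechanism.

```python
BLOCK_SIZE = 4096     # addressable data unit of the extended memory (4 kbytes)
SUB_BLOCK_SIZE = 64   # assumed small-block size, within the stated 32 to 64 bytes


def split_transfer(n_bytes, sub_block=SUB_BLOCK_SIZE):
    """Return (offset, length) pairs for the small blocks of one transfer.

    The amount transferred must be a positive integer multiple of BLOCK_SIZE.
    """
    if n_bytes <= 0 or n_bytes % BLOCK_SIZE != 0:
        raise ValueError("transfer size must be a positive multiple of 4 kbytes")
    return [(offset, sub_block) for offset in range(0, n_bytes, sub_block)]


pieces = split_transfer(BLOCK_SIZE)
```

Under these assumptions, one 4-kbyte unit is moved as 64 successive small blocks of 64 bytes each, or as 128 small blocks if the 32-byte size is assumed instead.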
The supercomputers and very large-scale computers are expected chiefly to execute large-scale calculations at a high speed. Therefore, they have to have both a superhigh-speed computation mechanism and a large capacity memory. In order to accomplish superhigh-speed computation, it is known that a parallel processing system is useful in which a plurality of processors can process a single program in parallel. For the large capacity memory, it is not advisable to provide a main memory of a very large capacity since its price is high.
Accordingly, it is desired that the extended memory be usable as a large capacity storage area for computation as well as for high-speed input/output processing as in the prior art. More particularly, it may be contemplated that the extended memory should be used as means to implement a large capacity storage area exceeding the main memory capacity, and further as means to implement a storage area for computation that a plurality of processors in a multiprocessor configuration can use cooperatively in order to accomplish superhigh-speed computation by parallel processing. Such a storage area, if realized, would allow the supercomputers and very large-scale computers to execute larger-scale computation at a higher rate.
However, it is hard to say that the extended memory of the prior art has sufficient functions to implement such a large capacity storage area, exceeding the main memory capacity and usable cooperatively by processors in a multiprocessor configuration as discussed above, so as to speed up large-scale computation. The reason is as follows.
Assume two processors having different main memories cooperate to calculate a single large-scale data array. If a first processor calculates odd-numbered elements of the data array and a second processor calculates even-numbered elements, the two processors can operate concurrently so that they can calculate all the elements in about half the time taken for a single processor to calculate the entire array. After this, however, it is necessary to merge the odd- and even-numbered elements to complete a single data array. For this purpose, one processor has to transfer all the elements it calculated from its main memory to the extended memory. The other processor, in turn, has to transfer these elements from the extended memory to its main memory, merge them with the elements it calculated to form a single data array, and then transfer this single data array to the extended memory. It is impossible to merge the odd-numbered elements with the even-numbered elements on the extended memory itself with correct relative positioning to form the single data array. Assume the addressable data unit size in the extended memory is 4 kbytes. If 4-kbyte data including the odd-numbered elements from the main memory of the first processor and 4-kbyte data including the even-numbered elements from the main memory of the second processor are transferred into the same area of the extended memory, the result will be that the entire 4-kbyte data transferred earlier is simply replaced by the entire 4-kbyte data transferred later.
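The problem may be illustrated by the following sketch, which is a toy model under stated assumptions rather than a description of any actual machine. It assumes 8-byte elements, so that one 4-kbyte block holds 512 elements; it uses 0-based indices, so "odd" and "even" refer to index parity; and `compute`, `block_from`, and `transfer_out` are hypothetical names introduced for illustration only.

```python
ELEMS_PER_BLOCK = 512  # assumed: 4 kbytes divided by 8-byte elements


def compute(i):
    """Hypothetical per-element calculation."""
    return i * i


def block_from(parity):
    """Block image in one processor's main memory.

    Only elements whose index has the given parity are calculated;
    the remaining positions hold 0.
    """
    return [compute(i) if i % 2 == parity else 0
            for i in range(ELEMS_PER_BLOCK)]


extended = {}  # block address -> block contents


def transfer_out(address, block):
    """Main memory -> extended memory; the whole block is replaced."""
    extended[address] = list(block)


odd_block = block_from(1)    # held by the first processor
even_block = block_from(0)   # held by the second processor

# Both processors transfer into the same area: the later transfer wins.
transfer_out(0, odd_block)
transfer_out(0, even_block)
naive_result = list(extended[0])   # odd-indexed elements have been lost

# Merging by way of the main memory of the second processor.
transfer_out(1, odd_block)         # first processor uses a separate area
staged = list(extended[1])         # second processor reads that area
merged = [staged[i] if i % 2 else even_block[i]
          for i in range(ELEMS_PER_BLOCK)]
transfer_out(0, merged)            # complete array written back
```

In this model the merged array is obtained only at the cost of three block transfers and a merge step executed by one processor alone, which offsets part of the time saved by calculating in parallel.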