Most so-called open systems, in which software interfaces are unified across different pieces of hardware, have recently been configured so that RISC processors are connected to one another through one or a plurality of buses.
In a certain type of operating system (OS) conventionally and widely used in such open systems, pages expressed in units of, for example, 512 bytes (hereinafter called "B") or 4096B are handled as one file to simplify the user interface. As a result, the transfer of data between an IO buffer area prepared for the OS and a user area increases. Further, the scale of middleware applications increases because of a sharp increase in the capacities of disks and memory. For example, a database process requires the transfer of large amounts of data, ranging from a few KB to a few MB, on main storage. The transfer of large amounts of continuous data on the main storage will hereinafter be called a "data copy on main storage".
When such a data copy on main storage is performed, the copied data is rarely reused, and hence little cache effect is obtained.
FIG. 1 shows a basic configuration of such an open system.
In FIG. 1, the computer system comprises processors 1 and 2, caches 3 and 4 placed in the respective processors, a processor bus 5, a storage control unit 6, main storage bank-buses 7 to 10, and main storage banks 11 to 14.
When the processors 1 and 2 are of RISC architecture, register-to-register arithmetic operations are fundamental, and every data access to main storage is executed as a transfer of data between a register and the main storage.
The processors 1 and 2 are electrically connected to the processor bus 5 and are provided with the caches 3 and 4 respectively. The caches reduce the number of accesses to the main storage and thereby keep the utilization of the processor bus 5 low. The caches may be incorporated into the processors, provided outside the processors, or arranged as a hierarchy combining both. Further, the efficiency of a plurality of processors is improved by a high-speed mechanism that assures the ordering of data between the processors, which is called a "snoop mechanism".
With respect to the main storage, particularly main storage employed in a system having a plurality of processors, the capability of supplying data from the main storage (corresponding to the total bandwidth of the main storage bank-buses 7 to 10) is normally designed to be one to two times the capability of the processors to request data (corresponding to the bandwidth of the processor bus 5). This supply capability is ensured by dividing the main storage into a plurality of banks and interleaving addresses across them.
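The address interleaving mentioned above can be illustrated with a minimal sketch. It assumes four banks, matching the main storage banks 11 to 14 of FIG. 1, and an interleave unit of 32B; the interleave unit and the names `bank_of` and `banks_touched` are hypothetical and do not come from the described system.

```c
#include <assert.h>
#include <stdint.h>

/* Four banks as in FIG. 1; the 32-byte interleave unit is an assumption. */
enum { NUM_BANKS = 4, INTERLEAVE_BYTES = 32 };

/* Hypothetical helper: index (0..NUM_BANKS-1) of the bank holding addr. */
unsigned bank_of(uint64_t addr)
{
    return (unsigned)((addr / INTERLEAVE_BYTES) % NUM_BANKS);
}

/* Hypothetical helper: number of distinct banks touched by the
 * continuous address range [start, start + len). */
unsigned banks_touched(uint64_t start, uint64_t len)
{
    assert(len > 0);

    const unsigned all_banks = (1u << NUM_BANKS) - 1u;
    uint64_t first = start / INTERLEAVE_BYTES;
    uint64_t last = (start + len - 1) / INTERLEAVE_BYTES;
    unsigned mask = 0;

    /* Mark the bank of each interleave unit; stop once every bank is marked. */
    for (uint64_t unit = first; unit <= last && mask != all_banks; unit++) {
        mask |= 1u << (unsigned)(unit % NUM_BANKS);
    }

    /* Count the marked banks. */
    unsigned count = 0;
    for (unsigned b = 0; b < NUM_BANKS; b++) {
        count += (mask >> b) & 1u;
    }
    return count;
}
```

Under these assumptions, consecutive 32B units fall on consecutive banks, so a continuous 4 KB page spans all four banks, while a single 32B access touches only one.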
In the system shown in FIG. 1, a data copy on the main storage is implemented by the following procedures.
(1) Data is first fetched from the main storage banks 11 to 14 to a register (not shown) on the processor 1 through the processor bus 5 and the cache 3 in accordance with a load instruction from the processor 1.
(2) Next, the data on the register is stored at specified addresses of the main storage banks 11 to 14 in accordance with a store instruction from the processor 1.
(3) The above processing is repeatedly performed on the required amount of data.
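The three steps above can be sketched as a simple loop. This is only an illustration: the function name `copy_on_main_storage` and the 64-bit register width are assumptions, and an optimizing compiler may turn the loop into different instructions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: copies n_words words from src to dst, one
 * register-sized word at a time. The 64-bit word is an assumption. */
void copy_on_main_storage(uint64_t *dst, const uint64_t *src, size_t n_words)
{
    assert(dst != NULL && src != NULL);

    for (size_t i = 0; i < n_words; i++) {  /* step (3): repeat for all data */
        uint64_t reg = src[i];              /* step (1): load into a register */
        dst[i] = reg;                       /* step (2): store from the register */
    }
}
```

Every word crosses the processor bus twice, once for the load and once for the store, even though the processor never modifies it.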
Accordingly, the data moves back and forth over the processor bus 5 with almost no processing applied to it. Incidentally, the transfer of data in such main storage generally has the following features:
(1) Continuous data.
(2) Large amounts of data ranging from 4 KB to a few MB.
(3) No data is processed during data transfer, for example.
(4) It is of importance to an application handling massive scale data like a database that a data group aligned in page units of 4 KB is copied onto areas aligned in different page units of 4 KB.
In the aforementioned prior art, the data copy on the main storage is implemented by repeatedly executing the load instruction and the store instruction issued from the processor. Therefore,
(1) The data copy on the main storage consumes a considerable portion of the operation time of the processor.
When the prior art is viewed as a system, the processor bus becomes the performance bottleneck. Therefore,
(2) Even though the data copy on the main storage is basically continuous and could therefore keep all the main storage bank-buses active, idle time arises on each main storage bank-bus.
(3) When a plurality of processors are connected to the processor bus, a data copy on the main storage by one processor increases the utilization of the processor bus, which delays access to the main storage by the other processors.
Thus, an object of the present invention is to provide a computer system capable of efficiently executing a data copy on main storage asynchronously with, and independently of, the operation of each processor. Even when a plurality of processors are connected to a processor bus, a data copy on the main storage by an arbitrary processor can be executed without delaying access to the main storage by the other processors.