The particular function obtained by any computer program depends on the "conceptual sequence" of its instructions, i.e. the instruction sequence written into the program. The "conceptual sequence" of memory store and fetch operations is likewise determined by the written sequence of instructions. Each program is therefore expected to handle its instructions and their fetches and stores in their conceptual sequence, and conventional CPUs maintain the conceptual sequence of fetches and stores during program execution so that the program leaves in system storage the results which the program designer expects.
In a multiprocessing (MP) system, erroneous data may be fetched by any processor in the MP system if operand fetches by a processor are allowed to have an order different from the operand order specified by the instruction sequence of the executing program.
The erroneous data problem is illustrated by the simple case shown in FIG. 1A, as follows:
1. Programs A and B are being executed on different processors a and b respectively in the MP. Each processor completes fetches and stores in the order specified by the program it is executing, i.e. each processor accesses its operands in the conceptual order of its respective program.
2. Program A has an instruction sequence that includes a store into a location x (i.e. STx) followed by a store into a location y (i.e. STy). The store order is STx . . . STy.
3. Program B has a load instruction Ly that loads data Y from location y followed by a load instruction Lx that loads data X from location x. The load instruction order is Ly . . . Lx, which is the reverse of the store instruction order STx . . . STy by the other program on the other processor.
4. Case 1 through case 6 shown in FIG. 1A represent all possible combinations (YX, YX' or Y'X') of the operand data values fetchable by program B from locations x and y.
5. Which combination YX, YX' or Y'X' happens to be fetched by program B from locations x and y depends on the time at which program A does its stores, in relation to the fetching by program B. Any fetched combination is architecturally correct data if both programs A and B have accessed their operands in conceptual sequence.
6. But if any of the operands in program B is not accessed in its conceptual sequence (i.e. out-of-sequence, OOS), the OOS condition can cause erroneous data to be fetched, rather than the architecturally correct data required by the programs under the conceptual sequence architectural rules.
7. For example, if the conceptual sequence of programs A and B would cause case 1 to happen, the required resultant data is YX. But if in case 1 the first operand fetch of data Y is delayed by a cache miss, and the second operand fetch of data X is accessed in the cache without delay, then operand Y is obtained after the store to location y changes data Y to Y'. Hence, the OOS condition causes the architecturally impossible combination Y'X to be fetched instead of the combination YX required by the conceptual sequence architectural rules.
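The reasoning of items 1 through 7 can be checked mechanically. The following Python sketch is illustrative only and is not part of any disclosed system. It enumerates the six ways in which the two stores of program A and the two loads of program B can interleave, first with program B fetching in conceptual sequence and then with its two fetches swapped. The helper names (`interleavings`, `run`) and the string encoding of the data values are assumptions made for the example.

```python
from itertools import combinations

def interleavings(a_ops, b_ops):
    """Yield every merge of a_ops and b_ops that keeps each list's own order."""
    n = len(a_ops) + len(b_ops)
    for a_slots in combinations(range(n), len(a_ops)):
        a_iter, b_iter = iter(a_ops), iter(b_ops)
        yield [next(a_iter) if i in a_slots else next(b_iter) for i in range(n)]

def run(schedule):
    """Apply one global schedule to memory; return what program B fetched."""
    memory = {"x": "X", "y": "Y"}            # old values X and Y
    fetched = {}
    for op, loc in schedule:
        if op == "ST":
            memory[loc] = loc.upper() + "'"  # store changes X to X', Y to Y'
        else:                                # "L": load
            fetched[loc] = memory[loc]
    return fetched["y"] + fetched["x"]       # e.g. "YX'" means old Y, new X

PROGRAM_A = [("ST", "x"), ("ST", "y")]       # STx . . . STy
PROGRAM_B = [("L", "y"), ("L", "x")]         # Ly . . . Lx (conceptual sequence)
PROGRAM_B_OOS = [("L", "x"), ("L", "y")]     # the same fetches made out of sequence

in_order = {run(s) for s in interleavings(PROGRAM_A, PROGRAM_B)}
out_of_sequence = {run(s) for s in interleavings(PROGRAM_A, PROGRAM_B_OOS)}
```

Under these assumptions, `in_order` contains only YX, YX' and Y'X', while the swapped fetch order additionally admits Y'X, the combination described in item 7.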
To avoid violating the conceptual sequence architectural rules, prior computer systems maintained the conceptual sequence by not starting the execution of the next instruction in a program until the execution was complete for the adjacent prior instruction in the program sequence. Thus, a memory fetch or store for a next instruction in the program sequence was delayed until execution was completed for the prior instruction in the program sequence. All memory fetches and stores within any instruction were executed in the order specified by the architecture of the respective instruction.
However, the prior art discloses special cases where a CPU changed the actual sequence of fetching and storing operands from their conceptual sequence, and still got the correct program results. One prior technique was to detect any dependency on a prior store operand. This was done by comparing the address of each operand fetch request with the address of each prior uncompleted operand store request, and if none compared equal, no prior store dependency conflict existed.
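The address comparison just described can be sketched in a few lines of Python. This is a simplified illustration, not the logic of any particular prior CPU: it assumes each request is identified by a single address and ignores operand lengths and partial overlap. The function name `has_store_dependency` is hypothetical.

```python
def has_store_dependency(fetch_address, pending_store_addresses):
    """Return True if the fetch address equals any prior uncompleted store address."""
    return any(fetch_address == store for store in pending_store_addresses)

# Prior stores that have been issued but not yet completed (illustrative addresses).
pending_stores = [0x1000, 0x2040]

can_fetch_early = not has_store_dependency(0x3000, pending_stores)  # no address compares equal
must_wait = has_store_dependency(0x2040, pending_stores)            # fetch depends on a prior store
```

A fetch for which no address compares equal has no prior store dependency conflict and may be made ahead of the uncompleted stores.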
Large CPUs have for many years used particular types of instruction overlap. Such overlapped execution required various techniques, which allowed some degree of out-of-sequence execution. The prior overlapping techniques took many different forms, and each technique had its own control problems. Some of these techniques were used in pipelined CPUs including CPUs having multiple execution units. They used various types of dependency detection techniques to allow multiple instructions in various states of execution to avoid certain types of problems that could be encountered in overlapping their execution. These techniques used control logic to detect dependencies between instructions to assure the same execution results as would be obtained if these instructions had executed in a non-overlapped manner, one at a time in their conceptual sequence.
The prior systems used instruction-completion controls for controlling instruction overlap, which recognized the end-of-execution for each instruction, to correlate the fetch and store operands of the respective instructions.
Interruptions to a program have been conventionally allowed on the completion of the execution of most instructions, and before starting the execution of the next instruction in the sequence. All outstanding fetch and store operands must be obtained before any instruction execution can be completed, and before an interruption can be started in relation to such instructions, i.e. the interruptions are serialized with the instruction stream. Serialization prevents program interruptions from interfering with the sequencing of operand fetches and stores. Only instructions requiring long execution times have been allowed to have interruptions before their completion, and only at the completion of the then outstanding fetches and stores, which defined temporary instruction stopping points at which interruptions could be allowed.
Also in the prior art, the tagging of an operand fetch request to memory was done to assure the proper receipt of fetched data by a subset of CPU registers reserved for receiving the fetched data required in the execution of an instruction. When the fetch data was obtained from memory, and put on a common data bus to all registers, each register had a compare means to compare tags transmitted with the fetched data on the bus with the tags stored at the reserved registers. Only on compare-equal conditions was fetched data allowed into the reserved subset of registers.
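The tag comparison on the common data bus can be modelled as follows. The Python sketch is a software analogy under stated assumptions, not a description of the actual hardware: the `Register` class, the `broadcast` helper, and the choice to clear a reservation once its data has arrived are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Register:
    name: str
    reserved_tag: Optional[int] = None   # tag of the fetch this register is reserved for
    value: Optional[int] = None

    def snoop(self, bus_tag, bus_data):
        """Accept bus data only on a compare-equal of the bus tag and the stored tag."""
        if self.reserved_tag is not None and self.reserved_tag == bus_tag:
            self.value = bus_data
            self.reserved_tag = None     # assumption: reservation ends on receipt
            return True
        return False

def broadcast(registers, bus_tag, bus_data):
    """Put tagged data on the common bus; return the names of registers that took it."""
    return [r.name for r in registers if r.snoop(bus_tag, bus_data)]

registers = [Register("R1", reserved_tag=7), Register("R2", reserved_tag=9), Register("R3")]
accepted = broadcast(registers, bus_tag=7, bus_data=123)
```

In this example only R1 holds a matching tag, so only R1 receives the fetched data. R2 keeps waiting for tag 9, and the unreserved R3 ignores the bus.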
Also in prior systems, certain instructions were not allowed to use any overlap, such as the serializing instructions in the S/370 architecture that cannot start execution until all prior instructions have completed execution. A serialization operation includes completing all operand fetches and stores by prior instructions in the program sequence observed by other CPUs and by channel programs. Examples of such instructions are the S/370 compare and swap, test and set, etc. Many other serializing instructions are described in the IBM ESA/370 Principles of Operation (form number SA 22-7200-0) on pages 5-76 and 5-77.
U.S. Pat. No. 4,991,090 (owned by the same assignee as the subject application) entitled "Posting Out-Of-Sequence Fetches" discloses a monitoring means for a CPU execution unit that detects when a fetch request may have its data returned out of the conceptual sequence of the instructions which issued the respective fetch requests. A table (or stack) has entries for memory fetch requests. Each table entry contains fields representing a fetch request, including the memory address, a tag identifying its instruction, a full/empty flag to indicate if the fields in the entry are full, and a valid flag bit to indicate if a full entry is valid. Each entry remains in the stack until it is invalidated. When made invalid, an entry may then be used for a new fetch request. Test addresses are provided by store addresses, and by cross-invalidate (XI) request addresses. An entry is marked invalid if its fetch address field compares equal with the test address, in the manner of an Operand Store Compare operation. In addition, all entries in the stack are marked invalid upon the occurrence of a cache miss or a serializing event. The invalidation of an entry indicates it represents a fetch request that may be out-of-sequence.
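A minimal Python sketch of such a fetch table is given below. It follows only the summary in the preceding paragraph and is not taken from the cited patent itself. The class and method names, the table size, and the policy of reusing the first empty or invalid entry are assumptions.

```python
from dataclasses import dataclass

@dataclass
class FetchEntry:
    address: int = 0
    tag: int = 0          # identifies the instruction that issued the fetch
    full: bool = False    # full/empty flag
    valid: bool = False   # meaningful only while the entry is full

class FetchTable:
    def __init__(self, size):
        self.entries = [FetchEntry() for _ in range(size)]

    def post(self, address, tag):
        """Record a fetch request in the first empty or invalid entry, if any."""
        for index, entry in enumerate(self.entries):
            if not (entry.full and entry.valid):
                self.entries[index] = FetchEntry(address, tag, True, True)
                return index
        return None       # no entry available

    def test_address(self, address):
        """Apply a store or XI test address; invalidate and report matching entries."""
        hit_tags = []
        for entry in self.entries:
            if entry.full and entry.valid and entry.address == address:
                entry.valid = False
                hit_tags.append(entry.tag)
        return hit_tags

    def invalidate_all(self):
        """Cache miss or serializing event: every full entry becomes invalid."""
        for entry in self.entries:
            if entry.full:
                entry.valid = False

table = FetchTable(size=2)
table.post(0x100, tag=1)
table.post(0x200, tag=2)
possibly_oos = table.test_address(0x200)   # a store to 0x200 hits the second fetch
```

The tags returned by `test_address` identify fetch requests that may be out-of-sequence. What the processor then does with them is outside the scope of this sketch.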
A particular operand store compare problem is described in an article entitled "Handling of Fetches Subsequent to Unexpected Stores" published in the December 1985 issue of the IBM Technical Disclosure Bulletin on pages 3173 and 3174.
The basic von Neumann computer system architecture requires the instructions in a program to be in a "conceptual order", which is required to obtain an intended execution result for the program. Under this architecture, operands may reside in the main storage of the system (also called system memory, or just memory), which requires that the operands be fetched from and stored in the main storage. These same architectural requirements exist whether a program is executed in a uniprocessor system (UP) or in a multiprocessor system (MP).
As a consequence, the basic von Neumann computer system architecture places the "conceptual order" restriction on the program results, which depend on the temporal relationship of the fetching of operand data (called "fetches") with respect to other fetches and with respect to the storing of operand data (called "stores"). Since the result of program execution is affected by the conceptual order of its instruction operand accesses, the program execution result must not be changed if any operand access in storage is made out of the sequence required by the conceptual order of the instructions. In the past, operand data occurring later in the conceptual order has been accessed later in the actual sequence of storage accesses.