1. Field of the Invention
This invention is in the field of general-purpose digital data processing systems having a central processor which is optimized to achieve maximum throughput. The central processor includes a plurality of execution units, each of which executes a different subset of the instructions constituting the repertoire of the processor. The execution units are independent of each other and act in parallel, so that the execution of the instructions of a given program is overlapped. An instruction fetch unit fetches instructions from a cache unit and stores the instructions in an instruction stack. A central pipeline unit fetches the instructions of a program in program order from the instruction stack and, in a series of sequential steps, identifies the instruction, forms the address of the operand from the address portion of the instruction, and obtains the operand from the operand cache portion of the cache unit. The operand and the execution command are then distributed to one of the execution units for processing. A collector is provided in which the results of the execution of instructions of a given program by the execution units are received and stored in program order, together with a master copy of the program-addressable registers of the processor. Store instructions are executed by the collector, which writes operands or data words into the operand cache. Data so modified is retained in the cache until a need for that data arises outside the central processor, at which time the data is written into the random access memory (RAM) of the system.
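By way of illustration only, the role of the collector in receiving results from independent execution units and storing them in program order may be sketched as follows. This is a simplified software model, not the circuitry of the invention: the function name `collect_in_program_order`, the instruction labels, and the queue-based bookkeeping are all assumptions introduced for the example.

```python
from collections import deque


def collect_in_program_order(issue_order, completion_order):
    """Return the order in which results are committed.

    issue_order: instruction labels in program order (hypothetical labels).
    completion_order: the same labels in the order the execution units
    finish them, which may differ because the units act in parallel.
    """
    pending = deque(issue_order)  # instructions awaiting commit, oldest first
    finished = set()              # results received but not yet committed
    committed = []
    for instr in completion_order:
        finished.add(instr)
        # Commit every instruction at the head of the queue whose result
        # is ready; a younger result waits behind an unfinished older one.
        while pending and pending[0] in finished:
            committed.append(pending.popleft())
    return committed


program = ["load", "multiply", "add", "store"]
finish = ["load", "add", "multiply", "store"]  # "add" finishes before "multiply"
committed = collect_in_program_order(program, finish)
```

In this sketch the result of "add" is held until "multiply" has been committed, so the committed sequence matches the program order even though the execution units finished out of order.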
2. Description of the Prior Art
Typically, in large-scale, general-purpose digital data processing systems, the central processor of such a system includes circuits for producing the addresses of instruction words of a given program in the memory of the system, for fetching the instructions from memory, for preparing the addresses of operands, for fetching the operands from memory, for loading data into designated registers, for executing the instructions, and, when the results are produced, for writing the results into memory or into program-visible registers.
To increase the performance, i.e., the throughput, of such processors and of data processing systems as a whole, various modifications have been incorporated into the central processing units. To reduce the time required to obtain operands and instructions, high-speed caches located in the processor have been provided. To speed up the systems, their operation is synchronized, i.e., a clock produces clock pulses which control each step of the operation of the central processing unit. In pipelined processors, the steps of preparing and fetching the instructions and operands of a program are overlapped to increase performance.
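The benefit of overlapping the steps may be estimated with a short calculation. The sketch below assumes an idealized pipeline in which each step takes one clock period and no step is ever delayed; the function names and the figures of 100 instructions and 5 steps are chosen only for illustration and are not taken from any particular system.

```python
def sequential_cycles(n_instructions, n_steps):
    """Clock periods needed when each instruction completes all of its
    steps before the next instruction begins."""
    return n_instructions * n_steps


def pipelined_cycles(n_instructions, n_steps):
    """Clock periods needed when the steps of successive instructions are
    overlapped: the first instruction takes n_steps periods, and each
    later instruction completes one period after its predecessor."""
    if n_instructions == 0:
        return 0
    return n_steps + (n_instructions - 1)


without_overlap = sequential_cycles(100, 5)  # 500 clock periods
with_overlap = pipelined_cycles(100, 5)      # 104 clock periods
```

Under these assumptions the overlapped case approaches one completed instruction per clock period as the program grows longer; in practice, delays in obtaining operands or instructions reduce the gain.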
However, there is always a need for greater throughput, preferably without changing the instruction repertoire or the internal decor of the processors of such systems, so that such high-performance processors are capable of executing existing programs without modification, i.e., so that such processors are compatible with programs written for earlier systems.