High speed data processing systems are typically used in scientific applications. Such applications, numerical modeling being one example, require the execution of very large numbers of floating point operations. Consequently, the speed of the system, and particularly the execution speed of each floating point instruction, is a limiting factor in the utility of the data processing system.
Characteristically, high speed data processing systems are provided with high-speed hardware floating point units that directly perform such floating point operations as add, subtract, compare, and multiply. These systems typically utilize a pipelined architecture providing for a multistaged data flow that is controlled at each stage by its similarly staged instruction through the use of either micro-coded or hardwired control logic.
The multiple stage pipelined architecture allows multiple instructions to be processed concurrently, each generally offset by one stage from the next, thereby optimizing the utilization of available hardware. Each instruction, together with its corresponding result data, generally progresses to the successive stage with each clock cycle, provided that the preceding instruction either does not require, or has already completed, any iterative cycling within that stage.
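By way of illustration only, the stage-by-stage progression described above may be modeled in a few lines of Python. The sketch is a simplified behavioral model, not a description of any particular machine: the function name `simulate`, the three-stage depth, and the per-stage cycle counts are all assumptions introduced for the example. An instruction that needs more than one clock cycle in a stage (iterative cycling) holds that stage and stalls the instruction behind it.

```python
def simulate(num_stages, program):
    """Return one snapshot of stage occupancy per clock cycle.

    program is a list of (name, cost) pairs, where cost maps a stage index
    to the number of cycles the instruction must spend there (default 1).
    All names and costs here are hypothetical, chosen for illustration.
    """
    stages = [None] * num_stages
    pending = list(program)
    trace = []
    while pending or any(slot is not None for slot in stages):
        # Issue the next instruction when the first stage is free.
        if stages[0] is None and pending:
            name, cost = pending.pop(0)
            stages[0] = {"name": name, "cost": cost, "left": cost.get(0, 1)}
        trace.append([slot["name"] if slot else None for slot in stages])
        # Clock edge: walk from the last stage to the first so that a stage
        # vacated on this edge can be refilled on the same edge.
        for i in range(num_stages - 1, -1, -1):
            slot = stages[i]
            if slot is None:
                continue
            slot["left"] = max(slot["left"] - 1, 0)
            if slot["left"] > 0:
                continue  # still cycling iteratively within this stage
            if i == num_stages - 1:
                stages[i] = None  # instruction leaves the pipeline
            elif stages[i + 1] is None:
                slot["left"] = slot["cost"].get(i + 1, 1)
                stages[i + 1] = slot
                stages[i] = None
            # otherwise the next stage is busy and this instruction stalls
    return trace


ideal = simulate(3, [("A", {}), ("B", {}), ("C", {})])
stalled = simulate(3, [("A", {1: 2}), ("B", {}), ("C", {})])
```

In this model, three single-cycle instructions occupy a three-stage pipeline for five clock cycles, with all three stages busy in the third cycle. When instruction A is assumed to require two cycles in the second stage, instruction B is held in the first stage for one extra cycle and the sequence takes six cycles.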
The actual data flow through the pipelined architecture is controlled within each stage, for example, by micro-code stored in a control store table. This table is typically a memory device configured to act as a look-up table. The instruction itself acts as a key, or address pointer, into the control store table to select a corresponding micro-code control word. The control word, in turn, enables a specific data path through the stage's circuitry to perform the desired data manipulation function, such as normalizing a floating point operand word.
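The look-up described above may be sketched as follows, again for illustration only. The control bit names, the two-bit key, the eight-bit mantissa, and the table contents are hypothetical and are not taken from any actual machine. The sketch shows an instruction key indexing a control store table, and the selected control word enabling the individual parts of a toy normalizing stage.

```python
# Hypothetical control bits; each one enables a single function of the stage.
ENABLE_LZ_COUNT = 1 << 0    # count leading zeros of the mantissa
ENABLE_SHIFTER = 1 << 1     # shift the mantissa left by that count
ENABLE_EXP_ADJUST = 1 << 2  # reduce the exponent by the same count

KEY_BITS = 2                # assumed width of the instruction key for this stage
MANT_BITS = 8               # assumed mantissa width of the toy operand
MANT_MASK = (1 << MANT_BITS) - 1

# Control store table: the instruction key is the address, the entry is the control word.
CONTROL_STORE = [
    0,                                                    # key 0: pass operand unchanged
    ENABLE_LZ_COUNT | ENABLE_SHIFTER | ENABLE_EXP_ADJUST,  # key 1: normalize operand
    0,                                                    # key 2: unused in this sketch
    0,                                                    # key 3: unused in this sketch
]


def lookup_control_word(instruction):
    """Use the low KEY_BITS of the instruction as an address into the table."""
    key = instruction & ((1 << KEY_BITS) - 1)
    return CONTROL_STORE[key]


def run_stage(instruction, mantissa, exponent):
    """Apply whichever stage functions the selected control word enables."""
    word = lookup_control_word(instruction)
    shift = 0
    if word & ENABLE_LZ_COUNT and mantissa:
        shift = MANT_BITS - mantissa.bit_length()
    if word & ENABLE_SHIFTER:
        mantissa = (mantissa << shift) & MANT_MASK
    if word & ENABLE_EXP_ADJUST:
        exponent -= shift
    return mantissa, exponent


normalized = run_stage(1, 0b00010110, 10)
untouched = run_stage(0, 0b00010110, 10)
```

With these assumed values, key 1 shifts the mantissa 0b00010110 left by three places to 0b10110000 and lowers the exponent from 10 to 7, while key 0 selects an all-zero control word and leaves the operand unchanged.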
The bit width of an instruction, or the key portion thereof for a given stage, effectively determines the number of instructions that can be recognized at that particular stage. However, the bit width of the control store table and, therefore, the length of the micro-coded control word corresponding to a particular instruction may be of any desired length. Since each bit in the micro-coded control word can be used to control a particular functional aspect of its associated stage, the micro-code control store may have any desired width necessary to support the desired level of stage complexity and functionality. Accordingly, each stage may implement a substantial number of relatively independent functions substantially in parallel, thus minimizing the number of stages required to implement any given instruction. Thus, the instruction execution speed of pipelined architecture data processing systems is typically greater than that of comparable non-pipelined systems having a similar number of execution stages.
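The relationship between key width and control word width can be made concrete with a short calculation. The function name and the example widths below are assumptions chosen only to illustrate the arithmetic: a key of n bits addresses 2^n table entries, while the control word width is chosen independently and affects only the total size of the table.

```python
def control_store_geometry(key_bits, word_bits):
    """Return (number of addressable entries, total table size in bits)."""
    entries = 1 << key_bits          # a key of n bits distinguishes 2**n instructions
    return entries, entries * word_bits


# Hypothetical stage: 6-bit key, first with a 48-bit and then a 96-bit control word.
narrow = control_store_geometry(6, 48)
wide = control_store_geometry(6, 96)
```

Under these assumptions, doubling the control word width doubles the storage required but leaves the number of recognizable instructions at 64.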
Enhancements in the execution speed of such pipelined architecture systems are naturally desirable. However, the architectural designs for such machines, though varied between different classes of machines, are generally well developed within each class. Thus, most enhancements of execution speed within a given class are obtained through the use of faster components and improved packaging and processing technology. This, in turn, may require further optimization of the architectural design of the system and, potentially, the development of a new class of machines. Thus, while there remains room for improvement, the incremental advances in speed of execution are increasingly burdened by the cost and complexity of their implementation.