The present invention relates to a method for optimizing computer system performance and, in particular, to optimizing computer system performance by programming micro word cycle length.
In sophisticated computer systems, especially in larger processors, there is often a need to execute a great number of operations in the shortest possible time. It has been found that as cycle length (i.e., the time required to perform the simplest operation) decreases to the millisecond and even nanosecond range, even a short delay between executions of operations can become a significant factor in overall system operation. When thousands or millions of operations are performed each second, a wasted segment of time at the cycle level accumulates into an appreciable degradation of system performance over hours, months or years.
In systems that have a plurality of processors or one or more processors used in conjunction with a plurality of other (e.g., peripheral) devices, it would be helpful to predict the amount of time required to perform certain operations, thus eliminating a requirement for processors to be inactive or non-operational while such operations are performed.
The prior art is replete with examples of inadequate solutions to the aforementioned problem. Predetermining the amount of delay time for a processor for each operation, for example, can result in estimating too short a time, in which case secondary devices connected to the processor may have insufficient time to complete their respective operations, resulting in malfunctions. Conversely, in order to allocate enough time for secondary devices to execute their tasks, too much time may be reserved, resulting in occasional or even chronic delay. Neither of these cases represents optimum performance efficiency.
The aforementioned problem is especially troublesome in a vector processor, which typically includes a plurality of vector registers, each vector register storing a vector having a plurality of vector elements. A pipeline processing unit is connected to a selector associated with the vector registers for receiving corresponding elements of a first vector from a first vector register and utilizing the corresponding elements to perform an arithmetic operation on the corresponding elements of a second vector stored in a second vector register. The results of the arithmetic operation are stored in corresponding locations of one of the vector registers or in corresponding locations of a third vector register.
As a result of increasing sophistication of computer systems, the need exists to increase the performance of the vector processor portion of the computer system by decreasing time required to process or perform arithmetic operations on each of the corresponding elements of the plurality of vectors stored in the vector registers.
If the vectors include 128 elements, for example, 128 operations must be performed in sequence. The time required to complete operations on all 128 elements of the vector is a function of the cycle time per operation of the pipeline unit as it operates on each of the corresponding elements.
Each operation can require a unique predetermined time period in which to execute. Moreover, each secondary device has its own operating characteristics. The cycle length value is therefore a function both of the operating characteristics of the vector processor and of those of the secondary device.
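The dependence of total vector processing time on per-operation cycle length can be illustrated with a short calculation. The following is a minimal sketch only: the operation names and the nanosecond values are hypothetical, chosen for illustration, and are not taken from any particular processor. It compares a single worst-case cycle length applied to every operation against a cycle length selected per operation, for vectors of 128 elements.

```python
# Hypothetical cycle lengths in nanoseconds; illustrative values only.
CYCLE_NS = {"add": 40, "multiply": 60, "divide": 120}

VECTOR_LENGTH = 128  # elements per vector, as in the 128-element example


def total_time_fixed(ops, vector_length=VECTOR_LENGTH):
    """Total time when every operation uses the single worst-case cycle length."""
    worst = max(CYCLE_NS.values())
    return len(ops) * vector_length * worst


def total_time_per_operation(ops, vector_length=VECTOR_LENGTH):
    """Total time when each operation uses its own cycle length."""
    return sum(CYCLE_NS[op] * vector_length for op in ops)


ops = ["add", "multiply", "add"]
fixed_ns = total_time_fixed(ops)
per_op_ns = total_time_per_operation(ops)
print(fixed_ns, per_op_ns, fixed_ns - per_op_ns)
```

Under these assumed values, the difference between the two totals is the time that a fixed worst-case cycle length would leave unused.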
U.S. Pat. No. 4,456,964 issued to Olander, Jr. et al and U.S. Pat. No. 4,412,300 issued to Watson et al disclose an electronic calculator that contains micro instructions and codes to perform basic functions of the calculator. The micro instructions include a plurality of coded and non-coded micro instructions for transferring control to an input/output control unit, for controlling the addressing and accessing of a memory unit, and for controlling the operation of two accumulator registers, a program counter register, an extend register and an arithmetic logic unit. The micro instructions also include a plurality of clock codes for controlling the operation of a programmable clock, a plurality of qualifier selection codes for selecting qualifiers and serving as primary address codes for addressing the read only memory of the microprocessor and a plurality of secondary address codes for addressing the read only memory of the microprocessor. The micro words can be programmed for shift register timing.
U.S. Pat. No. 4,439,829 issued to Tsiang discloses a data processing machine having cache memory and a management system therefor. The length of a micro instruction cycle of a central processor varies according to the nature of the micro instruction. To determine the number of pulses to be generated for a cycle, control signals of the micro instruction controlling the central processor are input to a decoder and counter. A hardware decoder is therefore required for operation of the Tsiang system.
U.S. Pat. No. 4,099,229 issued to Kancler discloses a variable architecture digital computer. An increment multiple cycle counter (IMCC) bit or field increments a multiple cycle counter in a control module which is used in operations requiring repetition of a set of micro instructions such as shifting or multiplying. The clock signal in conjunction with a 2-bit micro multiplexer (MMX) field reduces the system clock rate so that operations which encounter extensive logic delays within the computer may be used. In the Kancler system, a value must first be placed in a counter. Only then can timing information with data be loaded into each micro word.
It would be advantageous to provide a system for allowing certain time values to be programmed within a micro word so that there will be a minimum amount of time wasted between operations.
It would be advantageous to match or correlate such programmable delay time to the optimal response time or performance time of other components.
Moreover, it would be advantageous to provide a system for predetermining the amount of time per instruction required for complete operation or execution.
It would also be advantageous to provide a system in which delay time or execution time could be integrally carried with the micro word instruction corresponding thereto.
It would also be advantageous to provide a timing or counting mechanism to generate a signal to indicate when a predetermined time interval has expired.
It would further be advantageous to provide a system that allows processor operations to be performed while a timing or counting mechanism measures a predetermined time interval.
It would also be advantageous to provide a system that allows a processor to execute a subsequent instruction when a counting mechanism indicates that the time interval required for execution of the previous instruction has expired.
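The desired behavior enumerated above can be sketched in software. The following is a simplified model offered for illustration only, not a description of any actual hardware: the names MicroWord, CycleCounter and run, and the tick counts used, are hypothetical. Each micro word carries its own cycle length, a counter is loaded with that value, and the next micro word begins as soon as the counter signals that the interval has expired.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class MicroWord:
    """Hypothetical micro word: an operation together with its own cycle length."""
    name: str
    operation: Callable[[], None]
    cycle_length: int  # clock ticks allotted to this operation (assumed unit)


class CycleCounter:
    """Counts down from a loaded value and reports when the interval has expired."""

    def __init__(self) -> None:
        self.remaining = 0

    def load(self, ticks: int) -> None:
        self.remaining = ticks

    def tick(self) -> bool:
        """Advance one clock tick; return True once the interval has expired."""
        if self.remaining > 0:
            self.remaining -= 1
        return self.remaining == 0


def run(program: List[MicroWord]) -> List[Tuple[str, int]]:
    """Execute micro words in order; return (name, start_tick) for each."""
    counter = CycleCounter()
    clock = 0
    trace = []
    for word in program:
        trace.append((word.name, clock))
        counter.load(word.cycle_length)  # cycle length is taken from the micro word itself
        # In hardware the operation would proceed while the counter runs;
        # in this model it is simply invoked before the countdown begins.
        word.operation()
        while True:
            clock += 1
            if counter.tick():  # expiry signal releases the next micro word
                break
    return trace


program = [
    MicroWord("load", lambda: None, 2),
    MicroWord("multiply", lambda: None, 5),
    MicroWord("store", lambda: None, 3),
]
trace = run(program)
print(trace)
```

In this model each micro word starts on the tick at which the previous interval expires, so no time is reserved beyond the cycle length carried by the preceding micro word.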