The present invention generally relates to general-purpose digital data processing systems, and more particularly relates to such systems that employ memories for storing microcode in an instruction processor. The present invention includes devices and methods for measuring performance of microcoded computer systems.
The integration of modern computer systems has been facilitated by the rapid increase in density of modern integrated circuits and printed circuit boards. The integration of computer systems has a number of advantages, including increased performance, lower power consumption, greater reliability, and reduced cost.
A difficulty with increased integration is that hardware errors may be difficult and/or expensive to correct, particularly during the design cycle of a computer system. Such errors may include logic errors, timing errors, or any other type of error that reduces the effectiveness of the computer system. These errors are typically found during design verification, but may be found much later, even after the computer system is shipped to customers.
In the past, mechanical methods were used to make hardware corrections. These mechanical methods include providing jumper wires, re-fabricating a printed circuit board, interchanging an integrated circuit, etc. However, with the increased integration of computer systems, mechanical methods of correcting hardware errors are often not practical (i.e., too expensive) or not even possible. A primary source of this difficulty is that the internal hardware is simply not accessible. For example, to correct a hardware error in an ASIC (Application Specific Integrated Circuit) within the design, it may be necessary to create a new set of masks and re-fabricate the integrated circuit before further verification can continue. This not only can be expensive, but can have a long turn-around time. Likewise, because many of today's printed circuit boards are multi-layered, it may not always be possible to access a trace to correct a hardware error. Thus, it may be necessary to re-fabricate the printed circuit board before further verification can continue. This may also be relatively expensive and can have a long turn-around time.
For these and other reasons, most modern computer systems use microcode to control the major data paths and control points within a computer system. This may allow a system designer to provide a workaround for many of the errors that are detected by simply modifying the microcode. Thus, many of the hardware errors may be corrected, at least for further verification purposes, by changing the microcode. This may allow the verification process to continue, and the system designer may continue to identify other hardware errors in the design, if any.
After the verification process is completed, the system designer may correct the known hardware errors in a single pass. This may significantly reduce the design cycle time of modern computer systems. In addition, in many cases only a few functions may be affected by a hardware error, and the microcode workarounds may be sufficient until the next design revision of the computer system is released.
To implement the microcode control, typical computer systems include an instruction processor that may have an instruction cache, a decoder block, and a microcode RAM. Typically, an instruction is read from the instruction cache and is decoded by the decoder. The decoder then provides a decoded address to the microcode RAM. A microcode instruction may include one microcode instruction word or may be an extended instruction having several microcode instruction words executed sequentially. The microcode RAM then provides a corresponding microcode instruction to the data processing system, including a number of control signals for controlling the major data paths and control points therein. External control signals are provided to the address decoding hardware to aid in selecting which microcode instruction should be executed. For example, different microcode instruction words may be executed based on the contents of the cache, attempted security violations, and register flag values. The exact route taken through the microcode may vary depending upon external conditions and may vary from execution to execution for the exact same piece of machine code. It may not be known how often certain microcode instruction words are executed, or even whether they are ever executed. It may be desirable to improve execution of certain microcode sequences by replacing or augmenting the microcode execution with dedicated hardware or specialized circuitry. By measuring the relative frequency of use of various microcode instructions, it may be possible to identify bottlenecks in execution that are likely candidates for hardware acceleration.
What would be desirable, therefore, is a system for counting the number of times selected microcode instructions and instruction words are executed, if they are executed at all. What would also be advantageous is a device for determining the relative number of times each of several microcode branches is taken during execution of a complex instruction. What would also be desirable is a method for selecting certain microcode instructions and measuring the frequency with which the selected instructions are executed, to determine whether optimizing or accelerating execution of these instructions is warranted.
The present invention is preferably used in computer systems having machine code instructions executed through microcode. An illustrative system suitable for use with the present invention includes a machine code register for holding a machine code instruction, coupled to a second machine code register for extracting or stripping out the data needed to identify the associated microcode. In one embodiment, the machine code operator portion is extracted along with any needed operand type information to further distinguish the type of operator. The operator data is used as an address into an ID translation table, typically implemented in RAM. The ID translation table serves to provide an address into a microcode instruction word table, which stores the actual starting microcode instruction word to be executed. The address into the microcode instruction word table can be passed first to an address generator, which can either pass the microcode instruction address through or provide an alternate address, discussed below. Given the address into the microcode instruction word table, one microcode instruction word can be extracted into a microcode instruction word register, which in turn can be fed into a microcode controller for generating the multiplicity of control signals required to execute the instruction.
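By way of illustration only, the lookup path described above can be modeled in software. The following Python sketch is not part of the disclosed hardware; the 16-bit instruction width, the 8-bit operator field, the table contents, and all function names are assumptions chosen for the example.

```python
# Hypothetical table contents; none of these values come from the disclosure.
ID_TRANSLATION_TABLE = {0x10: 0, 0x11: 2}   # operator data -> microcode address
MICROCODE_WORD_TABLE = ["uword_add_0", "uword_add_1", "uword_load_0"]


def extract_operator(machine_instruction: int) -> int:
    """Strip out the operator data (assumed: top 8 bits of a 16-bit word)."""
    return (machine_instruction >> 8) & 0xFF


def address_generator(address: int, alternate=None) -> int:
    """Pass the address through, or substitute an alternate address."""
    return address if alternate is None else alternate


def fetch_microcode_word(machine_instruction: int, alternate=None) -> str:
    """Model the path: operator -> ID translation table -> word table."""
    operator = extract_operator(machine_instruction)
    start_address = ID_TRANSLATION_TABLE[operator]
    return MICROCODE_WORD_TABLE[address_generator(start_address, alternate)]
```

In this model, `fetch_microcode_word(0x1134)` extracts operator `0x11`, translates it to address 2, and returns the word stored there; supplying `alternate` stands in for the address generator providing an alternate address.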
The present invention preferably includes the use of a bit field in the microcode instruction word and microcode instruction word register, which can have a length sufficient for the purpose of the present invention. The bit field includes an event counter selection field for selecting which, if any, event counter is to be incremented when a corresponding bit is set. The invention includes one or more event counters to count the execution of microcode instruction words having the proper bit set.
In one illustrative system, one bit is used to designate one event counter, such that the number of event counters can be equal to the number of bits in the event counter bit field, and such that more than one bit can be set and counted in different event counters in the same execution. In another system, the number of event counter selection bits is less than the number of event counters, with the bit field being used to encode the number or address of the event counter to be incremented. For example, the bit field may be interpreted as a base two number used to calculate the address of the event counter. Similarly, the event counter bit field may be three bits long and read by a 3-to-8 decoder to select one of seven event counters to increment, with a zero value meaning that no event counter is to be incremented. To add flexibility to the system, the maintenance processor may be connected to the microcode instruction RAM for downloading modified microcode instruction words having different bits set in the bit field, allowing different microcode instruction words to be counted.
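The two selection schemes can be contrasted with a short software model. This is a minimal sketch under stated assumptions: the function names are hypothetical, the counters are modeled as a Python list, and the encoded scheme assumes the three-bit, seven-counter arrangement described above.

```python
def increment_one_hot(counters, bit_field):
    """One bit per counter: every set bit increments its own counter."""
    for index in range(len(counters)):
        if bit_field & (1 << index):
            counters[index] += 1


def increment_encoded(counters, bit_field):
    """Encoded three-bit field: 0 selects nothing, 1..7 select counters 0..6."""
    selection = bit_field & 0b111
    if selection != 0:
        counters[selection - 1] += 1


one_hot_counters = [0, 0, 0]
increment_one_hot(one_hot_counters, 0b101)   # two counters in one execution

encoded_counters = [0] * 7
increment_encoded(encoded_counters, 0b011)   # selects the third counter
increment_encoded(encoded_counters, 0b000)   # zero: nothing is incremented
```

The one-hot form allows several counters to be incremented by a single microcode instruction word, while the encoded form addresses more counters with fewer bits but selects at most one per word.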
In use, an existing production instruction processor board or boards can be replaced with a specialized instruction processor board or boards including the present invention. The specialized board can include a longer microcode instruction word length in both the microcode RAM table and the microcode instruction register. As indicated above, the microcode instruction words may be downloaded through the maintenance processor into the microcode instruction RAM. Event counter bits are preferably set in those microcode instruction words for which counting is desired. Microcode instructions can be grouped together and given identical event counter bit field values for some applications. With the microcode instruction words loaded into RAM, computer programs can be run to force the microcode to execute. Maintenance hardware can then be used to copy the values of all event counters into a set of event counter save registers in the same single clock pulse, as a snapshot of system performance. This can allow for accurate comparison between the event counter values. The values of the event counters stored in the save registers can be read out serially over several clock pulses and analyzed.
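The snapshot behavior can be sketched as follows. The class and method names are hypothetical, and a software copy only approximates the single-clock-pulse transfer performed by the maintenance hardware; the point illustrated is that the saved values stay mutually consistent while counting continues.

```python
class EventCounters:
    """Software model of live event counters plus their save registers."""

    def __init__(self, count):
        self.live = [0] * count
        self.saved = [0] * count

    def count_event(self, index):
        """Increment one live event counter."""
        self.live[index] += 1

    def snapshot(self):
        """Copy every live counter at once (models the single clock pulse)."""
        self.saved = list(self.live)

    def read_out(self):
        """Yield the saved values one at a time (models serial readout)."""
        for value in self.saved:
            yield value


counters = EventCounters(3)
counters.count_event(0)
counters.count_event(2)
counters.snapshot()
counters.count_event(0)              # counted after the snapshot
snapshot_values = list(counters.read_out())
```

Because the readout is taken from the save registers, the event counted after the snapshot does not disturb the values being compared.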
Accordingly, the present invention can be used in analyzing the number of executions of microcode instruction words where the number of executions, or even the occurrence of a single execution, of certain microcode instruction words cannot be determined a priori from examination of the machine code source alone. In one example, machine code leads to an initial location in the microcode RAM table from which execution can branch, depending on the values of external control signals which cannot be known at compile time or load time of the program. Examples of external signals include the presence of an operand in cache or even in memory, the value of arithmetic flags set by a previous operation, the value of security and privilege flags depending on the user and the state of the machine, etc. The branches that are actually taken through the microcode can be counted by inserting event counter bits in the event counter fields of various microcode instruction words and counting how often, if ever, certain microcode instruction words are executed. In another example, some microcode instructions are extended instructions in which one instruction word contains the address of the next microcode instruction word, where the next address can be conditionally chained, depending on the value of the external signals.
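A small simulation illustrates why the counts cannot be predicted from the machine code alone. In this sketch the microcode layout, the counter names, and the single "operand in cache" signal are all assumptions made for the example; each word carries an optional event counter selection and a conditionally chained next address.

```python
from collections import Counter

# address -> (event counter name or None, next if signal set, next if clear)
MICROCODE = {
    0: (None, 1, 2),          # entry word: branches on the external signal
    1: ("hit", None, None),   # counted path taken when the operand is cached
    2: ("miss", 3, 3),        # counted path taken when it is not
    3: (None, None, None),    # uncounted final word of the miss path
}


def execute(start_address, operand_in_cache, counts):
    """Walk the chained microcode words, counting each selected word."""
    address = start_address
    while address is not None:
        counter_name, next_if_set, next_if_clear = MICROCODE[address]
        if counter_name is not None:
            counts[counter_name] += 1
        address = next_if_set if operand_in_cache else next_if_clear


branch_counts = Counter()
for signal in (True, True, False):   # same machine instruction, three runs
    execute(0, signal, branch_counts)
```

The same starting address produces different counted paths on different runs, so only the event counters reveal how often each branch was actually taken.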
The present invention can thus be used to determine the relative frequency of microcode instruction word execution. Instructions that are frequently executed may be selected for optimization or hardware acceleration. Further, specialized instructions that are found to be rarely or never executed during days of testing under conditions at a user site may be removed, and/or the hardware supporting these instructions may be removed from subsequent systems.
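Once the saved counter values have been read out, the analysis reduces to simple arithmetic. The following sketch assumes the values are available as a mapping from hypothetical counter names to counts; the function names are likewise illustrative.

```python
def relative_frequencies(saved_counts):
    """Return each counter's share of the total number of counted events."""
    total = sum(saved_counts.values())
    if total == 0:
        return {name: 0.0 for name in saved_counts}
    return {name: value / total for name, value in saved_counts.items()}


def never_executed(saved_counts):
    """List counters that recorded no executions during the test run."""
    return [name for name, value in saved_counts.items() if value == 0]


sample_counts = {"path_a": 3, "path_b": 1, "path_c": 0}   # made-up values
shares = relative_frequencies(sample_counts)
unused = never_executed(sample_counts)
```

Counters with a large share point to candidates for optimization or hardware acceleration, while counters that remain at zero point to candidates for removal.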