In the last year, International Business Machines Corporation introduced a new generation of S/390 ESA CMOS machines known as the G4 generation. That generation introduced a pipelined computer processor which provides for the use of millicode and which, in a milli-mode architected state, tests the validity of a program status word against a mask stored in a millicode general register (MGR). The mask indicates the bits in the program status word which must be zeros if the word is valid. A logical AND operation is performed between correspondingly positioned bits in the word and in the mask and, in addition, the status of at least one other bit in the word is checked, that is, a bit other than a correspondingly positioned bit.
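The validity test described above can be sketched as follows. This is a minimal illustration only, not the millicode instruction itself: the function name, the least-significant-bit numbering of `extra_bit_pos`, and the idea that the additional bit is compared against a single required value are assumptions made for the example.

```python
def psw_is_valid(psw: int, zero_mask: int,
                 extra_bit_pos: int, extra_bit_required: int) -> bool:
    """Hypothetical sketch of a mask-based PSW validity test.

    zero_mask has a 1 in every position where the PSW must hold a 0.
    extra_bit_pos (counted from the least significant bit, an assumption
    of this sketch) names one further bit that is checked separately.
    """
    # Step 1: AND correspondingly positioned bits of the word and the mask.
    # Any surviving 1 means a bit that must be zero is set.
    if psw & zero_mask:
        return False

    # Step 2: check the status of one other bit, outside the simple AND.
    extra_bit = (psw >> extra_bit_pos) & 1
    return extra_bit == extra_bit_required


if __name__ == "__main__":
    mask = 0b0011  # the two low-order bits must be zero
    print(psw_is_valid(0b0100, mask, extra_bit_pos=2, extra_bit_required=1))
    print(psw_is_valid(0b0101, mask, extra_bit_pos=2, extra_bit_required=1))
```

The first call passes both checks; the second fails because a bit covered by the mask is set.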
A milli-mode operation enables implementation of complex functions in a large, hardware controlled, pipelined, general purpose digital computer without a microprocessor. Milli-mode implements these complex functions with the flexibility provided by firmware and avoids a packaging problem introduced by the inclusion of microprocessor hardware. Rather than a microprocessor, milli-mode uses the preexisting dataflow and hardware controlled execution units of a pipelined processor to accomplish complex functions. Additional hardware controlled instructions (private milli-mode only instructions) are added to provide control functions or to improve performance. These private milli-mode instructions augment the architected instruction set. Milli-mode routines can intermingle the milli-mode only instructions with architected instructions to implement complex functions, as illustrated by U.S. Pat. No. 5,694,587, issued Dec. 2, 1997. U.S. Pat. No. 5,694,587 describes specialized millicoded instructions for a PSW Validity Test, Load With Access Test, and Character Translation Assist, which were employed in the IBM machine known as the G4 S/390 machine introduced in 1997. Related to U.S. Pat. No. 5,694,587 were additional millicode applications, which are implemented in the same G4 S/390 machine introduced in 1997 by International Business Machines Corporation. These were:
Application Ser. No. 08/414,821, filed Mar. 31, 1995, entitled "Millicode Read-Only Storage With Entry Point Patch Control." A divisional application Ser. No. 08/455,820, filed May 31, 1995, (now U.S. Pat. No. 5,625,808 issued Apr. 29, 1997) entitled "Read Only Store as Part of Cache Store for Storing Frequently Used Millicode Instructions."
Application Ser. No. 08/414,977, filed Mar. 31, 1995, (now U.S. Pat. No. 5,673,391 issued Sep. 30, 1997) entitled "Hardware Retry Trap for Millicoded Processor."
Application Ser. No. 08/414,158, filed Mar. 31, 1995, (now U.S. Pat. No. 5,680,598 issued Oct. 21, 1997) entitled "Addressing Extended Memory Using Millicode."
Application Ser. No. 08/414,812, filed Mar. 31, 1995, entitled "Mapping Processor State Into A Millicode Addressable Processor State Register Array," which was abandoned in favor of a File Wrapper Continuation application Ser. No. 08/892,068 filed Jul. 14, 1997, same title, now U.S. Pat. No. 5,802,359 issued Sep. 1, 1998.
Application Ser. No. 08/414,164, filed Mar. 31, 1995, (now U.S. Pat. No. 5,713,035 issued Jan. 27, 1998) entitled "Linking Program Access Register Number With Millicode Operand Access."
Application Ser. No. 08/414,975, filed Mar. 31, 1995, (now U.S. Pat. No. 5,694,617 issued Dec. 2, 1997) entitled "Priority and Recovery Method For System Serialization (Quiesce)."
However, we discovered that further improvements could still be made in the millicode environment. In a new machine we will use a new modality for millicode implementation which employs a group of new millicode improvements. The current group of improvements is described together here so that their relationship to one another can be understood.
To illustrate, we have obtained an improvement in a processor implementing the International Business Machines Corporation ESA/390 architecture and using millicode as internal code for complex operations, by providing a new and efficient means to answer the need to set and test various conditions. In the past, various forms of status bits and condition codes have been used in internal code for this purpose. The millicode copy of the ESA/390 condition code and the corresponding Branch on Condition instructions are limited by the 2-bit format and by the broadly general usage, which prevents the condition code from holding information across many instructions. Bits of millicode registers can be defined to record various conditions, and can be connected to branch points, but this creates a problem in a pipelined processor which limits their usefulness in performance-sensitive millicode routines. Alternatively, status bits in millicode registers can be explicitly tested via the millicode condition code and Branch on Condition instructions, but this requires multiple instructions, which can impact performance in certain routines. This scheme also requires separate millicode instructions to manipulate the flags in response to the detection of various conditions, again impacting performance in some cases. For our preferred embodiment in the new modality for satisfying the need to set and test various conditions, refer to the section below entitled "Millicode Flags with Specialized Update and Branch Instructions."
For another improvement we have focused on the ESA/390 instructions Edit and Edit and Mark, which process a string of characters and decimal digits using a second string as a pattern. Generally, it has been recognized that in an ESA/390 implementation which uses millicode as its internal code, handling all of the cases and states defined by ESA/390 requires a significant number of CP cycles. This impacts the performance of these ESA/390 instructions, and thus of programs which make use of these instructions. Past S/390 (and its predecessors) CPUs have used a variety of algorithms to execute the Edit (ED) and Edit and Mark (EDMK) instructions. For the most part these have used internal code, and in some cases special internal code instructions have been defined to accelerate this function. There has been a need for a new generation machine to improve the performance of these instructions. For the key modality for this purpose, refer to the section below entitled "Specialized Millicode Instruction for Editing Functions".
Now, even with our recent improvements as illustrated by U.S. Pat. No. 5,694,587, issued Dec. 2, 1997, the ESA/390 instruction Translate and Test (TRT) requires updates to two General Registers (GRs) and the condition code to reflect the results of the operation. Past S/390 (and its predecessors) CPUs have used a variety of algorithms to execute the Translate and Test (TRT) instruction. For the most part these have used internal code, and in some cases special internal code instructions have been defined to accelerate this function. In an ESA/390 implementation which uses millicode as its internal code, the computation and propagation of these results requires a significant number of CP cycles even after the translation and testing of operand bytes is complete. This impacts the performance of the Translate and Test instruction, and thus of programs which make use of this instruction. So in order to improve performance, we have provided a new and specialized instruction described in the section below entitled "Specialized Millicode Instruction for Translate and Test."
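To make the result handling concrete, the following is a minimal sketch of the architected behavior of Translate and Test as commonly documented for ESA/390. It is not the specialized millicode instruction referred to above. The function name and the returned tuple are assumptions of the sketch: the tuple stands in for the condition code, the address placed in GR1, and the function byte placed in the low-order byte of GR2.

```python
from typing import Optional, Tuple


def translate_and_test(storage: bytes, first_addr: int, length: int,
                       table: bytes) -> Tuple[int, Optional[int], Optional[int]]:
    """Hypothetical model of TRT results: (cc, gr1_address, gr2_function_byte).

    Each argument byte indexes a 256-byte table of function bytes.
    Scanning stops at the first nonzero function byte.
    """
    for offset in range(length):
        addr = first_addr + offset
        function_byte = table[storage[addr]]
        if function_byte != 0:
            # cc 2: the nonzero function byte belongs to the last argument byte.
            # cc 1: it was found before the last argument byte.
            cc = 2 if offset == length - 1 else 1
            return cc, addr, function_byte
    # cc 0: every function byte was zero; the registers are left unchanged,
    # which this sketch represents with None.
    return 0, None, None


if __name__ == "__main__":
    table = bytearray(256)
    table[ord(",")] = 0x04  # mark the comma as a delimiter of interest
    print(translate_and_test(b"AB,CD", 0, 5, bytes(table)))
```

Even in this simplified form, three separate results must be derived once the scan stops, and that post-scan work is the cost this paragraph attributes to the millicode implementation.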
Lastly, in this description we have improved on packed decimal division. Past S/390 (and its predecessors) CPUs have used a variety of algorithms to execute the Divide Decimal (DP) instruction. For the most part these have used internal code, and few if any internal code instructions have been defined to accelerate this function. Packed decimal division is a computationally complex operation, particularly when the only packed decimal arithmetic hardware available is an adder (as is commonly the case). Because of this, the internal code sequences required to support the ESA/390 instruction Divide Decimal (DP) can require a large number of cycles. In an application which makes even moderately frequent use of this operation, the time spent executing DP instructions can have a significant negative impact on processor performance. For this purpose please refer to the section below entitled "Specialized Millicode Instructions for Packed Decimal Division".
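To illustrate why division is costly when an adder is the only decimal arithmetic hardware, the following sketch performs schoolbook long division in which each quotient digit is found by repeated subtraction. It is an illustration of the general problem only and is not the method of the specialized instructions referred to above. Python integers stand in for packed decimal operands, and the function name and the operation counter are assumptions of the sketch.

```python
from typing import Tuple


def divide_by_repeated_subtraction(dividend: int, divisor: int) -> Tuple[int, int, int]:
    """Return (quotient, remainder, subtractions) for non-negative operands.

    Each quotient digit is produced by subtracting the divisor until the
    partial remainder is smaller than the divisor.
    """
    if divisor == 0:
        raise ZeroDivisionError("decimal divide by zero")
    if dividend < 0 or divisor < 0:
        raise ValueError("this sketch handles non-negative operands only")

    quotient = 0
    remainder = 0
    subtractions = 0  # counts successful subtractions only
    for digit_char in str(dividend):
        # Shift the next dividend digit into the partial remainder.
        remainder = remainder * 10 + int(digit_char)
        quotient_digit = 0
        while remainder >= divisor:
            remainder -= divisor
            quotient_digit += 1
            subtractions += 1
        quotient = quotient * 10 + quotient_digit
    return quotient, remainder, subtractions


if __name__ == "__main__":
    print(divide_by_repeated_subtraction(1234567, 89))
```

The number of successful subtractions equals the sum of the quotient digits, so it can reach nine per digit, and a hardware implementation would also pay for the comparisons and shifts that this sketch does not count.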