1. Field of the Invention
This invention relates to the field of microprocessors and, more particularly, to optimization of the instruction set of a microprocessor.
2. Description of the Relevant Art
Microprocessor architectures may generally be classified as either complex instruction set computing (CISC) architectures or reduced instruction set computing (RISC) architectures. CISC architectures specify an instruction set comprising high level, relatively complex instructions. Often, microprocessors implementing CISC architectures decompose the complex instructions into multiple simpler operations which may be more readily implemented in hardware. Microcoded routines stored in an on-chip read-only memory (ROM) have been successfully employed for providing the decomposed operations corresponding to an instruction. More recently, hardware decoders which separate the complex instructions into simpler operations have been adopted by certain CISC microprocessor designers. The x86 microprocessor architecture is an example of a CISC architecture.
Conversely, RISC architectures specify an instruction set comprising low level, relatively simple instructions. Typically, each instruction within the instruction set is directly implemented in hardware. Complexities associated with the CISC approach are removed, allowing for more advanced implementations to be designed. Additionally, high frequency designs may be achieved more easily since the hardware employed to execute the instructions is simpler. An exemplary RISC architecture is the MIPS RISC architecture.
Although not necessarily a defining feature, variable-length instruction sets have often been associated with CISC architectures while fixed-length instruction sets have been associated with RISC architectures. Variable-length instruction sets use dissimilar numbers of bits to encode the various instructions within the set as well as to specify addressing modes for the instructions, etc. Generally speaking, variable-length instruction sets attempt to pack instruction information as efficiently as possible into the byte or bytes representing each instruction. Conversely, fixed-length instruction sets employ the same number of bits for each instruction (the number of bits is typically a multiple of eight such that each instruction fully occupies a fixed number of bytes). Typically, a small number of instruction formats comprising fixed fields of information are defined. Decoding each instruction is thereby simplified to routing bits corresponding to each fixed field to logic designed to decode that field.
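As an illustration of routing fixed fields to per-field decode logic, the following minimal sketch slices a 32-bit instruction word into fields by position alone. The field layout (6-bit opcode, two 5-bit register fields, 16-bit immediate) is modeled on the MIPS I-type format and is an assumption made for the example; the function name is hypothetical.

```python
def decode_fixed_fields(word):
    """Split a 32-bit instruction word into fixed fields.

    Assumed layout (for illustration only):
      bits 31..26 opcode, 25..21 rs, 20..16 rt, 15..0 immediate.
    """
    return {
        "opcode": (word >> 26) & 0x3F,  # 6-bit opcode field
        "rs": (word >> 21) & 0x1F,      # 5-bit source/base register field
        "rt": (word >> 16) & 0x1F,      # 5-bit target register field
        "imm": word & 0xFFFF,           # 16-bit immediate field
    }
```

Because every field sits at a fixed bit position, no field's location depends on the contents of any other field.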
Because each instruction in a fixed-length instruction set comprises a fixed number of bytes, locating instructions is simplified as well. The location of numerous instructions subsequent to a particular instruction is implied by the location of the particular instruction (i.e. as fixed offsets from the location of the particular instruction). Conversely, locating a second variable-length instruction requires locating the end of the first variable-length instruction; locating a third variable-length instruction requires locating the end of the second variable-length instruction, etc. Still further, variable-length instructions lack the fixed field structure of fixed-length instructions. Decoding is further complicated by the lack of fixed fields.
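The difference in locating instructions can be sketched as follows. The function names and the 4-byte default instruction size are assumptions chosen for illustration.

```python
def fixed_length_starts(base, count, size=4):
    """Start addresses of fixed-length instructions.

    Each address is a fixed offset from `base`, so all of them can be
    computed independently of one another.
    """
    return [base + i * size for i in range(count)]


def variable_length_starts(base, lengths):
    """Start addresses of variable-length instructions.

    Each start is only known once the end of the previous instruction
    has been found, so the computation is inherently sequential.
    """
    starts = []
    addr = base
    for length in lengths:
        starts.append(addr)
        addr += length  # end of this instruction = start of the next
    return starts
```

In the fixed-length case every address is a function of the base address alone, whereas the variable-length case carries a running dependency from one instruction to the next.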
Unfortunately, RISC architectures employing fixed-length instruction sets suffer from problems not generally applicable to CISC architectures employing variable-length instruction sets. Because each instruction is fixed length, certain of the simplest instructions may effectively waste memory by occupying bytes which do not convey information concerning the instruction. For example, fields which are specified as "don't care" fields for a particular instruction or instructions in many fixed-length instruction sets waste memory. In contrast, variable-length instruction sets pack the instruction information into a minimal number of bytes.
Still further, since RISC architectures do not include the more complex instructions employed by CISC architectures, the number of instructions employed in a program coded with RISC instructions may be larger than the number of instructions employed in the same program coded with CISC instructions. Each of the more complex instructions coded in the CISC version of the program is replaced by multiple instructions in the RISC version of the program. Therefore, the CISC version of a program often occupies significantly less memory than the RISC version of the program. Correspondingly, more bandwidth between devices storing the program, memory, and the microprocessor is needed for the RISC version of the program than for the CISC version of the program.
The problems outlined above are in large part solved by a microprocessor in accordance with the present invention. The microprocessor is configured to fetch a compressed instruction set which comprises a subset of a corresponding non-compressed instruction set. The non-compressed instruction set may be a RISC instruction set, such that the microprocessor may enjoy the high frequency operation and simpler execution resources typically associated with RISC architectures. Fetching the compressed instructions from memory and decompressing them within the microprocessor advantageously decreases the memory bandwidth required to achieve a given level of performance (e.g. instructions executed per second). Still further, the amount of memory occupied by the compressed instructions may be less than that occupied by the corresponding non-compressed instructions.
The exemplary compressed instruction set described herein is a variable-length instruction set. According to one embodiment, two distinct instruction lengths are included: 16-bit and 32-bit instructions. The 32-bit instructions are coded using an extend opcode, which indicates that the instruction being fetched is an extended (e.g. 32-bit) instruction. Instructions may be fetched as 16-bit quantities. When a 16-bit instruction having the extend opcode is fetched, the succeeding 16-bit instruction is concatenated with the instruction having the extend opcode to form a 32-bit extended instruction. Extended instructions have enhanced capabilities with respect to non-extended instructions, further enhancing the flexibility and power of the compressed instruction set. Routines which employ the capabilities included in the extended instructions may thereby be coded using compressed instructions.
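The concatenation of an extend-opcode instruction with its successor can be sketched as follows. The opcode value, its position in the upper five bits of the 16-bit quantity, and the function names are assumptions made for illustration; the text above does not specify them.

```python
EXTEND_OPCODE = 0b11110  # hypothetical opcode value, assumed to occupy bits 15..11


def opcode_of(halfword):
    """Return the (assumed) 5-bit opcode field of a 16-bit fetch quantity."""
    return (halfword >> 11) & 0x1F


def group_instructions(halfwords):
    """Group 16-bit fetch quantities into (width, value) instructions.

    A quantity carrying the extend opcode is concatenated with the
    succeeding 16-bit quantity to form one 32-bit extended instruction.
    """
    instructions = []
    i = 0
    while i < len(halfwords):
        current = halfwords[i]
        if opcode_of(current) == EXTEND_OPCODE:
            if i + 1 >= len(halfwords):
                raise ValueError("extend opcode without a succeeding instruction")
            instructions.append((32, (current << 16) | halfwords[i + 1]))
            i += 2
        else:
            instructions.append((16, current))
            i += 1
    return instructions
```

Only the first 16 bits need to be examined to determine the instruction length, so the boundary of each instruction is known after a single opcode comparison.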
The compressed instruction set further includes multiple sets of register mappings from the compressed register fields to the decompressed register fields. Each value coded in the compressed register fields decompresses to a different register within the microprocessor. In one embodiment, the compressed register fields comprise three bits each. Therefore, eight registers are accessible to a particular instruction. In order to offer access to additional registers for certain select instructions, the select instructions are assigned two opcode encodings. One of the opcode encodings indicates a first mapping of register fields, while the second opcode encoding indicates a second mapping of register fields. Advantageously, the compressed register fields may include relatively few bits while select instructions for which access to additional registers is desired may be granted such access. Additionally, the register mappings are selected to minimize the logic employed to decompress register fields. In one embodiment, the compressed register field is directly copied into a portion of the decompressed register field while the remaining portion of the decompressed register field is created using a small number of logic gates.
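A minimal sketch of selecting a register mapping by opcode encoding is given below. The two mapping tables, the opcode names, and the function name are hypothetical; they are chosen only to show how a 3-bit field can reach more than eight registers when an instruction is given two encodings.

```python
# Hypothetical mappings from a 3-bit compressed register field to a
# decompressed register number. Each value maps to a different register.
FIRST_MAPPING = (16, 17, 2, 3, 4, 5, 6, 7)
SECOND_MAPPING = (24, 25, 26, 27, 28, 29, 30, 31)

# Hypothetical opcode names: the same operation is assigned two encodings,
# and the encoding alone selects which mapping applies.
MAPPING_FOR_OPCODE = {
    "move_first": FIRST_MAPPING,
    "move_second": SECOND_MAPPING,
}


def decompress_register(opcode, field):
    """Decompress a 3-bit register field using the mapping the opcode selects."""
    if not 0 <= field <= 7:
        raise ValueError("compressed register field must fit in three bits")
    return MAPPING_FOR_OPCODE[opcode][field]
```

With these assumed tables, the same field value 0 yields register 16 under the first encoding and register 24 under the second, so the select instruction reaches sixteen registers while its register field stays three bits wide.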
The microprocessor supports programs having routines coded in compressed instructions and other routines coded in non-compressed instructions. The subroutine call instruction within the compressed instruction set includes a compression mode which indicates whether or not the target routine is coded in compressed instructions. The compression mode specified by the subroutine call instruction is captured by the microprocessor as the compression mode for the routine. In one embodiment, the compression mode is stored as one of the fetch address bits (stored in a program counter register within the microprocessor). Since the compression mode is part of the fetch address and the subroutine call instruction includes storing a return address for the subroutine, the compression mode of the calling routine is automatically stored upon execution of a subroutine call instruction. When a subroutine return instruction is executed, the compression mode of the calling routine is thereby automatically restored.
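The save and restore of the compression mode across a call and return can be sketched as follows. Holding the mode in bit 0 of the fetch address, and the class and method names, are assumptions made for illustration; the text states only that the mode is one of the fetch address bits.

```python
COMPRESSED_BIT = 0x1  # assumption: the compression mode is bit 0 of the fetch address


class ProgramCounterModel:
    """Toy model of a program counter whose value includes the compression mode."""

    def __init__(self, pc):
        self.pc = pc
        self.return_address = 0

    @property
    def compressed(self):
        return bool(self.pc & COMPRESSED_BIT)

    def call(self, target, target_compressed, next_address):
        # The return address is derived from the fetch address, so the
        # caller's mode bit is saved along with it.
        self.return_address = (next_address & ~COMPRESSED_BIT) | (self.pc & COMPRESSED_BIT)
        # The mode specified by the call instruction becomes the new mode.
        mode = COMPRESSED_BIT if target_compressed else 0
        self.pc = (target & ~COMPRESSED_BIT) | mode

    def ret(self):
        # Restoring the fetch address restores the caller's mode with it.
        self.pc = self.return_address
```

No separate mode register is saved or restored in this model; the mode travels with the return address.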
An additional feature of one embodiment of the microprocessor is the decompression of the immediate field used for load/store instructions having the global pointer register as a base register. The immediate field is decompressed into a decompressed immediate field for which the most significant bit is set. A subrange of addresses at the lower boundary of the global variable address space is thereby allocated for global variables of compressed instructions. Non-compressed instructions may store global variables in the remainder of the global variable address space. Advantageously, global variable allocation between the compressed and non-compressed routines of a particular program may be relatively simple since the subranges are separate.
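The effect of setting the most significant bit can be illustrated with a short sketch. The field widths (an 8-bit compressed immediate and a 16-bit decompressed immediate), the zero fill of the intervening bits, and the interpretation of the decompressed immediate as a signed offset from the global pointer are all assumptions made for the example.

```python
def decompress_gp_immediate(imm, compressed_bits=8, decompressed_bits=16):
    """Decompress an immediate for a global-pointer-relative load/store.

    The compressed value is placed in the low-order bits and the most
    significant bit of the decompressed field is set (widths are assumed).
    """
    if not 0 <= imm < (1 << compressed_bits):
        raise ValueError("immediate does not fit in the compressed field")
    return (1 << (decompressed_bits - 1)) | imm


def as_signed(value, bits=16):
    """Interpret `value` as a two's complement number of the given width."""
    return value - (1 << bits) if value & (1 << (bits - 1)) else value
```

Under these assumptions every decompressed offset is negative and lies within 256 bytes of the most negative offset, so compressed instructions address a contiguous subrange at the bottom of the region reachable from the global pointer.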
Broadly speaking, the present invention contemplates an instruction decompressor configured to decompress compressed instructions. A first one of the compressed instructions is codable to access a first subset of registers defined for a corresponding non-compressed instruction set. Additionally, a second one of the compressed instructions is codable to access the first subset of registers and is further codable to access a second subset of registers.
The present invention further contemplates a method for decompressing compressed instructions. A particular compressed instruction having a first register field is decompressed using a first register mapping from compressed register indicators to decompressed register indicators if the particular compressed instruction is encoded using a first opcode. Alternatively, the particular compressed instruction having the first register field is decompressed using a second register mapping from compressed register indicators to decompressed register indicators if the particular compressed instruction is encoded using a second opcode.
The present invention still further contemplates an apparatus for decompressing compressed instructions comprising a decompressing means. The decompressing means is configured to decompress a particular compressed instruction having a first register field using a first register mapping from compressed register indicators to decompressed register indicators if the particular compressed instruction is encoded using a first opcode. Additionally, the decompressing means is configured to decompress the particular compressed instruction using a second register mapping from compressed register indicators to decompressed register indicators if the particular compressed instruction is encoded using a second opcode.
The present invention yet further contemplates an instruction decompressor configured to decompress a compressed register field of a compressed instruction into a decompressed register field of a decompressed instruction. A decompression of the compressed register field is dependent upon a first value coded into the compressed register field and a second value coded into an opcode field of the compressed instruction.
The present invention additionally contemplates a method for decompressing a compressed register field of a compressed instruction into a decompressed register field of a decompressed instruction. At least a portion of the compressed register field is directly copied into a portion of the decompressed register field. The remaining portion of the decompressed register field is produced by logically operating upon the compressed register field.
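The copy-plus-logic method can be sketched as follows for a 3-bit compressed field and a 5-bit decompressed field. The particular logic function (a NOR of the two upper compressed bits driving the top decompressed bit) and the resulting register numbers are assumptions chosen for illustration, not values taken from the text above.

```python
def decompress_register_field(field):
    """Expand a 3-bit compressed register field to a 5-bit register number.

    Assumed scheme: the three compressed bits are copied unchanged into the
    low three bits; bit 4 is the NOR of compressed bits 2 and 1; bit 3 is 0.
    """
    if not 0 <= field <= 7:
        raise ValueError("compressed register field must fit in three bits")
    copied = field & 0b111          # directly copied portion
    bit2 = (field >> 2) & 1
    bit1 = (field >> 1) & 1
    bit4 = (bit2 | bit1) ^ 1        # remaining portion, from a single NOR gate
    return (bit4 << 4) | copied
```

With this assumed logic, field values 0 and 1 decompress to registers 16 and 17, while values 2 through 7 decompress to registers 2 through 7, and the only gate required is one NOR.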
Moreover, the present invention contemplates an apparatus for decompressing a compressed register field of a compressed instruction into a decompressed register field of a decompressed instruction, comprising a first means and a second means. The first means is for directly copying at least a portion of the compressed register field into a portion of the decompressed register field. The first means is coupled to receive the compressed register field. Similarly coupled to receive the compressed register field, the second means is for logically operating upon the compressed register field to produce a remaining portion of the decompressed register field.
Furthermore, the present invention contemplates an instruction decompressor configured to decompress a compressed register field of a compressed instruction into a decompressed register field of a decompressed instruction. The instruction decompressor forms a first portion of the decompressed register field by copying at least a portion of the compressed register field thereto. Additionally, the instruction decompressor includes a logic block which is configured to operate upon the compressed register field to produce a remaining portion of the decompressed register field.