1. Field of the Invention
The present invention relates to the design of processors within computer systems. More specifically, the present invention relates to a method and apparatus for efficiently emulating sub-instructions in a very long instruction word (VLIW) processor.
2. Related Art
In order to increase computational performance, processor designs are beginning to move toward very long instruction word (VLIW) architectures in which multiple functional units simultaneously execute a single VLIW instruction. A VLIW instruction is typically composed of a plurality of "sub-instructions" that specify operations for individual functional units.
One problem for VLIW architectures is handling exception conditions that arise when a sub-instruction is not implemented within a hardware functional unit and must instead be emulated in software, or when a set of data inputs causes a hardware functional unit to generate an exception, such as a divide by zero condition or an overflow condition. In current VLIW architectures, even if only a single sub-instruction in a VLIW instruction generates an exception condition, all of the sub-instructions that make up the VLIW instruction must be emulated in software. This can seriously degrade computer system performance.
Furthermore, even if only a few sub-instructions generate exception conditions, the computer system must provide code to emulate all possible sub-instructions; this includes providing code for emulating sub-instructions that are already implemented in hardware. Writing code for all of these sub-instructions causes a number of problems. First, it is expensive and time-consuming to write emulation code for sub-instructions that are already implemented in hardware. Second, ensuring correctness of emulation becomes a bigger problem. It is hard to ensure that even the small number of sub-instructions that are not implemented in hardware are emulated correctly in software. It is harder still to ensure that all sub-instructions, including the ones already implemented in hardware, are emulated correctly. Third, providing additional routines to emulate sub-instructions uses more computer memory, which can degrade cache performance and can cause more page faults.
What is needed is a method and apparatus that eliminates the need for all of the sub-instructions in a VLIW instruction to be emulated in software when only a small number of sub-instructions from the VLIW instruction actually require emulation in software, and that provides an efficient way to deal with exception conditions, such as an overflow.
One embodiment of the present invention provides a system that efficiently emulates sub-instructions in a very long instruction word (VLIW) processor. The system operates by receiving an exception condition during execution of a VLIW instruction within a VLIW program. This exception condition indicates that at least one sub-instruction within the VLIW instruction requires emulation in software or software assistance. In processing this exception condition, the system emulates the sub-instructions that require emulation in software and stores the results. The system also selectively executes in hardware any remaining sub-instructions in the VLIW instruction that do not require emulation in software. The system finally combines the results from the sub-instructions emulated in software with the results from the remaining sub-instructions executed in hardware, and resumes execution of the VLIW program.
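By way of illustration only, the flow described above may be sketched in software as follows. The sketch is a simplified model and not the claimed implementation: the tuple encoding of a sub-instruction, the opcode names, and the functions hardware_exec, software_emulate and execute_vliw are all assumptions introduced for this example. In this toy model, "div" is assumed to be the only operation lacking a hardware functional unit.

```python
from typing import List, Optional, Tuple

# Hypothetical encoding: a sub-instruction is (opcode, operand_a, operand_b).
SubInstr = Tuple[str, int, int]


def hardware_exec(sub: SubInstr) -> int:
    """Stand-in for a hardware functional unit (assumed to support add and mul)."""
    op, a, b = sub
    if op == "add":
        return a + b
    if op == "mul":
        return a * b
    raise NotImplementedError(f"no hardware unit for {op}")


def software_emulate(sub: SubInstr) -> int:
    """Stand-in for a software emulation routine (assumed to cover div only)."""
    op, a, b = sub
    if op == "div":
        return a // b
    raise NotImplementedError(f"no emulation routine for {op}")


def execute_vliw(instr: List[SubInstr], needs_emulation: List[bool]) -> List[int]:
    """Emulate only the flagged sub-instructions, run the rest in 'hardware',
    and combine both sets of results in slot order."""
    results: List[Optional[int]] = [None] * len(instr)

    # Step 1: emulate in software the sub-instructions that require it.
    for slot, sub in enumerate(instr):
        if needs_emulation[slot]:
            results[slot] = software_emulate(sub)

    # Step 2: execute the remaining sub-instructions in hardware.
    for slot, sub in enumerate(instr):
        if not needs_emulation[slot]:
            results[slot] = hardware_exec(sub)

    # Step 3: the combined results are returned and the program may resume.
    return [r for r in results if r is not None]


combined = execute_vliw(
    [("add", 1, 2), ("div", 9, 3), ("mul", 4, 5)],
    [False, True, False],
)
```

In this example only the middle sub-instruction passes through the emulation routine; the other two are handled by the modeled hardware path, and no emulation code is needed for them.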
According to one aspect of the present invention, the emulation process includes: saving state from a plurality of registers within the VLIW processor; placing the VLIW processor into a privileged mode; and activating a trap handler to perform the emulation. Activating the trap handler may include reading an exception register that indicates which of the sub-instructions caused the exception condition, and then emulating the sub-instructions that caused the exception condition in accordance with a priority ordering.
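By way of illustration only, entry into the trap handler may be modeled as follows. The Processor class, the bit layout of the exception register (bit i set meaning that sub-instruction slot i raised the exception), and the convention that a lower priority number is serviced first are all assumptions made for this sketch; the text above does not specify them.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Processor:
    """Hypothetical processor model used only for this sketch."""
    registers: Dict[str, int]
    exception_register: int = 0  # assumed layout: bit i <-> sub-instruction slot i
    privileged: bool = False
    saved_state: Dict[str, int] = field(default_factory=dict)


def slots_needing_emulation(exception_register: int, priority: List[int]) -> List[int]:
    """Decode the exception register and order the flagged slots by priority.

    priority[slot] is an assumed per-slot rank; lower values are handled first.
    """
    flagged = [
        slot for slot in range(len(priority))
        if (exception_register >> slot) & 1
    ]
    return sorted(flagged, key=lambda slot: priority[slot])


def enter_trap_handler(cpu: Processor, priority: List[int]) -> List[int]:
    """Save register state, enter privileged mode, and return the emulation order."""
    cpu.saved_state = dict(cpu.registers)  # save state from the registers
    cpu.privileged = True                  # place the processor in privileged mode
    return slots_needing_emulation(cpu.exception_register, priority)


cpu = Processor(registers={"r1": 7}, exception_register=0b1010)
order = enter_trap_handler(cpu, priority=[2, 3, 0, 1])
```

Here slots 1 and 3 are flagged in the exception register; because slot 3 carries the lower priority number in this example, it is emulated before slot 1.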
According to one aspect of the present invention, the act of selectively executing in hardware the remaining sub-instructions that do not have to be emulated includes selectively enabling hardware functional units to execute the remaining sub-instructions. This may be accomplished by storing a pattern of enablement signals into an enablement register, wherein each bit of the enablement register indicates whether a corresponding hardware functional unit for a corresponding sub-instruction is to be enabled. This pattern of enablement signals is applied to hardware functional units in the VLIW processor so that the hardware functional units execute only the enabled sub-instructions. Next, the VLIW instruction is executed so that only the remaining sub-instructions, which have not been emulated in software, are executed in hardware. After the VLIW instruction is executed, a trap is generated in order to complete processing of the exception condition.
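By way of illustration only, one possible way to derive and apply such a pattern of enablement signals is sketched below. The sketch assumes that a set bit means "enabled", that bit i of both registers corresponds to functional unit i, and that the pattern is simply the complement of the exception register; the functions enablement_pattern and run_enabled are hypothetical names introduced for this example.

```python
from typing import Callable, Dict, List


def enablement_pattern(exception_register: int, num_units: int) -> int:
    """Enable every functional unit whose sub-instruction was NOT emulated.

    Assumes bit i of exception_register marks slot i as emulated in software.
    """
    all_units = (1 << num_units) - 1
    return all_units & ~exception_register


def run_enabled(sub_ops: List[Callable[[], int]], enable: int) -> Dict[int, int]:
    """Model re-executing the VLIW instruction with only enabled units active.

    Each callable stands in for one functional unit executing its sub-instruction;
    units whose enable bit is clear are skipped and produce no result.
    """
    return {
        slot: op()
        for slot, op in enumerate(sub_ops)
        if (enable >> slot) & 1
    }


# Slot 1 was emulated in software, so only units 0, 2 and 3 are enabled.
pattern = enablement_pattern(0b0010, num_units=4)
hw_results = run_enabled([lambda: 1, lambda: 2, lambda: 3, lambda: 4], pattern)
```

In this model the disabled unit contributes nothing during re-execution, so the result previously stored by the emulation routine for that slot is not overwritten.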