1. Field of the Invention
The present invention relates to Single Instruction Multiple Data (SIMD) computer operations, and more specifically to floating-point exception handling for SIMD operations.
2. Relevant Background
As general-purpose computer processors become ever more powerful, there is an increasing demand for high-speed graphics calculation capability. Fueling this demand is the growth of such applications as video conferencing, 3-D modeling, computer animation, electronic publishing, and virtual reality. Processors that can provide high-speed graphical support for two- and three-dimensional imaging, video and audio processing, and image compression have a competitive edge as high-volume applications emerge in the informational and professional markets.
Typically, graphics calculations involve multiple floating-point arithmetic operations such as addition and multiplication. Many computer processors have adopted the IEEE Standard for Binary Floating-Point Arithmetic, ANSI/IEEE Std 754-1985, referred to herein as "IEEE 754". Examples of such processors include UltraSPARC systems offered by Sun Microsystems, the PowerPC processor available from Motorola Inc. and International Business Machines Corp., and any of the Pentium or x86 compatible processors available from the Intel Corporation or other corporations such as AMD and Cyrix. The IEEE 754 standard includes specifications for floating-point storage formats, floating-point precision and accuracy ranges, data type conversions, rounding operations, and floating-point exceptions.
There are five types of floating-point exceptions defined by the IEEE 754 standard: inexact, divide-by-zero, underflow, overflow, and invalid. An exception, as used herein, is an error condition often requiring software intervention for the processor to continue executing the current instruction stream. When an exception occurs, a specific exception handler routine associated with the exception is executed. However, it is generally possible to disable, or mask, an exception such that the exception handler is not invoked when the exception occurs. Table 1 contains a summary of each IEEE 754 defined exception.
Processors typically utilize floating-point status registers to flag IEEE 754 exceptions that occur during floating-point calculations. Often, a bank of five bits, corresponding to the five IEEE 754 defined exceptions, is used to flag any IEEE 754 exception which occurs during the currently executing floating-point instruction. In addition, another bank of five bits may be used to mask IEEE 754 exceptions, in which case another set of five bits is typically used to keep track of any accrued IEEE 754 exceptions which were masked.
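The three five-bit banks described above can be pictured as fields packed into a single status word. The following is a minimal sketch only: the bit order of the five flags, the field offsets, and the names `FpExc`, `get_field`, and `set_field` are hypothetical and chosen for illustration, not taken from any particular processor.

```python
from enum import IntFlag


class FpExc(IntFlag):
    """The five IEEE 754 exceptions, one bit each (bit order is an assumption)."""
    INEXACT = 1 << 0
    DIV_BY_ZERO = 1 << 1
    UNDERFLOW = 1 << 2
    OVERFLOW = 1 << 3
    INVALID = 1 << 4


FIELD_MASK = 0b11111  # every bank is five bits wide

# Hypothetical bit offsets of each five-bit bank inside one status word.
CURRENT_SHIFT = 0   # exceptions raised by the currently executing instruction
ACCRUED_SHIFT = 5   # masked exceptions accumulated over time
ENABLE_SHIFT = 10   # trap enable mask


def get_field(status, shift):
    """Extract one five-bit bank from the status word."""
    return FpExc((status >> shift) & FIELD_MASK)


def set_field(status, shift, flags):
    """Return a new status word with one five-bit bank overwritten."""
    cleared = status & ~(FIELD_MASK << shift)
    return cleared | ((int(flags) & FIELD_MASK) << shift)
```

For example, `set_field(0, ENABLE_SHIFT, FpExc.OVERFLOW | FpExc.INVALID)` yields a status word in which only the overflow and invalid exceptions are enabled to trap under this assumed layout.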
In an effort to increase the speed of graphics calculations, some processors include a specialized set of program instructions which quickly perform sophisticated floating-point graphics calculations. These specially tailored graphics instructions may execute complex graphics operations that customarily required dozens of clock cycles in as little as one clock cycle, thereby increasing the throughput of graphics-based calculations. One category of instructions employed to speed graphics operations is referred to as Single Instruction Multiple Data (SIMD) instructions. For example, SIMD instructions may be designed in accordance with the Visual Instruction Set (VIS™) developed by Sun Microsystems. VIS™ is a trademark of Sun Microsystems, Inc. in the United States and in other countries. Alternatively, SIMD instructions could be modeled to work with the MMX instruction set designed by Intel Corporation.
SIMD instructions include a main operation code (op-code) and a plurality of sub-operations. The sub-operations are typically floating-point calculations which are executed in parallel. One complexity of executing multiple floating-point instructions in parallel is that each floating-point instruction may generate its own IEEE 754 exception. In general, processors whose architecture supports both regular floating-point instructions and SIMD floating-point instructions could require a separate copy of the floating-point status register for each floating-point instruction executed in parallel. Thus, multiple copies of current exception flags, accrued exception flags, and trap enable mask bits may be needed (i.e., one copy for regular floating-point instructions and one copy for each SIMD sub-operation). This is costly to implement in hardware. In addition, any modification to the floating-point status register configuration is extremely undesirable when dealing with an existing processor architecture.
What is needed is a mechanism to keep track of IEEE 754 exceptions during execution of SIMD floating-point instructions as well as regular floating-point instructions. This mechanism should use existing configurations of the floating-point status registers, such that processor architecture remains unchanged. The mechanism should not require maintaining multiple copies of floating-point status registers or multiple copies of exception flags for each SIMD sub-operation.
Briefly stated, the present invention involves a method for handling an IEEE 754 standard exception for a SIMD instruction with a plurality of sub-operations. The method includes the operations of determining if a trap enable mask field is configured to mask or enable the exception; performing a bit-wise logical "OR" function of corresponding exception flags of the sub-operations with an accrued exception flag field if the trap enable mask field is configured to mask the exception; and updating the accrued exception flag field with a result of the OR function.
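The masked-exception path of this method can be sketched as a fold over the sub-operation flags. This is an illustrative sketch under stated assumptions, not the claimed implementation: each sub-operation is assumed to report a five-bit flag pattern as an integer, the function name `accrue_masked` is hypothetical, and the enabled (trapping) case is simply left untouched here.

```python
from functools import reduce
from operator import or_


def accrue_masked(sub_op_flags, accrued, trap_enable):
    """Return the updated accrued exception field for one SIMD instruction.

    sub_op_flags: one five-bit exception pattern per sub-operation (assumed).
    accrued:      current contents of the accrued exception flag field.
    trap_enable:  trap enable mask field; a set bit means "enabled".
    """
    # Bit-wise OR of the corresponding exception flags of all sub-operations.
    raised = reduce(or_, sub_op_flags, 0)

    # If any raised exception is enabled, the trap path applies instead;
    # this sketch leaves the accrued field unchanged in that case.
    if raised & trap_enable:
        return accrued

    # All raised exceptions are masked: OR them into the accrued field.
    return accrued | raised
```

Because the sub-operation flags are combined with OR before they touch the register, a single accrued field serves every sub-operation and no per-lane copy of the field is needed.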
Another aspect of the invention is a method for handling IEEE 754 standard exceptions. The method includes providing a floating-point trap type field configured to indicate a floating-point exception cause; a trap enable mask field configured to selectively mask or enable IEEE 754 standard exceptions, where the trap enable mask field includes flags corresponding to each of the IEEE 754 standard exceptions; a current exception field configured to indicate the occurrence of enabled IEEE 754 standard exceptions, where the current exception field includes flags corresponding to each of the IEEE 754 standard exceptions; and an accrued exception field configured to indicate the occurrence of masked IEEE 754 standard exceptions, where the accrued exception field includes flags corresponding to each of the IEEE 754 standard exceptions. The method further includes executing a SIMD instruction comprising a plurality of sub-operations, where the SIMD instruction causes an IEEE 754 exception; determining if the trap enable mask field is configured to mask or enable the exception; performing a bit-wise logical "OR" function of corresponding exception flags of the sub-operations with the accrued exception field if the trap enable mask field is configured to mask the exception; updating the accumulated exception flag field with a result of the OR function if the trap enable mask field is configured to mask the exception; clearing the accrued exception field and the current exception field if the trap enable mask field is configured to enable the exception; setting an exception flag in the floating-point trap type field if the trap enable mask field is configured to enable the exception; and determining which of the sub-operations generated the exception.
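The sequence of operations in this aspect, including the enabled path, might be modeled as follows. This is a hedged sketch: the `FpStatus` class, the `execute_simd` function, and the trap type encoding `FTT_IEEE_754_EXCEPTION` are hypothetical names and values introduced for illustration, and the current exception field is left unchanged on the masked path because the text above does not specify otherwise.

```python
from dataclasses import dataclass

FTT_NONE = 0
FTT_IEEE_754_EXCEPTION = 1  # hypothetical encoding of the exception cause


@dataclass
class FpStatus:
    """One shared set of status fields; no per-sub-operation copies."""
    trap_enable: int = 0        # trap enable mask field (set bit = enabled)
    current: int = 0            # current exception field
    accrued: int = 0            # accrued exception field
    trap_type: int = FTT_NONE   # floating-point trap type field


def execute_simd(status, sub_op_flags):
    """Update status for one SIMD instruction.

    Returns the indices of the sub-operations that generated an enabled
    exception (an empty list when every raised exception is masked).
    """
    # OR together the exception flags reported by each sub-operation.
    raised = 0
    for flags in sub_op_flags:
        raised |= flags

    if not (raised & status.trap_enable):
        # Masked: update the accrued field with the result of the OR.
        status.accrued |= raised
        return []

    # Enabled: clear the accrued and current fields, record the cause,
    # and determine which sub-operations generated the exception.
    status.accrued = 0
    status.current = 0
    status.trap_type = FTT_IEEE_754_EXCEPTION
    return [i for i, flags in enumerate(sub_op_flags)
            if flags & status.trap_enable]
```

In this model an exception handler would receive the returned indices and could then examine only the offending sub-operations, while the status fields themselves keep their existing single-copy layout.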
In accordance with another aspect of the invention, the invention is an apparatus suitable for handling an IEEE 754 standard exception for a SIMD instruction with a plurality of sub-operations. The apparatus includes a trap enable mask field configured to selectively mask or enable IEEE 754 standard exceptions, where the trap enable mask field includes flags corresponding to each of the IEEE 754 standard exceptions; an accrued exception field configured to indicate the occurrence of masked IEEE 754 standard exceptions, where the accrued exception field includes flags corresponding to each of the IEEE 754 standard exceptions; "OR" logic operatively coupled to the accrued exception field to generate a bit-wise logical "OR" of corresponding exception flags of the sub-operations with the accrued exception field if the trap enable mask field is configured to mask the exception; and a resulting bit pattern from the "OR" logic, where the resulting bit pattern is written to the accrued exception field if the trap enable mask field is configured to mask the exception.
Still another aspect of the invention is a computer program product embodied in tangible media suitable for handling an IEEE 754 standard exception for a SIMD instruction including a plurality of sub-operations. The tangible media may include a magnetic disk, an optical disk, a propagating signal, or a random access memory device.