1. Field of the Invention:
The invention relates generally to computer systems, and more particularly to a processor employing hardware for expediting bit scan instructions.
2. Description of Related Art
Without limiting the scope of the invention, this background information is provided in the context of a specific problem to which the invention has application.
In the x86 microprocessor architecture, there exists a Bit Scan Forward (BSF) and a Bit Scan Reverse (BSR) instruction. The BSF instruction, which has a syntax of BSF (destination register, source operand), scans the source operand starting with the least significant bit (bit 0) and continues towards the most significant bit to identify the first non-zero bit position. If all bits in the source operand are zero, a zero (ZF) flag is set and an undefined value resides in the destination register. Otherwise, the ZF flag is cleared and the destination register is loaded with an index which reflects the first bit position holding a logical one.
Likewise, the BSR instruction scans the source operand starting with the most significant bit (bit 31 for a double-word and bit 15 for a word) and continues towards the least significant bit (bit 0) to identify the first non-zero bit position. Similarly, the zero (ZF) flag is set if all bits in the source operand are zero. Otherwise, the ZF flag is cleared and the destination resister is loaded with an index which reflects the first bit position holding a logical one. An exemplary, but not exclusive, application for the BSF and BSR instructions is for parsing data tables.
Heretofore, the BSF and BSR instructions have employed a technique which first tests the source operand for all zeros and sets the ZF flag accordingly. If the source operand is not all zeros, then the leftmost or rightmost bit of the source operand (depending if it is a BSF or BSR instruction) is tested for a logical one and if no match is found, it is serially shifted (either left or right), and a destination counter is recursively incremented until a logical one is encountered.
These operations are better understood with reference to the flow diagrams in FIG. 7 and the following snippets of pseudo code in Table 1, which describe the prior art BSF and BSR instructions. Each recursive iteration in the xe2x80x9cWHILExe2x80x9d loops take at least one clock cycle. Thus, it can be seen that a bit scan instruction for a double word can take in excess of thirty-two clock cycles. Moreover, the number of clock cycles to complete a BSF or BSR instruction varies radically (typically between 6-72 clock cycles) depending on the number of trailing or leading zeros, making timing unpredictable. This of course, impacts the performance of programs making extensive use of the BSF and BSR instructions.
From the foregoing, it can be seen that there is a need to reduce and standardize the number of clock counts for the BSF and BSR operations regardless of the position of the leading and trailing zeros in the source operand.
To overcome the limitations in the prior art described above, and to overcome other limitations that will become apparent upon reading and understanding the present specification, the present invention discloses a processor employing an execution unit having leading/trailing zero detectors organized along predetermined boundaries for detecting the first non-zero bit position in a source operand to expedite bit scan forward and reverse instructions. In one clock cycle, all bits in the source operand are concurrently tested to identify the first non-zero bit and its position is written into a destination index register.
A feature of the present invention is the completion of a bit scan on a source operand in a fixed number of clock cycles regardless of the number of leading or trailing zeros.
Another feature of the present invention is symmetrical execution time for either the BSF or BSR instruction.
These and various other objects, features, and advantages of novelty which characterize the invention are pointed out with particularity in the claims annexed hereto and forming a part hereof. However, for a better understanding of the invention, its advantages, and the objects obtained by its use, reference should be made to the drawings which form a further part hereof, and to the accompanying descriptive matter, in which there is illustrated and described specific examples of systems and methods in accordance with the present invention.