1. FIELD OF THE INVENTION
The invention relates generally to digital data processing systems and, more particularly, to methods and apparatuses for summing selected bits from a plurality of ordered machine vectors.
2. DESCRIPTION OF THE RELATED ART
A processor or other digital data processing system may employ a variety of methods and apparatuses to execute instructions more efficiently. One method that can increase the overall instruction throughput is parallel processing. Parallel processing entails executing instructions concurrently. A method that can facilitate parallel processing is the decoding of macro-instructions into micro-operations (“μops”). The μops execute on simpler execution units.
Processor hardware manipulates μops as machine vectors of binary logic signals or bits. Processing of μops may include performing arithmetic operations on bits of the machine vectors. If these arithmetic operations are slow, the overall instruction throughput may decrease.
In parallel processing, one common operation is the transfer of a group of machine vectors to a buffer having multiple storage addresses. The group may be a mixture of machine vectors for valid μops and machine vectors for nonsense, e.g., machine vectors for μops processed in previous cycles. In such transfers, each machine vector may carry a valid bit to indicate whether the machine vector represents a valid μop. A first value of the valid bit, i.e., logic one, indicates a machine vector for a valid μop, and a second value of the valid bit, i.e., logic zero, indicates a machine vector not related to a valid μop. Herein, a logic signal is defined to be the digital signal associated with transmitting one bit of a machine vector.
One procedure to transfer a group of machine vectors writes the machine vectors of the group for valid μops to a circular receiving buffer in parallel while disregarding other machine vectors of the group. The procedure includes calculating a storage address in the receiving buffer for each machine vector to be written. To calculate a storage address for a machine vector for a valid μop, the valid bits of earlier machine vectors of the group are summed. Then, a sequential storage address is assigned to the machine vector for the valid μop. The sequential address is the numerical sum of the last address for the previous group of machine vectors written to the buffer plus the sum of the valid bits of the earlier machine vectors of the present group. This procedure writes machine vectors of valid μops to the buffer without leaving unused buffer addresses between successive machine vectors and enables parallel writes of groups of machine vectors.
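The address calculation described above can be sketched in software as a running sum over the valid bits. This is only an illustrative model, not the circuit itself. The names `assign_addresses`, `base` and `buffer_size` are hypothetical, and `base` is assumed to be the starting address derived from the last address of the previous group; the text does not state whether an offset of one is applied to that last address.

```python
from typing import List, Optional


def assign_addresses(valid_bits: List[int], base: int, buffer_size: int) -> List[Optional[int]]:
    """Assign a circular-buffer address to each valid machine vector.

    valid_bits  -- one valid bit per machine vector, in group order
    base        -- assumed starting address taken from the previous group
    buffer_size -- number of storage addresses in the circular buffer
    """
    addresses: List[Optional[int]] = []
    earlier_valid = 0  # sum of the valid bits of earlier vectors in this group
    for bit in valid_bits:
        if bit:
            # Sequential address: base plus the count of earlier valid vectors,
            # wrapped because the receiving buffer is circular.
            addresses.append((base + earlier_valid) % buffer_size)
            earlier_valid += 1
        else:
            # Vectors that do not carry a valid uop are disregarded.
            addresses.append(None)
    return addresses


print(assign_addresses([1, 0, 1, 1], base=6, buffer_size=8))
```

Because each address depends only on the valid bits of earlier vectors in the same group, all addresses of a group can in principle be computed at once, which is what permits the parallel write.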
FIG. 1 illustrates a prior art adder that sums selected bits of an incoming group of ordered machine vectors by generating several partial sums of the selected bits and serially adding the partial sums. The selected bits of an incoming group of machine vectors enter the adder from lines 101, 102, 103 and 104. The lines 101 and 103 may carry selected bits from several machine vectors. In the adder 100, the selected bits on each line 101, 102, 103 and 104 correspond to the valid bits of the μops produced by decoding one macro-instruction. The lines 101, 102, 103 and 104 connect to ordered inputs of a correction circuit 110. The correction circuit 110 receives a correction vector signal from lines 115. The correction circuit 110 corrects the selected bits received from the lines 101, 102, 103 and 104 and transmits “corrected” selected bits to lines 105, 106, 107 and 108 in response to receipt of the correction vector. For example, the correction circuit 110 may reset a portion of the valid bits to logic zero, i.e., corresponding to “invalid” μops, in response to an exception that requires the corresponding μops to be flushed.
Still referring to FIG. 1, an adder 120 generates a first sum of the corrected selected bits from the lines 105 and transmits the first sum to a line 121. The digital signal on the line 121 sums the selected bits on lines 101 as modified by the correction vector. In the illustrated adder 100, the lines 102, 106, 104 and 108 transmit one bit. The adder 124 adds the corrected selected bit from the line 106 to the first sum from the line 121 and transmits the resulting sum to a line 125. The line 125 transmits a sum of the selected bits from the input lines 101 and 102 as modified by the correction vector. The adder 122 sums the corrected selected bits from the lines 107 and transmits the resulting sum to a line 123. An adder 126 sums the earlier sums from the line 123 and the line 125 and transmits the resulting sum to a line 127. The digital signal on the line 127 sums the selected bits from the input lines 101, 102 and 103 as corrected by the correction vector. The decoded number generators 128 and 129 convert the sums from the lines 127 and 121, respectively, to decoded numbers.
Referring again to FIG. 1, the adder 100 produces decoded number signals on a line 131 and on a line 130 for two respective partial sums. The first partial sum adds the selected bits from the lines 101, and the second partial sum adds the selected bits from the lines 101, 102 and 103. Before forming sums, the selected bits are corrected by the correction circuit 110. Finally, the partial sums in decoded form may be used in circuits for selecting addresses (not shown).
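The serial data flow of the adder 100 can be modeled in a few lines. The sketch below is a behavioral illustration only. The function names are hypothetical, and the correction vector is assumed to act as a per-bit keep mask (1 keeps a selected bit, 0 resets it to logic zero), which is one reading of the flush example given for the correction circuit 110. The bit on the line 104 is omitted because the description does not include it in either partial sum.

```python
from typing import Sequence, Tuple


def corrected(bits: Sequence[int], keep_mask: Sequence[int]) -> list:
    """Model of correction circuit 110: reset bits whose mask entry is 0."""
    return [b & k for b, k in zip(bits, keep_mask)]


def prior_art_partial_sums(
    bits_101: Sequence[int],
    bit_102: int,
    bits_103: Sequence[int],
    keep_101: Sequence[int],
    keep_102: int,
    keep_103: Sequence[int],
) -> Tuple[int, int]:
    """Return the (non-decoded) sums carried on lines 121 and 127."""
    sum_121 = sum(corrected(bits_101, keep_101))  # adder 120
    sum_125 = sum_121 + (bit_102 & keep_102)      # adder 124
    sum_123 = sum(corrected(bits_103, keep_103))  # adder 122
    sum_127 = sum_123 + sum_125                   # adder 126
    return sum_121, sum_127


print(prior_art_partial_sums([1, 1, 0], 1, [1, 1], [1, 1, 1], 1, [1, 0]))
```

In this model every sum depends on the corrected bits, so no addition can start until the mask is known, and the sum on the line 127 is reached only after the adders 120, 124 and 126 have operated in series.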
Decoded numbers are binary numbers with zeroes for all digits except one. For example, the binary numbers 0011 and 0101 (decimal numbers 3 and 5, respectively) are written as the “six-digit” decoded numbers 001000 and 100000, respectively. Decoded numbers can have different total numbers of digits in different hardware devices. Decoded numbers may be used for selecting one of a plurality of addresses. The number of possible addresses is the total number of digits in the decoded number.
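The decoded representation can be illustrated with a short sketch. The helper names `to_decoded`, `from_decoded` and `add_bit_decoded` are hypothetical. The last helper relies on a general property of this one-hot style of encoding, namely that adding a single bit moves the lone one digit by one position; that property is offered as an illustration and is not a description of any particular circuit in the figures.

```python
def to_decoded(value: int, width: int) -> int:
    """Return the decoded form of value: a single 1 at digit position value."""
    if not 0 <= value < width:
        raise ValueError("value does not fit in the decoded width")
    return 1 << value


def from_decoded(decoded: int) -> int:
    """Recover the ordinary binary value from a decoded number."""
    if decoded <= 0 or decoded & (decoded - 1):
        raise ValueError("a decoded number has exactly one nonzero digit")
    return decoded.bit_length() - 1


def add_bit_decoded(decoded: int, bit: int, width: int) -> int:
    """Add one selected bit (0 or 1) to a decoded number by shifting."""
    shifted = decoded << bit
    if shifted >= (1 << width):
        raise OverflowError("sum exceeds the decoded width")
    return shifted


print(format(to_decoded(3, 6), "06b"))  # decimal 3 as a six-digit decoded number
print(format(to_decoded(5, 6), "06b"))  # decimal 5 as a six-digit decoded number
```

The two printed strings reproduce the six-digit examples given above for decimal 3 and decimal 5.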
The conventional adder 100 of FIG. 1 may reduce the speed of instruction processing for several reasons. First, the adders 120, 122, 124 and 126 sum non-decoded numbers, and summing non-decoded numbers may involve more complex and/or slower circuitry. Second, the correction circuit 110 may introduce delays, because the correction circuit 110 waits for the arrival of the correction vector from the line 115 before processing the group of selected bits from the lines 101, 102, 103 and 104. If the group of selected bits is available early, waiting to receive the correction vector may delay the processing of the selected bits and of the corresponding μops.
The present invention is directed to overcoming, or at least reducing the effects of, one or more of the problems set forth above.
In a first aspect, the invention provides an apparatus for adding selected bits. The apparatus includes a hardware device having a plurality of ordered input terminals to receive binary signals for a portion of an ordered set of the selected bits. The hardware device also has a plurality of output terminals to transmit digital signals for a plurality of sums. Each sum adds a set of speculative values of a portion of the selected bits.
In a second aspect, the invention provides a method for adding a set of ordered selected logic signals. The method includes producing a set of digital signals for a plurality of sums and selecting one of the digital signals for a sum in response to receiving a signal for a correction vector. Each sum adds a set of speculative values for an ordered set of selected logic signals. The selected sum is equal to a sum of speculative values of the selected logic signals as identified by the correction vector. The method also includes transmitting the selected one of the digital signals to an output terminal.
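The idea of forming several speculative sums first and selecting one of them once the correction vector arrives can be sketched as follows. This is a minimal software illustration under stated assumptions, not the claimed apparatus. The names `speculative_sums` and `select_sum` are hypothetical, the correction vector is assumed to be a per-bit keep mask, and the sketch enumerates every possible mask for simplicity, whereas a practical device would presumably form sums only for the corrections that can actually occur.

```python
from itertools import product
from typing import Dict, Sequence, Tuple


def speculative_sums(selected_bits: Sequence[int]) -> Dict[Tuple[int, ...], int]:
    """Precompute one sum per candidate correction vector.

    This step needs only the selected bits, so it can run before the
    correction vector is available.
    """
    n = len(selected_bits)
    table: Dict[Tuple[int, ...], int] = {}
    for mask in product((0, 1), repeat=n):
        # Speculative values: the selected bits as they would look
        # if this mask turned out to be the correction vector.
        table[mask] = sum(b & k for b, k in zip(selected_bits, mask))
    return table


def select_sum(table: Dict[Tuple[int, ...], int], correction_vector: Sequence[int]) -> int:
    """Pick the precomputed sum identified by the arriving correction vector."""
    return table[tuple(correction_vector)]


table = speculative_sums([1, 1, 0, 1])   # done early
print(select_sum(table, [1, 0, 1, 1]))   # done when the correction vector arrives
```

In this arrangement the work that remains after the correction vector arrives is a selection rather than an addition, which is the contrast with the correct-then-add ordering of the adder 100.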