Danish patent application PA 1998 01743, entitled ‘Sum-intervaldetektor’, filed Dec. 29, 1998.
Not Applicable
Not Applicable
The sum interval detector is a general arithmetic circuit that may be used to avoid conflicting memory accesses in superscalar processors. Scalar processors execute instructions in program order, so the associated memory accesses execute in the same order and no conflicts can exist between them. Superscalar processors, in contrast, may execute instructions out of order, but the semantics of program order must be maintained, which requires many checks for dependencies both among register references and among memory references. In most cases, memory accesses may be reordered if they do not touch the same data; but since an access is defined not only by its effective address but also by its size in bytes, detecting whether two memory accesses overlap, and thus may not be reordered, becomes more complex. If all memory accesses are aligned, i.e. every effective address is a multiple of the data size, which in turn is a power of two (2^N), the comparison is a simple identity comparison that excludes the lower N bits of the effective addresses. If no alignment restriction exists, no reordering may take place if the absolute difference between the two effective addresses is less than the data size.
Since an effective memory address is normally the sum of a base address and an offset, the straightforward approach is to add the base address and offset for each effective address and then either, under alignment restrictions, compare the effective addresses excluding the lower N bits or, without alignment restrictions, subtract one sum from the other and check whether the absolute difference is less than the data size, i.e. whether the difference lies strictly between −size and size. The latter case in particular is a slow process.
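The straightforward approach described above can be sketched in software as follows. This is only an illustrative model, not the circuit of the invention: the function names, the default word width of 32 bits, and the assumption that both accesses have the same size are hypothetical choices made for the example.

```python
def effective_address(base: int, offset: int, n: int = 32) -> int:
    """Add base and offset modulo 2**n (n is an assumed word width)."""
    return (base + offset) & ((1 << n) - 1)


def conflict_aligned(ea1: int, ea2: int, N: int) -> bool:
    """Aligned accesses of size 2**N: identity comparison that
    excludes the lower N bits of the effective addresses."""
    return (ea1 >> N) == (ea2 >> N)


def conflict_unaligned(ea1: int, ea2: int, size: int) -> bool:
    """Unaligned accesses of equal size: the accesses overlap when the
    absolute difference of the addresses is less than the data size.
    Address wrap-around is deliberately ignored in this sketch."""
    return abs(ea1 - ea2) < size


# Two 4-byte accesses relative to the same (hypothetical) base address.
ea1 = effective_address(0x1000, 6)
ea2 = effective_address(0x1000, 8)
overlap = conflict_unaligned(ea1, ea2, 4)
```

Note that the unaligned check needs two additions followed by a subtraction and a magnitude comparison, which is the serial dependency the text identifies as slow.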
The special case of determining whether two effective addresses as defined above are identical corresponds to the degenerate interval 0 . . . 0, i.e. the single value 0. This case has been solved in expired U.S. Pat. No. 5,144,577 entitled ‘Two-sum comparator’.
The present invention in part relies on a known method to detect whether a sum is equal to a constant, described in ‘Comments on “Evaluation of A+B=K Conditions Without Carry Propagation”’ by Behrooz Parhami (IEEE Transactions on Computers, Vol. 43, No. 4, April 1994), where the goal is to reduce the negative effect of conditional jumps in pipelined processor architectures.
The sum interval detector is a novel way of detecting whether the sum of two n-bit inputs and a carry input, computed modulo 2^n, lies within the interval −2^p . . . 2^p−1 without having to calculate the sum explicitly, which would take both additional time and additional hardware. The word width n and the power p determining the interval may vary from implementation to implementation. The interval may be modified by including only the negative values −2^p . . . −1 or only the non-negative values 0 . . . 2^p−1. Further, the interval may be expanded or limited by including or excluding, respectively, specific values. In particular, the interval may be limited by excluding the value −2^p, thereby producing the symmetric interval −2^p+1 . . . 2^p−1, or expanded by including the value 2^p, thereby producing the symmetric interval −2^p . . . 2^p.
Detection of whether a sum A+B+C_0 computed modulo 2^n belongs to the interval −2^p . . . 2^p−1 may be split into detection of whether the sum belongs to the subinterval −2^p . . . −1 or to the subinterval 0 . . . 2^p−1. The first subinterval corresponds to the binary values 111 . . . 1XXX . . . X, while the second subinterval corresponds to the binary values 000 . . . 0XXX . . . X, where n is the word width and p is the number of don't cares (X).
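The correspondence between the signed interval and the two bit patterns can be checked with a small reference model. The sketch below is only a behavioural model under assumed names (`to_signed`, `in_interval_reference`, `in_interval_by_pattern` are hypothetical); unlike the invention, it computes the sum explicitly, and it assumes 0 ≤ p < n.

```python
def to_signed(value: int, n: int) -> int:
    """Interpret the low n bits of value as a two's-complement number."""
    value &= (1 << n) - 1
    return value - (1 << n) if value >> (n - 1) else value


def in_interval_reference(a: int, b: int, c0: int, n: int, p: int) -> bool:
    """Membership test stated directly on the signed sum."""
    s = to_signed(a + b + c0, n)
    return -(1 << p) <= s <= (1 << p) - 1


def in_interval_by_pattern(a: int, b: int, c0: int, n: int, p: int) -> bool:
    """Same test on the bit pattern: the upper n-p bits of the sum are
    all zeros (0 . . . 2^p-1) or all ones (-2^p . . . -1); the lower
    p bits are don't cares."""
    s = (a + b + c0) & ((1 << n) - 1)
    upper = s >> p
    all_ones = (1 << (n - p)) - 1
    return upper == 0 or upper == all_ones
```

For example, with n = 8 and p = 2 the sum 250 + 5 = 255 is the pattern 11111111, i.e. −1, and lies in the interval, while 3 + 1 = 4 is the pattern 00000100 and does not.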
The invention uses a known method for detecting whether a sum is equal to a constant in order to detect whether the upper n−p bits of the sum are binary 000 . . . 0 or 111 . . . 1, i.e. 0 or −1, respectively, while the lower p bits of the sum are ignored, corresponding to XXX . . . X. This requires that the carry C_p out of the lower p bits be known; it can be produced by a known p-bit carry look-ahead circuit CLA, which may calculate C_p as follows:
C_p = (C_0 AND P_0 AND . . . AND P_{p−1}) OR (G_0 AND P_1 AND . . . AND P_{p−1}) OR . . . OR (G_{p−2} AND P_{p−1}) OR G_{p−1},
where
G_i = A_i AND B_i and P_i = A_i OR B_i.
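The look-ahead expression can be modelled bit by bit as a sanity check. The following sketch assumes hypothetical helper names (`bit`, `carry_lookahead`) and evaluates the sum-of-products form term by term; a hardware CLA would evaluate all terms in parallel.

```python
def bit(x: int, i: int) -> int:
    """Return bit i of x."""
    return (x >> i) & 1


def carry_lookahead(a: int, b: int, c0: int, p: int) -> int:
    """Carry out of the lower p bits, using generate G_i = A_i AND B_i
    and propagate P_i = A_i OR B_i as defined in the text."""
    g = [bit(a, i) & bit(b, i) for i in range(p)]
    pr = [bit(a, i) | bit(b, i) for i in range(p)]

    def propagate_from(lo: int) -> int:
        # P_lo AND ... AND P_{p-1}; an empty product is 1.
        return int(all(pr[lo:p]))

    cp = c0 & propagate_from(0)           # C_0 rippling through all p bits
    for i in range(p):
        cp |= g[i] & propagate_from(i + 1)  # G_i propagated to position p
    return cp
```

The result should agree with the carry obtained by actually adding the lower p bits, i.e. `((a & m) + (b & m) + c0) >> p` with `m = (1 << p) - 1`.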
The known method to detect whether a sum is equal to a constant may briefly be described as
A+B=K
A + B + (NOT K) = 2^n − 1
S + C = 2^n − 1
S_i = NOT C_i, for i = 0 . . . n−1,
which corresponds to a reduction of A + B plus the complement of K to S + C (sum and carry) using carry-save addition with a row of full adders, followed by a check of whether the sum bit and the carry bit (from the previous bit position) differ in every bit position. The carry C_0 is 0 in the known method but is used as a carry input in the following.
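A minimal software sketch of this equality test is given below. The function name `sum_equals_constant` and its parameters are hypothetical; the carry-save step is written with word-wide bitwise operations, each bit of which stands for one full adder, and no carry is propagated across bit positions.

```python
def sum_equals_constant(a: int, b: int, k: int, n: int, c0: int = 0) -> bool:
    """Return True when (a + b + c0) mod 2**n equals k mod 2**n,
    without performing a carry-propagating addition."""
    mask = (1 << n) - 1
    a &= mask
    b &= mask
    k_inv = ~k & mask                      # complement of K

    # One full adder per bit position: sum bit and carry-out bit.
    s = a ^ b ^ k_inv
    carry_out = (a & b) | (a & k_inv) | (b & k_inv)

    # The carry word is the carry-outs moved up one position, with the
    # carry input C_0 in bit 0; the carry out of the top bit is dropped.
    c = ((carry_out << 1) | c0) & mask

    # S + C = 2**n - 1 exactly when S_i differs from C_i in every position.
    return (s ^ c) == mask
```

With n = 4, for instance, `sum_equals_constant(3, 4, 7, 4)` holds because the sum word is 1111 and the carry word is 0000, which differ in every position.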