The invention relates to the field of microprocessors, and, more particularly, to a modular arithmetic coprocessor that performs an integer division.
As described in International Patent Application PCT/FR 97/00035, there already exist known modular arithmetic coprocessors implementing modular exponential operations. This type of operation enables encryption and decryption of binary messages encoded in an RSA type encoding. This type of coprocessor, described, for example, in European Patent Application No. 601,907, cannot be used to perform integer divisions. However, the use of an RSA encoding requires the performance of such division operations. It therefore becomes necessary to use a processor to perform these divisions. This involves penalties in terms of computation duration and requires the use of a large-sized program and data memory.
The above referenced international patent application modifies the coprocessor disclosed in the European patent application to perform integer divisions without using a processor. Nonetheless, there is still a need to reduce computation times in performing an integer division.
A method for performing an integer division in a coprocessor divides a first data element M by a second data element D. The first data element M is binary encoded on m words M(m−1) . . . M(1), M(0) of L bits. The second data element D is binary encoded on d words D(d−1) . . . D(1), D(0) of L bits. The variables m, L and d are integers, with d less than m. The result, referenced as Q, is encoded on q words Q(q−1), Q(q−2) . . . Q(1), Q(0) of L bits, with q=m−d+1 and Q(q−1)=00 . . . 00 or 00 . . . 01.
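As a rough illustration of this encoding, the following Python sketch splits an integer into words of L bits and derives the result length q = m − d + 1. The helper names `to_words` and `from_words`, the word size and the example values are assumptions made for illustration only; they are not part of the method.

```python
def to_words(value, L, count):
    """Split a non-negative integer into `count` words of L bits, word 0 first."""
    mask = (1 << L) - 1
    return [(value >> (L * i)) & mask for i in range(count)]


def from_words(words, L):
    """Rebuild the integer from its L-bit words (word 0 is least significant)."""
    return sum(w << (L * i) for i, w in enumerate(words))


L = 8                              # illustrative word size
M = to_words(0x12345678, L, 4)     # m = 4 words: M(3) .. M(0)
D = to_words(0xABCD, L, 2)         # d = 2 words: D(1), D(0)
q = len(M) - len(D) + 1            # q = m - d + 1 words for the result Q
```

In this sketch the lists are indexed so that `M[i]` holds the word M(i), which keeps the indices aligned with the notation used in the text.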
The method includes the steps of supplementing the data element M with a most significant word M(m) formed by L zeros, and performing a computational iterative loop m−d+1 times. The computational iterative loop includes the steps of producing a first intermediate data element Q(q−1−j), binary encoded on L bits, with j as an index varying from 0 to q−1, by performing an integer division of the two most significant words of the first data element M by the most significant word of the second data element D. A second intermediate data element Z(m−j), binary encoded on 2*L bits, is produced to test and determine whether the first intermediate data element Q(q−1−j) is greater than a desired value of one of the words of the result Q, or whether it corresponds to a desired value of one of the following words.
In the former case, the first intermediate data element is decremented by one unit and the previous step is repeated. In the second case, the multiplication of the second data element D by the first intermediate data element Q(q−1−j) is performed to form a third intermediate data element B binary encoded on d+1 words of L bits. The d new most significant words M(m−j−1) . . . M(m−j−d) of the first data element are generated by performing a subtraction of the third intermediate data element B from the d+1 non-zero most significant words of the first data element, except during the first iteration where the word M(m) and the d most significant words of the first data element M are considered. If the result of the subtraction is negative, the first intermediate data element is decremented by one unit and the d new most significant words are modified by adding the second data element to these words.
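The loop described above can be sketched in software as a word-by-word long division with an estimated quotient word that is then corrected. The Python below is a minimal sketch under stated assumptions, not the coprocessor's register-level procedure: it assumes d is at least 2 and that the most significant bit of D(d−1) is set, it clamps the estimate to the largest L-bit value (a choice of this sketch), and it uses Python integers for the multiply and subtract steps. The names `divide_words` and `words_to_int` are hypothetical.

```python
def words_to_int(words, L):
    """Rebuild an integer from L-bit words, word 0 least significant."""
    return sum(w << (L * i) for i, w in enumerate(words))


def divide_words(M, D, L):
    """Return (Q, R) as word lists for M // D and M % D.

    Assumes len(D) >= 2, len(D) < len(M), and the top bit of D[-1] set.
    """
    m, d = len(M), len(D)
    base = 1 << L
    q = m - d + 1
    rem = list(M) + [0]              # append the word M(m) made of L zeros
    Q = [0] * q
    d_int = words_to_int(D, L)
    den2 = (D[d - 1] << L) | D[d - 2]
    for j in range(q):
        top = m - j                  # index of the current top word
        # Estimate: two top words divided by the top word of D (clamped).
        qhat = min(((rem[top] << L) | rem[top - 1]) // D[d - 1], base - 1)
        # Test: decrement while the estimate is too large on three words.
        num3 = (rem[top] << (2 * L)) | (rem[top - 1] << L) | rem[top - 2]
        while qhat * den2 > num3:
            qhat -= 1
        # Subtract B = D * qhat from the d+1 top words.
        lo = top - d
        window = words_to_int(rem[lo:top + 1], L) - qhat * d_int
        if window < 0:               # negative result: correct by one unit
            qhat -= 1
            window += d_int
        for i in range(d + 1):
            rem[lo + i] = (window >> (L * i)) & (base - 1)
        Q[q - 1 - j] = qhat
    return Q, rem[:d]
```

For example, with L = 8, `divide_words([0x78, 0x56, 0x34, 0x12], [0xCD, 0xAB], 8)` returns word lists that can be passed to `words_to_int` and compared against Python's own `//` and `%` operators.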
According to one embodiment, the method preferably includes the following steps E1-E6. Step E1 includes the steps of loading the word M(m−1) into a first register, the most significant bit of this word being provided at an output of the first register; loading L zeros corresponding to a word M(m) into a second register; loading L zeros into a third register; and loading the word D(d−1) into a fourth register, the least significant bit of this word being provided at an output of the fourth register.
Steps E2-E6 are then repeated q times, where j is an index varying from 0 to q−1. Step E2 is the integer division of the contents of the second and first registers by the contents of the fourth register, and the storage of the L least significant bits of the result, referenced Q(q−1−j), in the first register and in the fifth and sixth registers.
Step E3 includes the steps of loading the word D(d−2) into the second register, the least significant bit of this word being provided at an output of the second register; loading the words M(m−1−j) and M(m−j) in series into the third register, the least significant bits being directed towards the output of this third register; and loading the word M(m−2−j) into a seventh register, the least significant bit of this word being directed towards the output of the seventh register.
Step E4 includes the steps of shifting the contents of the second and fourth registers towards series inputs of a first multiplication circuit and a second multiplication circuit, and the contents of the third and seventh registers, with looping of the outputs of these registers to their inputs; multiplying D(d−2) by Q(q−1−j) in the first multiplication circuit; multiplying D(d−1) by Q(q−1−j) in the second multiplication circuit; subtracting M(m−j)M(m−j−1) from D(d−1)*Q(q−1−j) in a first subtraction circuit; subtracting M(m−2−j) from the data element produced by the first subtraction circuit in a second subtraction circuit, the word M(m−2−j) being delayed in a delay cell; subtracting the data element produced by the second subtraction circuit from D(d−2)*Q(q−1−j) in a third subtraction circuit; and testing the result of the last subtraction in a test circuit.
Step E5 includes the steps of determining that if the result of the last subtraction of step E4 is positive, then the contents of the first register are shifted; subtracting an L bit data element formed by a least significant bit at one and L−1 most significant bits at zero from the data element Q(q−1−j) initially present in the first register, the subtraction being done in a fourth subtraction circuit; storing the data element produced by the fourth subtraction circuit in the first register; and repeating step E4.
Step E6 includes the steps of determining that if the result of the last subtraction of step E4 is negative, then the words M(m−j) . . . M(m−d−j) are loaded into the third register and the data element D is loaded into the fourth register; shifting the contents of the third and fourth registers with the looping of the output of the fourth register to its input so as to keep D(d−1) in this register; multiplying the data element D by the data element Q(q−1−j) in the second multiplication circuit; subtracting the result produced by the second multiplication circuit from the bits given by the third register in the first subtraction circuit; and testing the result.
Step E6 further includes the steps of determining that if the result given by the first subtraction circuit is negative, then the data element D given by the fourth register is added to the result produced by the first subtraction circuit in an addition circuit; subtracting an L bit data element formed by a least significant bit at one and by L−1 most significant bits at zero from the data element Q(q−1−j) initially present in the first register, the subtraction being done in the fourth subtraction circuit; storing the data element produced by the fourth subtraction circuit in the first register; storing the last two words of the result produced by the first subtraction circuit or, as the case may be, by the addition circuit, referenced M(m−1−j) . . . M(m−d−j), in the second and first registers, the data elements Q(q−1−j) and M(m−3−j) . . . M(m−d−j) being output external to the coprocessor for storage and the data elements M(m−3−j) . . . M(m−d−j) replacing the data elements M(m−3−j) . . . M(m−d−j) having corresponding place values in the data element M, the least significant bits of the words M(m−1−j) and M(m−2−j) being directed towards the outputs of the second and fourth registers; and loading L zeros into the third register.
According to one embodiment, the steps E1, E3 and E6 are modified as follows. Step E1 now includes the steps of loading the data element M and L zeros into a third register; loading the word M(m−1) into a first register, the most significant bit of this word being provided at an output of this first register; loading L zeros corresponding to a word M(m) into a second register; and loading the data element D into a fourth register, the least significant bit of the word D(d−1) being provided at an output of this fourth register.
Step E3 now includes the steps of loading the word D(d−2) into the second register, the least significant bit of this word being provided at an output of the second register; and loading the word M(m−2−j) into a seventh register, the least significant bit of this word being directed towards the output of this register.
Step E6 now includes the steps of determining that if the result of the last subtraction of step E4 is negative, then the contents of the third and fourth registers are shifted; multiplying the data element D by the data element Q(q−1−j) in the second multiplication circuit; and subtracting the result produced by the second multiplication circuit from the bits given by the third register in the first subtraction circuit.
Step E6 further includes the steps of determining that if the result produced by the first subtraction circuit is negative, then the data element D, given by the fourth register, is added to the result produced by the first subtraction circuit in the addition circuit; subtracting a data element of L bits formed by one least significant bit at one and by L−1 most significant bits at zero from the data element Q(q−1−j) initially present in the first register, the subtraction being done in the fourth subtraction circuit; storing the result of this last subtraction in the first register; storing the last two words of the result produced by the first subtraction circuit, or as the case may be, by the addition circuit, referenced M(m−1−j) . . . M(m−d−j), in the third, second and first registers; storing the data elements M(m−3−j) . . . M(m−d−j) in the third register and replacing the data elements M(m−3−j) . . . M(m−d−j) having corresponding place values in the data element M, the least significant bits of the words M(m−1−j) and M(m−2−j) being directed towards the outputs of the second and first registers; and loading L zeros into the third register.
According to one embodiment, the step E2 comprises the following sub-steps. Step E21 includes the steps of resetting the test circuit, the third subtraction circuit and a fifth subtraction circuit.
Step E22 includes the steps of shifting by one bit the third and fourth registers, the third register recovering the most significant bit of the second register as a most significant bit, the fourth register having its output connected to its input; subtracting in the third subtraction circuit between the most significant bit of the second register and the least significant bit of the fourth register, the fifth subtraction circuit carrying out a subtraction between the least significant bit of the third register and zero.
Step E23 includes the steps of shifting by 2*L−1 bits the third and fourth registers, the output of the fourth register being connected to its input, the input of the third register being connected to the output of the fifth subtraction circuit; subtracting between the 2*L−1 bits output from the third register and zero in the fifth subtraction circuit; and subtracting between the result output from the fifth subtraction circuit and the bits of the fourth register.
Step E24 includes the steps of storing the last carry value of the third subtraction circuit in the test circuit and providing at the output of this test circuit a binary data element equal to the complement of the stored carry value.
Step E25 includes the steps of shifting by one bit the first and second registers, the most significant bit of the first register being loaded as the least significant bit of the second register; and loading the binary data element stored in the test circuit as a least significant bit in the first register.
Step E26 includes the steps of resetting the third and fifth subtraction circuits. Step E27 includes the steps of repeating 2*L−1 times the steps E22 to E26, the fifth subtraction circuit performing a subtraction between the contents of the third register and zero if the previously stored carry value is equal to one, or performing a subtraction between the contents of the third register and the contents of the fourth register if the previously stored carry value is equal to zero.
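The sub-steps of step E2 amount to a bit-serial division in which each pass performs a trial subtraction and records the complement of the resulting carry as one quotient bit. The following Python sketch shows that idea as a plain restoring division of a 2*L-bit value by an L-bit divisor. It is an illustration under assumptions, not a model of the registers and subtraction circuits; the function name `bit_serial_divide` and its parameters are hypothetical.

```python
def bit_serial_divide(high, low, divisor, L):
    """Divide the 2L-bit value (high:low) by an L-bit divisor, one bit per pass.

    Returns (the L least significant bits of the quotient, the remainder).
    Assumes divisor > 0.
    """
    dividend = (high << L) | low
    rem = 0
    quotient = 0
    for i in range(2 * L - 1, -1, -1):       # 2*L passes, most significant bit first
        rem = (rem << 1) | ((dividend >> i) & 1)
        carry = 1 if rem < divisor else 0    # borrow of the trial subtraction
        if carry == 0:
            rem -= divisor                   # subtraction succeeded: keep it
        # The quotient bit is the complement of the stored carry value.
        quotient = (quotient << 1) | (1 - carry)
    return quotient & ((1 << L) - 1), rem
```

In the context of step E2, `high` and `low` would stand for the contents of the second and first registers and `divisor` for the word D(d−1); only the L least significant bits of the quotient are kept, as in the text.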