Recently, large scale commercial deployment of smart cards has become commonplace in industrial, retail banking and consumer credit card applications which require affordable, efficient and secure smart card devices and readers. Due to the considerable monetary values and large scale associated with widespread smart card deployments, the success of such applications is dependent on both acceptable deployment costs and transaction security.
Typically, smart cards are manufactured with low end microprocessors having relatively slow speed, short bit lengths and limited cache and memory so as to minimize card production costs. Smart card security features typically include digital signatures, data encryption and public-key operations which require long number arithmetic. Naccache et al. [see D. Naccache and D. M'Raïhi, “Cryptographic smart cards”, IEEE Micro, 16(3):14–24, 1996] have provided an overview of commercial smart cards with cryptographic capabilities, including a discussion of general implementation concerns on various types of smart cards. Naccache and co-workers [see D. Naccache, D. M'Raïhi, W. Wolfowicz, and A. di Porto. “Are crypto-accelerators really inevitable?”, Advances in Cryptology - EUROCRYPT '95, ed. L. Guillou et al., Lecture Notes in Computer Science, vol. 921, Springer-Verlag (New York 1995) pp. 404–409] disclosed an early implementation of a 20-bit, zero-knowledge identification system on a 4 MHz Thomson ST16623 microprocessor. Many current generation commercial smart cards use 8-bit microcontrollers derived from 1970s families such as the Intel 8051 [see Sencer Yeralan and Ashutosh Ahluwalia, Programming and Interfacing the 8051 Microcontroller. Addison-Wesley (Wellesley, Mass. 1995)] and the Motorola 6805.
The use of commonly available public-key algorithms such as RSA or DSA with these low cost microprocessors typically results in unacceptably long processing delays, since the algorithms are based on modular arithmetic with very long operands. To address this problem, some smart card microcontroller manufacturers include additional on-chip hardware to accelerate long-number arithmetic operations. However, in large volume, cost-sensitive commercial applications it is preferable to execute public-key operations on smart cards having low cost microprocessors without the addition of a coprocessor. Thus, it is both technically and commercially advantageous to implement a public-key digital signature algorithm which neither introduces smart card performance problems nor requires additional hardware beyond that of a typical 8-bit or 16-bit microcontroller.
One attractive solution to this smart card computational problem may be provided by the computational efficiency available with finite field or Galois field arithmetic. Finite fields have important applications in many areas of modern communication systems. In particular, arithmetic in finite fields is required for the realization of certain public-key or asymmetric schemes and for certain error correction codes. Additional applications which may advantageously employ finite field arithmetic include signal processing, random number generation and smart appliances. In the area of cryptography, finite field arithmetic is required in Elliptic Curve Discrete Logarithm Cryptography (herein “ECC”) systems and Discrete Logarithm (herein “DL”) schemes. Both of these methods are classified as public-key schemes and can be utilized for building communication and computer systems which provide enhanced security functions. For example, both methods may be employed for assuring sender authenticity, maintaining integrity of electronic messages through digital signatures, exchanging keys over insecure channels, or identifying the parties which are communicating. An elliptic curve cryptosystem relies on the assumed hardness of the Elliptic Curve Discrete Logarithm Problem (herein “ECDLP”) for its security. An instance of the ECDLP is posed for an elliptic curve defined over a finite field GF(p^m) for “p” a prime and “m” a positive integer. The rule to perform the elliptic curve group operation can be expressed in terms of arithmetic operations in the finite field. Thus the speed of the field arithmetic determines the speed of the cryptosystem.
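By way of illustration only, the dependence of the elliptic curve group operation on finite field arithmetic can be sketched as follows. The sketch assumes the simplest case m = 1 (a prime field GF(p)), a toy curve y^2 = x^3 + 2x + 3 over GF(97), and affine coordinates; the curve, the point G and all function names are hypothetical choices made for this example and are far too small for any real security. Each point addition reduces to a handful of field multiplications and one field inversion.

```python
# Toy parameters (assumptions for illustration, not from the cited works).
P_MOD = 97          # field prime p, so the field is GF(97)
A, B = 2, 3         # curve: y^2 = x^3 + A*x + B over GF(P_MOD)
G = (3, 6)          # a point on the curve: 6^2 = 36 = 27 + 6 + 3

def inv(x):
    """Field inversion via Fermat's little theorem: x^(p-2) mod p."""
    return pow(x % P_MOD, P_MOD - 2, P_MOD)

def on_curve(pt):
    """True if pt satisfies the curve equation (None is the identity)."""
    if pt is None:
        return True
    x, y = pt
    return (y * y - (x * x * x + A * x + B)) % P_MOD == 0

def add(p1, p2):
    """Group operation, built entirely from field add, multiply and invert."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    x1, y1 = p1
    x2, y2 = p2
    if x1 == x2 and (y1 + y2) % P_MOD == 0:
        return None                                   # P + (-P) = identity
    if p1 == p2:
        lam = (3 * x1 * x1 + A) * inv(2 * y1) % P_MOD  # tangent slope
    else:
        lam = (y2 - y1) * inv(x2 - x1) % P_MOD         # chord slope
    x3 = (lam * lam - x1 - x2) % P_MOD
    y3 = (lam * (x1 - x3) - y1) % P_MOD
    return (x3, y3)

def scalar_mul(k, pt):
    """Compute k*pt by double-and-add; cost is dominated by field arithmetic."""
    result, addend = None, pt
    while k:
        if k & 1:
            result = add(result, addend)
        addend = add(addend, addend)
        k >>= 1
    return result
```

Since every call to add performs several field multiplications and an inversion, any speed-up of the underlying field arithmetic carries over directly to scalar multiplication, which is the dominant operation in an ECC signature.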
Implementations of elliptic curve cryptography and digital signature algorithm methods have been incorporated into U.S. government digital signature standards FIPS 186-1 and FIPS 186-2. Detailed descriptions of ECC schemes and their applications may be found in Blake et al. [see I. Blake, G. Seroussi and N. Smart, Elliptic Curves in Cryptography, London Mathematical Society Lecture Notes, Series 265, Cambridge Univ. Press (Cambridge, England 1999)], IEEE draft standard P1363 and ANSI standards X9.62 and X9.63. For DL schemes, detailed descriptions and their applications are found in Menezes et al. [see A. J. Menezes, P. C. van Oorschot and S. A. Vanstone, Handbook of Applied Cryptography, CRC Press (Boca Raton, Fla. 1997)], IEEE draft standard P1363 and ANSI standards X9.30-1 and X9.42.
Certicom Corp. has considered ECC implementations for smart cards [see “The Elliptic Curve Cryptosystem for Smart Cards”, Certicom White Paper, http://www.certicom.com/resources/w_papers/w_papers.html, Certicom Corp. (San Mateo, Calif. 1998)]. A millisecond performance benchmark for digital signatures is reported for an ECC defined over GF(2^163). The benchmark data was generated with a Sun UltraSparc I, a 64-bit, 167 MHz high performance desktop system with an 83 MHz bus, 1.3 GB/s memory transfer, 128 MB RAM and 0.5 to 4 MB external cache. Typical smart cards, by contrast, commonly employ low end, 8-bit microprocessors which operate at around 4 MHz with approximately 128 to 1024 bytes of RAM, 1–16 KB of EEPROM and 6–16 KB of ROM. The reported results are therefore not immediately applicable to smart cards but demonstrate the computational potential of ECC methods. In a previous draft version (http://www.certicom.ca/exx/wecc4.htm) of the white paper, Certicom Corp. disclosed benchmarks for an ECC digital signature implementation on the 8-bit Siemens SLE44C80S and 16-bit SLE66C80S microcontrollers using Koblitz curves and a binary extension field. The use of these specialized elliptic curves limited computation coefficients to only two values, 0 and 1, thus providing faster computation with less security. Digital signature performance of less than 1.5 seconds was reported for the 8-bit microcontroller and 0.7 seconds for the 16-bit microcontroller.
Chung et al. disclose fast finite field and elliptic curve algorithms for embedding cryptographic functions on a high performance CalmRISC 8-bit RISC microcontroller with a MAC2424 24-bit high performance coprocessor capable of both 24-bit and 16-bit operation modes [see J. W. Chung, S. G. Sim and P. J. Lee, “Fast Implementation of Elliptic Curve Defined over GF(p^m) on Calm RISC with MAC2424 Coprocessor”, CHES 2000, ed. Ç. K. Koç et al., Lecture Notes in Computer Science, vol. 1965, Springer-Verlag (New York 2000) pp. 57–70]. In 24-bit mode, the MAC2424 coprocessor has two 48-bit multiplier accumulators and two 32 Kb×24-bit data memories and, in 16-bit mode, the coprocessor has two 32-bit multiplier accumulators and two 32 Kb×16-bit data memories. Due to the unique hardware capabilities of the MAC2424 coprocessor, the computational cost of multiplication is the same as that of addition, and the products of subfield elements can be accumulated multiple times in the accumulator so that long number arithmetic can be performed without intermediate reduction.
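The idea of accumulating several subfield products before reducing can be illustrated with the following minimal sketch. It is not the Chung et al. implementation: the subfield prime 2^13 - 1, the 32-bit accumulator width, and the function names are assumptions chosen so that the arithmetic is easy to check. One coefficient of a polynomial product is formed by summing products a[i]*b[j] with i + j = k, and the modular reduction is applied once at the end rather than after every product.

```python
P = 8191          # hypothetical subfield prime, 2**13 - 1
ACC_BITS = 32     # assumed accumulator width in bits

def coeff_lazy(a, b, k):
    """Coefficient k of a*b, reducing mod P only once at the end."""
    acc = 0
    for i in range(len(a)):
        j = k - i
        if 0 <= j < len(b):
            acc += a[i] * b[j]
            # Each product is below 2**26, so up to 64 of them fit in 32 bits.
            assert acc < (1 << ACC_BITS), "accumulator would overflow"
    return acc % P

def coeff_eager(a, b, k):
    """Same coefficient, but reducing mod P after every product."""
    acc = 0
    for i in range(len(a)):
        j = k - i
        if 0 <= j < len(b):
            acc = (acc + a[i] * b[j]) % P
    return acc

def poly_mul_lazy(a, b):
    """All coefficients of the unreduced polynomial product a*b over GF(P)."""
    return [coeff_lazy(a, b, k) for k in range(len(a) + len(b) - 1)]
```

Both functions return the same value; the lazy variant simply replaces many reductions with one, which pays off when, as described above, the hardware accumulator is wider than a single product.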
It has long been recognized that efficient finite field arithmetic is vital to achieve acceptable performance with ECCs. While in prior ECC implementations workers have utilized even-characteristic finite fields with composite extension degree, recent attacks on the security of such approaches have rendered them unattractive. In alternative approaches, some workers such as De Win et al. [see E. De Win, A. Bosselaers, S. Vandenberghe, P. De Gersem, and J. Vandewalle, “A fast software implementation for arithmetic operations in GF(2^n)”, Asiacrypt '96, ed. K. Kim et al., Lecture Notes in Computer Science, vol. 1163, Springer-Verlag (New York 1996) pp. 65–76] have considered the use of fields GF((2^n)^m), with a focus on n=16, m=11. This construction yields an extension field with 2^176 elements. The advantage of this approach is that the subfield GF(2^16) has a Cayley table of sufficiently small size to fit in the memory of a workstation.
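A table-driven subfield multiplication of this general kind can be sketched as follows. The sketch makes several assumptions that are not taken from the cited work: it uses the toy field GF(2^4) with the polynomial x^4 + x + 1 instead of GF(2^16), and it uses logarithm and antilogarithm tables rather than a full Cayley table, which is a commonly used compact alternative. All names are hypothetical.

```python
POLY = 0b10011     # x^4 + x + 1, a primitive polynomial over GF(2)
ORDER = 15         # number of nonzero elements in GF(2^4)

# Precompute EXP[i] = x^i and LOG[x^i] = i for the generator x (value 2).
EXP = [0] * ORDER
LOG = [0] * 16
e = 1
for i in range(ORDER):
    EXP[i] = e
    LOG[e] = i
    e <<= 1                 # multiply by x
    if e & 0b10000:         # degree reached 4: reduce by POLY
        e ^= POLY

def mul_table(a, b):
    """Multiply in GF(2^4) with two table lookups and one addition."""
    if a == 0 or b == 0:
        return 0
    return EXP[(LOG[a] + LOG[b]) % ORDER]

def mul_shift_xor(a, b):
    """Reference multiplication by shift-and-XOR with reduction by POLY."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0b10000:
            a ^= POLY
    return r
```

For GF(2^16) the same construction needs two tables of roughly 2^16 entries each, which is modest for a workstation but already large relative to the RAM of a typical smart card.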
Other workers have offered alternative approaches. Optimizations for multiplication and inversion in such composite fields of characteristic two are disclosed by Paar et al. [see J. Guajardo and C. Paar. “Efficient Algorithms for Elliptic Curve Cryptosystems”, Advances in Cryptology - Crypto '97, ed. B. S. Kaliski, Lecture Notes in Computer Science, vol. 1294, Springer-Verlag (New York 1997), pp. 342–356]. Schroeppel et al. [see R. Schroeppel, H. Orman, S. O'Malley, and O. Spatscheck, “Fast key exchange with elliptic curve systems”, Advances in Cryptology - CRYPTO '95, ed. J. Killian et al., Lecture Notes in Computer Science, vol. 963, Springer-Verlag (New York 1995) pp. 43–56] report an implementation of an elliptic curve analogue of Diffie-Hellman key exchange over GF(2^155). The arithmetic is based on a polynomial basis representation of the field elements. De Win et al. [see E. De Win, S. Mister, B. Preneel, and M. Wiener, “On the Performance of Signature Schemes Based on Elliptic Curves”, Algorithmic Number Theory, ed. J. F. Buhler, Lecture Notes in Computer Science, vol. 1423, Springer-Verlag (New York 1998) pp. 252–266] disclose a detailed implementation of elliptic curve arithmetic on a desktop personal computer using finite fields of the form GF(p) and GF(2^n) with a focus on its application to digital signature schemes. For ECCs over prime fields, the De Win et al. construction uses projective coordinates to eliminate the need for inversion, along with a balanced ternary representation of the multiplier. Schnorr [see C. P. Schnorr, “Efficient signature generation by smart cards”, Journal of Cryptology, 4(3):161–174, 1991] discloses a digital signature algorithm based on the finite field discrete logarithm problem. The disclosed algorithm is apparently adaptable for smart card implementations.
Paar and co-workers [see D. V. Bailey. “Optimal Extension Fields”, MQP-Senior Thesis, Computer Science Department, Worcester Polytechnic Institute, (Worcester, Mass. 1998); D. V. Bailey and C. Paar, “Optimal Extension Fields for Fast Arithmetic in Public-Key Algorithms”, Advances in Cryptology - CRYPTO '98, ed. H. Krawczyk, Lecture Notes in Computer Science, vol. 1462, Springer-Verlag (New York 1998) pp. 472–485] have recently introduced optimal extension fields (herein “OEF”s) and have provided performance statistics on high-end RISC workstations. Mihailescu [see P. Mihailescu, “Optimal Galois field bases which are not normal”, Fast Software Encryption - FSE '97, 4th International Workshop, Jan. 20–22, 1997, Haifa, Israel, rump session paper] has disclosed an efficient algorithm for exponentiation in an OEF which leads to efficient implementation of cryptosystems based on the finite field discrete logarithm problem. Kobayashi et al. [see T. Kobayashi, H. Morita, K. Kobayashi, and F. Hoshino, “Fast Elliptic Curve Algorithm Combining Frobenius Map and Table Reference to Adapt to Higher Characteristic”, Advances in Cryptology - EUROCRYPT '99, ed. J. Stern, Lecture Notes in Computer Science, vol. 1592, Springer-Verlag (New York 1999) pp. 176–189] have extended the work on OEFs and have reported sub-millisecond performance on high-end RISC workstations and an ECC performance of 1.95 milliseconds on a 400 MHz Pentium II.
Paar and co-workers [see D. V. Bailey and C. Paar, “Efficient Arithmetic in Finite Field Extensions with Application in Elliptic Curve Cryptography”, Journal of Cryptology, 14(3):153–176 (2001)] recently introduced an adaptation of the Itoh-Tsujii inversion algorithm for OEFs which is utilized in the present invention.
As security requirements become more stringent, the computation requirements for long number arithmetic create certain impediments to the continued deployment of low cost smart cards which utilize low end microprocessors, owing to their limited computational capabilities. Creation of a digital signature is frequently the most computationally intensive operation demanded of a typical smart card. In addition, with the proliferation of low cost embedded processors in personal digital assistants (PDAs), wireless devices, smart appliances, building monitors, street sensors, vehicles, equipment and machinery, there has been a growing concern for access security. Thus, it is advantageous to provide fast and efficient cryptographic computation methods which can overcome the hardware deficiencies of low cost microprocessors without requiring more costly microprocessors or additional coprocessor, cache or memory hardware.