Today, computing devices are almost always interconnected via networks. These networks can be large closed networks, as within a corporation, or truly public networks, as with the Internet. A network itself might have hundreds, thousands or even millions of potential users. Consequently it is often required to restrict access to any given networked computer or service, or a part of a networked computer or service, to a subset of the users on the public or closed network. For instance, a brokerage might have a public website accessible to all, but would like to only give Ms. Alice Smith access to Ms. Alice Smith's brokerage account.
Access control is an old problem, tracing its roots to the earliest days of computers. Passwords were among the first techniques used, and to this day remain the most widely used, for protecting resources on a computer or service.
In its simplest form, known as single factor authentication, every user has a unique password and the computer has knowledge of the user password. When attempting to log on Alice would enter her userid, say alice, and password, say apple23, the computer would compare the pair, i.e. alice, apple23, with the pair it had stored for Alice, and if there is a match would establish a session and give Alice access.
This simple scheme suffers from two problems. First, the table containing the passwords is stored on the computer, and thus represents a single point of compromise. If Eve could somehow steal this table, she would be able to access every user's account. Second, when Alice enters her password it travels from her terminal to the computer in the clear, and Eve could potentially eavesdrop. For instance the “terminal” could be Alice's PC at home, and the computer could be a server on the Internet, in which case her password travels in the clear on the Internet. Such eavesdropping is one form of man-in-the-middle attack. It will be recognized by those with ordinary skill in the art that a man-in-the-middle attack can go beyond eavesdropping to modify the contents of the communication.
Various solutions have been proposed and implemented to solve these two issues. For instance, to solve the first problem of storing the password on the computer, the computer could instead store a one way function of the password, e.g. F(apple23)=XD45DTY, and keep the pair {alice, XD45DTY}. Because F( ) is a one way function, computing XD45DTY from apple23 is easy, but the reverse, recovering apple23 from XD45DTY, is believed to be computationally difficult or close to impossible. So when Alice logs on and sends the computer {alice, apple23}, the computer can compute F(apple23) and compare the result with XD45DTY. The UNIX operating system was among the first to implement such a system in the 1970s. However, this approach, while solving the problems due to the storage of the password on the computer, does not solve the problem of the password traveling in the clear.
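By way of illustration only, the stored one way function check described above might be sketched as follows. SHA-256 is used here as a stand-in for F( ); that choice, the table name password_table and the function name login are assumptions made for this sketch and are not taken from any particular system.

```python
import hashlib
import hmac


def F(password: str) -> str:
    """Stand-in one way function (assumption: SHA-256 hex digest)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


# The computer stores {userid: F(password)}, never the password itself.
password_table = {"alice": F("apple23")}


def login(userid: str, password: str) -> bool:
    """Recompute F on the submitted password and compare with the stored value."""
    stored = password_table.get(userid)
    if stored is None:
        return False
    return hmac.compare_digest(stored, F(password))


print(login("alice", "apple23"))  # the correct pair is accepted
print(login("alice", "apple24"))  # a wrong password is rejected
```

Note that the password still reaches login( ) in the clear, which is the remaining weakness identified above.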
Multiple factor authentication also exists as a solution to the problems inherent with single factor authentication. In multiple factor authentication, at least knowledge of, if not actual possession of, at least two factors must be shown for authentication to be complete. It should be understood that in multiple factor authentication, each factor remains separate. That is, the factors are not combined. Further, the factors are not even concatenated. Several multiple factor authentication techniques exist, including one time password token techniques, encrypted storage techniques, smart card techniques, and split key techniques.
In one time password token techniques, two passwords are utilized, one being a permanent password associated with the user, and the other being a temporary, one-time use, password generated by a password generator. The permanent password may be optional. The temporary password has a finite usable life, such as sixty seconds. At the end of the usable life, another temporary password is generated. An authentication server knows each usable password as well as its usable life, based upon algorithms well known to one of ordinary skill in the art. A user transmits both the permanent password (first factor) and a temporary password (second factor) to the authentication server, which then verifies both passwords. The passwords are transmitted in the clear, thus token techniques are subject to man-in-the-middle attacks.
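The following sketch shows one way a token and an authentication server could agree on a temporary password with a sixty second usable life. It assumes that both hold a shared per-token secret and derive the code from the current time window with an HMAC; the description above does not specify the algorithm, so this derivation, the record layout SERVER_RECORDS and all function names are hypothetical.

```python
import hashlib
import hmac

LIFETIME_SECONDS = 60  # usable life of each temporary password, as in the text

# Hypothetical server-side record: permanent password plus a per-token secret.
SERVER_RECORDS = {
    "alice": {"permanent": "apple23", "token_secret": b"hypothetical-token-seed"},
}


def temporary_password(token_secret: bytes, now: float) -> str:
    """Derive a 6-digit code from the secret and the current time window."""
    window = int(now // LIFETIME_SECONDS)
    digest = hmac.new(token_secret, window.to_bytes(8, "big"), hashlib.sha256).digest()
    return f"{int.from_bytes(digest[:4], 'big') % 1_000_000:06d}"


def authenticate(userid: str, permanent: str, code: str, now: float) -> bool:
    """Verify both factors separately; both must be correct."""
    record = SERVER_RECORDS.get(userid)
    if record is None:
        return False
    first_ok = hmac.compare_digest(record["permanent"].encode(), permanent.encode())
    expected = temporary_password(record["token_secret"], now)
    second_ok = hmac.compare_digest(expected.encode(), code.encode())
    return first_ok and second_ok
```

A code generated at time 120.0 is accepted at any time up to, but not including, 180.0, since both fall in the same sixty second window.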
Encrypted storage techniques utilize a cryptographic key, to be discussed further below, stored on either removable media or a hard drive. The cryptographic key is encrypted with a user's password. After decryption with the user's password, the key is then stored, at least temporarily, in memory of the user's computer system where it is used to either encrypt or decrypt information. As will be recognized by one of ordinary skill, this particular approach is undesirable due to it being susceptible to a dictionary attack, to be discussed in detail further below.
In smart card techniques, a private portion of an asymmetric cryptographic key, to be discussed further below, is stored on a smart card, which is portable. A specialized reader attached to a computer system is used to access the smart card. More particularly, the user enters a PIN (the first factor) to ‘unlock’ the smart card. Once unlocked, the smart card encrypts or decrypts information using the key stored thereon. It should be stressed that in smart card techniques the key never leaves the smart card, unlike in the encrypted storage techniques discussed above. Rather, electronics within the smart card itself perform the encrypting and/or decrypting. Smart card techniques are associated with certain problems. These problems include the fact that the technique is costly to implement, due to hardware costs. Further, a lack of readers makes use of a user's smart card difficult, and smart cards themselves are subject to loss.
Before discussing in detail the more sophisticated conventional techniques for authentication, which are based upon split key technology, let us briefly describe symmetric and asymmetric key cryptography.
In symmetric key cryptography, the two parties who want to communicate in private share a common secret key, say K. The sender encrypts messages with K, to generate a ciphertext, i.e. C=Encrypt(M,K). The receiver decrypts the ciphertext to retrieve the message, i.e. M=Decrypt(C,K). An attacker who does not know K, and sees C, cannot successfully decrypt the message, if the underlying algorithms are strong. Examples of such systems are Triple DES (3DES) and RC4. Encryption and decryption with symmetric keys provide a confidentiality, or privacy, service.
Symmetric keys can also be used to provide integrity and authentication of messages in a network. Integrity and authentication mean that the receiver knows who sent a message and that the message has not been modified, so it is received as it was sent. Integrity and authentication are achieved by attaching a Message Authentication Code (MAC) to a message M. E.g., the sender computes S=MAC(M,K) and attaches S to the message M. When the message M reaches the destination, the receiver also computes S′=MAC(M,K) and compares S′ with the transmitted value S. If S′=S the verification is successful, otherwise verification fails and the message should be rejected. Early MACs were based on symmetric encryption algorithms such as DES, whereas more recently MACs are constructed from message digest functions, or “hash” functions, such as MD5 and SHA-1. The current Internet standard for this purpose is known as hash-based MAC (HMAC).
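The MAC computation and verification described above can be sketched with the HMAC construction. SHA-256 is chosen here as the underlying hash purely for illustration (the text names MD5 and SHA-1), and the function names mac and verify are assumptions of this sketch.

```python
import hashlib
import hmac


def mac(message: bytes, key: bytes) -> bytes:
    """Sender side: S = MAC(M, K), here HMAC with SHA-256."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(message: bytes, tag: bytes, key: bytes) -> bool:
    """Receiver side: recompute S' = MAC(M, K) and compare with the received S."""
    return hmac.compare_digest(mac(message, key), tag)


K = b"shared secret key"
M = b"transfer 100 shares"
S = mac(M, K)

print(verify(M, S, K))                        # unmodified message verifies
print(verify(b"transfer 900 shares", S, K))   # modified message is rejected
```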
By combining confidentiality with integrity and authentication, it is possible to achieve both services with symmetric key cryptography. It is generally accepted that different keys should be used for these two services and different keys should be used in different directions between the same two entities for the same service. Thus if Alice encrypts messages to Bob with a shared key K, Bob should use a different shared key K′ to encrypt messages from Bob to Alice. Likewise Alice should use yet another key K″ for MACs from Alice to Bob and Bob should use K′″ for MACs from Bob to Alice. Since this is well understood by those skilled in the art, we will follow the usual custom of talking about a single shared symmetric key between Alice and Bob, with the understanding that strong security requires the use of four different keys.
Symmetric key systems have always suffered from a major problem—namely how to perform key distribution. How do Bob and Alice agree on K? Asymmetric key cryptography was invented to solve this problem. Here every user is associated with two keys, which are related by special mathematical properties. These properties result in the following functionality: a message encrypted with one of the two keys can then only be decrypted with the other.
One of these keys for each user is made public and the other is kept private. Let us denote the former by E, and the latter by D. So Alice knows Dalice, and everyone knows Ealice. To send Alice the symmetric key K, Bob simply sends C=Encrypt(K,Ealice). Alice, and only Alice (since no one else knows Dalice), can decrypt the ciphertext C to recover the message, i.e. Decrypt(C,Dalice)=K. Now both Alice and Bob know K and can use it for encrypting subsequent messages using a symmetric key system. Why not simply encrypt the message itself with the asymmetric system? This is simply because in practice all known asymmetric systems are fairly inefficient, and while they are perfectly useful for encrypting short strings such as K, they are inefficient for large messages.
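As a toy illustration of the key transport just described, the following sketch uses textbook RSA style arithmetic with deliberately tiny numbers (modulus 3233 = 61 × 53, public exponent 17, private exponent 2753). These values, and the helper names encrypt and decrypt, are assumptions chosen so the arithmetic is easy to follow; they offer no security.

```python
# Toy key pair for Alice (illustrative values only, far too small for real use).
N = 3233          # modulus, 61 * 53
E_ALICE = 17      # public key, known to everyone
D_ALICE = 2753    # private key, known only to Alice


def encrypt(value: int, e: int, n: int) -> int:
    """C = Encrypt(K, Ealice): modular exponentiation with the public key."""
    return pow(value, e, n)


def decrypt(cipher: int, d: int, n: int) -> int:
    """K = Decrypt(C, Dalice): modular exponentiation with the private key."""
    return pow(cipher, d, n)


K = 1234                           # short symmetric key Bob wants to share
C = encrypt(K, E_ALICE, N)         # Bob sends C to Alice
recovered = decrypt(C, D_ALICE, N) # only the holder of D_ALICE can do this
print(recovered == K)
```

After this exchange both parties hold K and can switch to a faster symmetric system for the bulk of the traffic.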
The above illustrates how asymmetric cryptography can solve the key distribution problem. Asymmetric cryptography can also be used to solve another important problem, that of digital signatures. To sign a message M, Alice encrypts it with her own private key to create S=Encrypt(M,Dalice). She can then send (M,S) to the recipient who can then decrypt S with Alice's public key to generate M′, i.e. M′=Decrypt(S,Ealice). If M′=M then the recipient has a valid signature as only someone who has Dalice, by definition only Alice, can generate S, which can be decrypted with Ealice to produce M. To convey the meaning of these cryptographic operations more clearly they are often written as S=Sign(M,Dalice) and M′=Verify(M,S,Ealice). It is worth noting that asymmetric key digital signatures provide non-repudiation in addition to the integrity and authentication achieved by symmetric key MACs. With MACs the verifier can compute the MAC for any message M of his choice since the computation is based on a shared secret key. With digital signatures this is not possible since only the sender has knowledge of the sender's private key required to compute the signature. The verifier can only verify the signature but not generate it. It will be recognized by those with ordinary skill in this art that there are numerous variations and elaborations of these basic cryptographic operations of symmetric key encryption, symmetric key MAC, asymmetric key encryption and asymmetric key signatures.
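The signing and verification steps above can be sketched with the same kind of toy arithmetic. The numbers (modulus 3233, public exponent 17, private exponent 2753) and the function names are illustrative assumptions; practical schemes sign a hash of the message with much larger keys and padding, which this sketch omits.

```python
# Toy key pair for Alice (illustrative values only).
N = 3233
E_ALICE = 17      # public
D_ALICE = 2753    # private


def sign(message: int, d: int, n: int) -> int:
    """S = Sign(M, Dalice): apply the private key to the message."""
    return pow(message, d, n)


def verify(message: int, signature: int, e: int, n: int) -> bool:
    """Apply the public key to S and check that the result equals M."""
    return pow(signature, e, n) == message


M = 65
S = sign(M, D_ALICE, N)

print(verify(M, S, E_ALICE, N))      # genuine (M, S) pair verifies
print(verify(M + 1, S, E_ALICE, N))  # altered message does not
```

Unlike the MAC case, verify( ) needs only the public values, so the verifier cannot produce S for a message of his own choosing.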
The RSA cryptosystem is one system that implements asymmetric cryptography as described above. In particular the RSA cryptosystem allows the same public-private key pair to be used for encryption and for digital signatures. It should be noted there are other asymmetric cryptosystems which implement encryption only, e.g., ElGamal, or digital signature only, e.g., DSA. Technically the public key in RSA is a pair of numbers E, N and the private key is the pair of numbers D, N. When N is not relevant to the discussion it is commonplace to refer to the public key as E and the private key as D.
Finally, the above description does not answer the important question of how Bob gets Alice's public key Ealice. The process for getting and storing the binding [Alice, Ealice] which binds Ealice to Alice is tricky. The most practical method appears to be to have the binding signed by a common trusted authority. So such a “certificate authority” (CA) can create CERTalice=Sign([Alice, Ealice], Dca). Now CERTalice can be verified by anyone who knows the CA's public key Eca. So in essence, instead of everyone having to know everyone else's public key, everyone only need know a single public key, that of the CA. More elaborate schemes with multiple Certificate Authorities, sometimes having a hierarchical relationship, have also been proposed.
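A certificate of the form CERTalice=Sign([Alice, Ealice], Dca) might be sketched as below. The CA key pair is built from two small known primes and the binding is hashed with SHA-256 before signing; both choices, together with the names digest, sign and verify, are assumptions of this sketch and do not reflect any real certificate format. The modular inverse via pow( ) requires Python 3.8 or later.

```python
import hashlib

# Toy CA key pair built from two known Mersenne primes (illustrative only).
P, Q = 2**31 - 1, 2**61 - 1
N_CA = P * Q
E_CA = 65537                              # CA public key, known to everyone
D_CA = pow(E_CA, -1, (P - 1) * (Q - 1))   # CA private key


def digest(binding: tuple) -> int:
    """Map the binding [name, public key] to an integer below N_CA."""
    data = repr(binding).encode("utf-8")
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N_CA


def sign(binding: tuple, d: int, n: int) -> int:
    """CERT = Sign(binding, Dca)."""
    return pow(digest(binding), d, n)


def verify(binding: tuple, cert: int, e: int, n: int) -> bool:
    """Anyone holding Eca can check that the CA vouched for this binding."""
    return pow(cert, e, n) == digest(binding)


binding = ("Alice", 17)                  # [Alice, Ealice] with a toy public key
CERT_ALICE = sign(binding, D_CA, N_CA)

print(verify(binding, CERT_ALICE, E_CA, N_CA))
print(verify(("Eve", 17), CERT_ALICE, E_CA, N_CA))
```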
Asymmetric key cryptosystems have been around for a long time, but have found limited use. The primary reasons are twofold: (a) the private key D in most systems is long, which means that users cannot remember it, and it has to be either stored on every computer they use, or carried around on smart cards or other media; and (b) the infrastructure for ensuring a certificate is valid, which is critical, is cumbersome to build, operate, and use. The first technique proposed to validate certificates was to send every recipient a list of all certificates that had been revoked. This clearly does not scale well to an environment with millions of users. The second method proposed was to require that one inquire about the validity of a certificate on-line, which has its own associated problems.
A system based on split private key cryptography has been developed to solve these two issues, among others. In this system the private key for Alice, i.e. Dalice, is further split into two parts, a part Daa which Alice knows, and a part Das which is stored at a security server. To sign a message, Alice could perform a partial encryption to generate a partial signature, i.e. PS=Sign(M,Daa). Alice then sends PS to the server, which ‘completes’ the signature by performing S=Sign(PS,Das). This completed signature S is indistinguishable from one generated by the original private key, so the rest of the process works as previously described. However, Daa can be made short, which allows the user to remember it as a password, so this system is consumer friendly. Further, if the server is informed that a particular ID has been revoked, then it will cease to perform its part of the operation for that user, and consequently no further signatures can ever be performed. This provides for instant revocation in a simple, highly effective fashion. It will be recognized by those with ordinary skill in the art that use of a split private key for decryption purposes can be similarly accomplished, and that the partial signatures (or decryptions) may be generated in the opposite sequence, that is first on the security server and subsequently by the user's computer, or even be computed concurrently in both places and then combined.
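One way to realize such a split, shown here only as a sketch, is a multiplicative split of a toy RSA private exponent: the user's part and the server's part are chosen so that their product is congruent to the full private exponent modulo φ(N). The multiplicative construction, the tiny numbers, and the names SERVER_SHARES, REVOKED and server_complete are all assumptions of this illustration. The modular inverse via pow( ) requires Python 3.8 or later.

```python
# Toy RSA parameters (illustrative values only).
N, E, D = 3233, 17, 2753
PHI = 60 * 52                 # (61 - 1) * (53 - 1) = 3120

D_AA = 7                      # Alice's short part (coprime to PHI)
D_AS = (D * pow(D_AA, -1, PHI)) % PHI   # server's part: D_AA * D_AS = D mod PHI

SERVER_SHARES = {"alice": D_AS}
REVOKED = set()


def user_partial_sign(message: int) -> int:
    """Alice's step: partial signature with her own part of the key."""
    return pow(message, D_AA, N)


def server_complete(userid: str, partial: int) -> int:
    """Server's step: complete the signature unless the ID has been revoked."""
    if userid in REVOKED:
        raise PermissionError("key revoked; server will not complete signatures")
    return pow(partial, SERVER_SHARES[userid], N)


M = 65
PS = user_partial_sign(M)
S = server_complete("alice", PS)

print(S == pow(M, D, N))   # same value the unsplit private key would produce
print(pow(S, E, N) == M)   # verifies under the ordinary public key
```

Adding "alice" to REVOKED makes server_complete( ) refuse, so no further signatures can be completed even though Alice still holds her own part.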
Let us return now to password based systems. Challenge-response systems solve the issue of having to send passwords in the clear across a network. If the computer and Alice share a secret password, P, then the computer can send her a new random challenge, R, at the time of login. Alice computes C=Encrypt(R,P) and sends back C. The computer decrypts C to obtain R′=Decrypt(C,P). If R′=R, then the computer can trust that it is Alice at the other end. Note however that the computer had to store P. A more elegant solution can be created using asymmetric cryptography. Now Alice has a private key Dalice, or in a split private key system she has Daa. The computer challenges her to sign a new random challenge R. She signs the challenge, or in the split private key system she interacts with the security server to create the signature, and sends it back to the computer, which uses her public key, retrieved from a certificate, to verify the signature. Observe that the computer does not have to know her private key, and that an eavesdropper observing the signature on R gains no knowledge of her private key.
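Both challenge-response variants can be sketched briefly. In the shared secret variant, an HMAC keyed with the password is substituted for Encrypt( ), so the computer recomputes the response rather than decrypting it; this substitution is made only because the standard library offers no cipher. The signature variant uses toy RSA numbers. All names and values are assumptions of this sketch.

```python
import hashlib
import hmac
import secrets


# Variant 1: shared password P (HMAC stands in for Encrypt; the computer stores P).
def respond_shared(challenge: bytes, password: str) -> bytes:
    return hmac.new(password.encode("utf-8"), challenge, hashlib.sha256).digest()


def check_shared(challenge: bytes, response: bytes, stored_password: str) -> bool:
    return hmac.compare_digest(respond_shared(challenge, stored_password), response)


# Variant 2: Alice signs the challenge; the computer needs only her public key.
N, E_ALICE, D_ALICE = 3233, 17, 2753   # toy values, illustrative only


def respond_signed(challenge: int) -> int:
    return pow(challenge, D_ALICE, N)


def check_signed(challenge: int, signature: int) -> bool:
    return pow(signature, E_ALICE, N) == challenge


R1 = secrets.token_bytes(16)   # fresh random challenge for each login
print(check_shared(R1, respond_shared(R1, "apple23"), "apple23"))

R2 = secrets.randbelow(N)      # fresh random challenge below the toy modulus
print(check_signed(R2, respond_signed(R2)))
```

Because a new R is drawn for every login, a response captured by an eavesdropper is of no use for a later challenge.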
The SSL system, which is widely used on the Internet, in effect implements a more elaborate method of exactly this protocol. SSL has two components. The first is ‘server side SSL’, in which a server proves its identity by correctly decrypting a particular message during connection set-up. As browsers such as Netscape and Microsoft Internet Explorer come loaded with the public keys of various CAs, the browser can verify the certificate of the server and use the public key therein for encryption. This authenticates the server to the client, and also allows for the set-up of a session key K, which is used to encrypt and MAC all further communications. Server side SSL is widely used, as the complexity of managing certificates rests with system administrators of web sites who have the technical knowledge to perform this function. The converse function in SSL, client side SSL, which lets a client authenticate herself to a server by means of a digital signature, is rarely used, because although the technical mechanism is much the same, it now requires users to manage certificates and long private keys, which has proven to be difficult, unless they use the split private key system. So in practice, most Internet web sites use server side SSL to authenticate themselves to the client, and to obtain a secure channel, and from then on use Userid, Password pairs to authenticate the client.
So far from disappearing, the use of passwords has increased dramatically. Passwords themselves are often dubbed as inherently “weak” which is inaccurate, because if they are used carefully passwords can actually achieve “strong” security. As discussed earlier passwords should not be sent over networks, and if possible should not be stored on the receiving computer. Instead, in a “strong” system, the user can be asked to prove knowledge of the password without actually revealing the password. And perhaps most critically passwords should not be vulnerable to dictionary attacks.
Introduced above, dictionary attacks can be classified into three types. In all three types the starting point is a ‘dictionary’ of likely passwords. Unless the system incorporates checks to prevent it, users tend to pick poor passwords, and compilations of lists of widely used poor passwords are widely available.
On line dictionary attack: Here the attacker types in a guess at the password from the dictionary. If the attacker is granted access to the computer, they know the guess was correct. These attacks are normally prevented by locking the user account if there is an excessive number of wrong tries. Note that this very commonly used defense prevents one problem, but creates another one. An attacker can systematically go through and lock out the accounts of hundreds or thousands of users. Although the attacker does not gain access, legitimate users now cannot access their own accounts either, creating a denial of service problem.
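The lockout defense, and the denial of service it invites, can be seen in a few lines. The threshold of three failures, the in-memory tables and the function name login are assumptions of this sketch; passwords are kept in the clear only to keep the example short.

```python
MAX_FAILURES = 3   # hypothetical lockout threshold

PASSWORDS = {"alice": "apple23", "bob": "pear77"}
failures = {}
locked = set()


def login(userid: str, password: str) -> bool:
    """Reject locked accounts; lock an account after too many wrong guesses."""
    if userid not in PASSWORDS or userid in locked:
        return False
    if PASSWORDS[userid] == password:
        failures[userid] = 0
        return True
    failures[userid] = failures.get(userid, 0) + 1
    if failures[userid] >= MAX_FAILURES:
        locked.add(userid)
    return False


# An attacker who only wants to disrupt service submits wrong guesses for alice.
for guess in ["password", "123456", "letmein"]:
    login("alice", guess)

print(login("alice", "apple23"))  # the legitimate user is now locked out
print(login("bob", "pear77"))     # accounts not targeted are unaffected
```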
Encrypt dictionary attacks: If somewhere in the operation of the system a ciphertext C=Encrypt(M,P) was created, and the attacker has access to both C and M, then the attacker can compute off-line C1=Encrypt(M,G1), C2=Encrypt(M,G2), . . . where G1, G2, . . . etc. are the guesses at the password P from the dictionary. The attacker stops when he finds a Cn=C, and knows that Gn=P. Observe that the UNIX password scheme described above, which uses a one way function F( ) instead of an encryption function Encrypt( ), is vulnerable to this attack.
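The off-line search just described is straightforward to sketch for the one way function case. SHA-256 again stands in for F( ), and the small word list is a made-up example; both are assumptions of this illustration.

```python
import hashlib


def F(password: str) -> str:
    """Stand-in one way function (assumption: SHA-256 hex digest)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def dictionary_attack(target: str, guesses: list):
    """Compute F(G1), F(G2), ... off-line and stop at the first match."""
    for guess in guesses:
        if F(guess) == target:
            return guess
    return None


stolen_value = F("apple23")   # what the attacker obtained from the password table
dictionary = ["password", "123456", "letmein", "apple23", "qwerty"]

print(dictionary_attack(stolen_value, dictionary))
```

No interaction with the computer is needed during the search, so account lockout offers no protection against it.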
Decrypt dictionary attacks: Here the attacker does not know M, and only sees the ciphertext C (where C=Encrypt(M,P)). The system is only vulnerable to this attack if M has some predictable structure. So the attacker tries M1=Decrypt(C,G1), M2=Decrypt(C,G2), . . . and stops when the Mi has the structure he is looking for. For instance Mi could be known to be a timestamp, English text, or a number with special properties such as a prime, or a composite number with no small factors. Those with ordinary skill in the art will recognize there are numerous variations of the encrypt and decrypt dictionary attacks.
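A decrypt dictionary attack can be illustrated with a toy cipher. The cipher below XORs the message with a keystream derived from the password by repeated SHA-256 hashing; it is a hypothetical construction used only so the example runs with the standard library, and the structure test (a ten digit timestamp) is likewise an assumption.

```python
import hashlib


def keystream(password: str, length: int) -> bytes:
    """Toy keystream: SHA-256 of password plus a counter, repeated as needed."""
    out = b""
    counter = 0
    while len(out) < length:
        block = password.encode("utf-8") + counter.to_bytes(4, "big")
        out += hashlib.sha256(block).digest()
        counter += 1
    return out[:length]


def encrypt(message: bytes, password: str) -> bytes:
    ks = keystream(password, len(message))
    return bytes(m ^ k for m, k in zip(message, ks))


decrypt = encrypt  # XOR with the same keystream undoes the encryption


def looks_like_timestamp(candidate: bytes) -> bool:
    """The predictable structure the attacker tests for: ten ASCII digits."""
    return len(candidate) == 10 and candidate.isdigit()


def decrypt_dictionary_attack(cipher: bytes, guesses: list):
    """Try Decrypt(C, G1), Decrypt(C, G2), ... until the structure appears."""
    for guess in guesses:
        if looks_like_timestamp(decrypt(cipher, guess)):
            return guess
    return None


C = encrypt(b"1700000000", "apple23")   # attacker sees only C
dictionary = ["password", "123456", "letmein", "apple23", "qwerty"]
print(decrypt_dictionary_attack(C, dictionary))
```

A wrong guess yields ten essentially random bytes, which are overwhelmingly unlikely to all be digits, so the structure test singles out the correct password.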
In split private key systems the user portion of the private key, referred to as Daa above, may come from the user's password only. Thus, a compromise of the password, i.e., another person learning a user's password, results in a compromise of the split private key system. Also, because the user portion of the private key comes from the user's password only, there still remains the possibility of a dictionary attack following a compromise of the server portion of the private key, referred to as Das above. That is, knowledge of Das enables a dictionary attack on Daa. Further, and as discussed above, existing multiple factor systems that overcome these problems rely upon expensive hardware. Because of this and other reasons, such systems have failed to gain support. Thus, there remains a need for a multifactor cryptographic system which overcomes the problems of the prior art.