Authentication and Man in the Middle Phishing
Many authentication techniques exist which allow an entity (e.g. a user or a web server) to prove its identity to another entity. Often these systems are based on the existence of a shared secret. For instance, revealing knowledge of a shared password is a very common method. Or, one can have ‘one time passwords’ which are generated based on a shared secret, with both parties having the ability to compute the one time password.
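As an illustration of how both parties can compute the same one time password from a shared secret, the following is a minimal sketch. It borrows the counter-based HMAC construction standardized as HOTP (RFC 4226), which the text above does not name; the function names, the six digit length and the thirty second period are assumptions made for the example.

```python
import hashlib
import hmac
import struct


def one_time_password(shared_secret: bytes, counter: int, digits: int = 6) -> str:
    """Derive a short numeric code from a shared secret and a counter (HOTP style)."""
    # Both parties feed the same secret and counter into an HMAC.
    mac = hmac.new(shared_secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation: the low 4 bits of the last byte select a 4-byte window.
    offset = mac[-1] & 0x0F
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


def time_step(unix_time: float, period: int = 30) -> int:
    """Map a clock reading to a counter that changes every `period` seconds (assumed)."""
    return int(unix_time // period)
```

A token and a server holding the same secret would each call `one_time_password(secret, time_step(now))` and compare the results; neither side ever transmits the secret itself.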
Some systems do not rely on shared secrets, and instead use a technique called public key cryptography. Here the user proves knowledge of a secret, for instance by using it to sign a message, but does not reveal the secret itself. The signature is typically verified using information that is unique to the user, but is public. Public key cryptography is typically deployed using digital certificates, which bind a public key to an identity. In general, systems based on public key cryptography are considered more secure, but are not as widely used because they are cumbersome (especially for human users, as opposed to computer servers).
Almost all of these techniques are vulnerable to the insertion of an attacker in between the legitimate parties. Such an attack is known as a man in the middle (MITM) attack. This has led to the widespread incidence of so-called phishing attacks. Two types of phishing attacks exist: off-line and real time. In the off-line case, the MITM simply fools the user into giving up their secret, and at a later time enters that secret into the legitimate web site. In the real time case, the MITM ferries traffic back and forth between the user and the legitimate web site as it happens. In this case, even if the secret is short lived, e.g. a hardware token with a secret number that changes every thirty seconds, the session can be phished. Both kinds of attack have been widely observed in practice.
It is important to note that the use of stronger techniques like public key cryptography does not by itself guarantee protection against a man in the middle. Consider a simple example. Assume the web server requires the user to use public key cryptography to sign a fresh challenge in order to authenticate. In this case a real time MITM could simply get the challenge from the web server and transmit that challenge to the user, who will sign it and return it to the MITM, who in turn returns it to the legitimate web server. The web server is satisfied and will let the MITM access the system! While this simple use of public key cryptography is easily seen to be insecure, more secure protocols exist which prevent MITM attacks. For instance, the Secure Sockets Layer (SSL) protocol, when used with mutual authentication (defined later), can thwart a MITM attacker. While SSL is very widely used, it is rarely used with mutual authentication.
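The relay described in this example can be sketched in a few lines. The sketch below is only an illustration: it uses a textbook toy RSA key that is far too small for real use, and the class and function names (`User`, `WebServer`, `relay_attack`) are hypothetical.

```python
import hashlib
import secrets

# Toy RSA key (textbook example, far too small for real use).
# n = 61 * 53, and e * d = 1 (mod 3120).
N, E, D = 3233, 17, 2753


def digest(message: bytes) -> int:
    """Hash a message and reduce it so it fits under the toy modulus."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


class User:
    """Holds the private exponent and signs whatever challenge it is shown."""

    def sign(self, challenge: bytes) -> int:
        return pow(digest(challenge), D, N)


class WebServer:
    """Issues a fresh challenge and checks the signature with the public key."""

    def __init__(self) -> None:
        self.pending = None

    def new_challenge(self) -> bytes:
        self.pending = secrets.token_bytes(16)
        return self.pending

    def accept(self, signature: int) -> bool:
        ok = self.pending is not None and pow(signature, E, N) == digest(self.pending)
        self.pending = None  # each challenge may be used only once
        return ok


def relay_attack(server: WebServer, user: User) -> bool:
    """The MITM never learns the private key; it only forwards messages."""
    challenge = server.new_challenge()  # MITM poses as the user to the server
    signature = user.sign(challenge)    # MITM poses as the server to the user
    return server.accept(signature)     # the server authenticates the MITM's session
```

Because the signed challenge is not bound to the connection it arrived on, `relay_attack(WebServer(), User())` succeeds even though the signature scheme itself is never broken.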
The Secure Sockets Layer Protocol
The SSL protocol (which has been renamed the Transport Layer Security or TLS protocol) is one of the most widely used security protocols on the Internet. As is evident from the name it has been designed to be a two entity protocol most generally used to secure “sockets” (or more generally the “transport layer” in a communication protocol). For instance, on the Internet, which uses TCP/IP, SSL is used to take a “TCP socket” between two entities and make it a “secure socket”. Once such a “secure socket” has been established, application level protocols like HTTP can be run between the two entities over the secure socket (HTTP over SSL is at times referred to as HTTPS for brevity).
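As a rough illustration of turning a "TCP socket" into a "secure socket", the sketch below uses Python's standard `ssl` module. The helper names and the default port are assumptions made for the example; any host name passed in is the caller's choice.

```python
import socket
import ssl


def client_context() -> ssl.SSLContext:
    """Default client settings: verify the server certificate and its host name."""
    return ssl.create_default_context()


def open_secure_socket(host: str, port: int = 443) -> ssl.SSLSocket:
    """Open a plain TCP socket, then run the handshake on top of it."""
    tcp_socket = socket.create_connection((host, port))
    # wrap_socket performs the handshake and returns the secure socket.
    return client_context().wrap_socket(tcp_socket, server_hostname=host)
```

An application protocol such as HTTP would then be written to and read from the returned socket exactly as it would be over the plain TCP socket.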
Note that while SSL is described as an end to end protocol, the actual packets carrying the SSL traffic might go through many intermediate hops. For example, in the classic case where TCP/IP is used as the transport, the IP packets might traverse many different nodes. However, the intermediate nodes play no part in processing the SSL messages; for them it is simply data being transported. Similarly, others have proposed or implemented SSL as a two entity protocol used over wireless links, over datagram services, over the SOAP standard, etc. None of these variations changes the fundamental two entity, end to end authentication and key exchange purpose of SSL, and the presence of intermediate points plays no role in the processing of SSL. This work has no bearing on our invention, which will introduce an active man in the middle necessary for correct protocol functioning.
There have also been numerous implementations of what are sometimes referred to as ‘SSL proxies’. Here there is a proxy or gateway between the end points. However, there is no longer one SSL connection between the end points. Rather, there is an SSL connection from one end point to the gateway, and then another SSL connection between the gateway and the other end point. This also has no bearing on our work, which is focused on a single SSL session with end to end security.
SSL can be used to perform three functions to secure a connection between Entity 1 and Entity 2:
        1. Authenticate Entity 2 (often a web service) to Entity 1 (often a user at a browser), if Entity 2 has a trusted digital certificate.
        2. Provide for encrypted communication between Entity 1 and Entity 2.
        3. Optionally, authenticate Entity 1 to Entity 2 (mutual authentication), if Entity 1 has a trusted digital certificate.
In general, when Entity 1 is a user at a browser and Entity 2 is a web server, only the first two functions are used. As an example, any user can visit the USPTO at http://sas.uspto.gov/ptosas/ and set up an HTTP over SSL connection. However, at that point, while the user has authenticated the USPTO web site (and has an encrypted session), the USPTO has not authenticated the user. For this to happen the user would need a digital certificate.
In practice it is easy for organizations/servers to possess digital certificates. For instance, in the example above, the USPTO could have purchased the digital certificate it uses to secure its web site for less than ten dollars and have set it up in a few minutes. On the other hand, it has proven very difficult for individual users to obtain, carry and use certificates. As an example, the USPTO has a program to issue customers with certificates (see https://sas.uspto.gov/enroll/traditional-client-zf-create.html), and it can be easily seen that giving users certificates and managing them on an ongoing basis is difficult and costly.
For these reasons, SSL is typically used to authenticate a web site (e.g. USPTO) to a user's browser, but not typically the other way around. The exception to this would be when SSL is used to secure server to server communication. As it is simple for both servers to be set up with digital certificates, in such cases SSL is often used with mutual authentication.
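The difference between the common one-sided case and mutual authentication comes down to whether the server asks the client for a certificate. A minimal sketch using Python's standard `ssl` module follows; the helper name `server_context` is an assumption, and a real deployment would also need to load the server's own certificate and the list of trusted authorities (via `load_cert_chain` and `load_verify_locations`).

```python
import ssl


def server_context(require_client_cert: bool) -> ssl.SSLContext:
    """Build a server-side context with or without mutual authentication."""
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # On a server context, CERT_REQUIRED makes the server request a client
    # certificate and fail the handshake if none is presented; CERT_NONE
    # leaves the client unauthenticated at the SSL layer.
    context.verify_mode = ssl.CERT_REQUIRED if require_client_cert else ssl.CERT_NONE
    return context
```

A typical public web site corresponds to `server_context(False)`, while server to server links of the kind described above correspond to `server_context(True)`.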
SSL works in two steps. First, Entity 1 and Entity 2 perform a ‘handshake’, in the course of which the authentication and the key exchange for encryption are performed and a variety of other parameters are exchanged. Once the ‘handshake’ is complete, the two parties can communicate securely using a shared master_secret. As it is relevant to our future discussion, we will describe the SSL handshake (with mutual authentication). Our description is meant to convey the essence of the protocol, and is not meant to be a detailed description, for which we refer the reader to the Internet standard.
FIG. 1 shows the standard SSL handshake. To begin with, it is assumed that both entities (referred to as Server 1 and Server 2 for convenience) have digital certificates issued by authorities the other party trusts. The protocol begins with a handshake mechanism which consists of four messages:
        SSL-Handshake-1 (aka CLIENT-HELLO): Server 1 sends a message to Server 2 which, among other things, contains a random number, which we call R1. [R1]
        SSL-Handshake-2 (aka SERVER-HELLO): Server 2 replies with another random number R2, its own digital certificate, and a request for mutual authentication (somewhat misleadingly called the Certificate-Request). [R2, Cert2, Request Cert1]
        SSL-Handshake-3 (aka CLIENT-KEY-EXCHANGE): Server 1 verifies the authenticity of Server 2's certificate, and in the process extracts Server 2's public key. It then encrypts a third random number, which we call R3, with this public key. It further signs a running_hash of all messages exchanged up to that point with its own private key. Server 1 then sends the encrypted R3, the signed running_hash, and its own certificate to Server 2. [encrypt(R3, Cert2), Sign(running_hash, Cert1), Cert1] Server 1 also combines R1, R2 and R3 to create a master_secret.
        SSL-Handshake-4 (aka SERVER-FINISHED): On receiving the above message, Server 2 uses its own private key to recover R3 from the encrypted packet. It then verifies the authenticity of Cert1 and extracts Server 1's public key, which it then uses to verify the signature on the running_hash. If the signature is valid, then at this point Server 2 has authenticated Server 1. It then combines R1, R2 and R3 to create the master_secret. Finally, it sends a message to Server 1 encrypted with the master_secret. [encrypt(Done, master_secret)] On receiving this message, Server 1 attempts to decrypt it using the master_secret it independently computed while preparing SSL-Handshake-3. If the decryption is correct, then Server 1 has authenticated Server 2.
Both parties have now authenticated each other and share a secret, the master_secret, which they can use for further communication with each other.
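The four handshake messages can be walked through in a short simulation. This is a toy sketch under stated assumptions, not the protocol as standardized: the two RSA keys are textbook examples far too small for real use, certificates are reduced to bare public keys, the master_secret is formed by hashing R1, R2 and R3 together, and an HMAC over the word "Done" stands in for encrypt(Done, master_secret). All names are hypothetical.

```python
import hashlib
import hmac
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyKey:
    n: int  # modulus
    e: int  # public exponent (stands in for the certificate's public key)
    d: int  # private exponent


KEY1 = ToyKey(n=3233, e=17, d=2753)  # Server 1 (n = 61 * 53)
KEY2 = ToyKey(n=3337, e=79, d=1019)  # Server 2 (n = 47 * 71)


def running_hash(*messages: bytes) -> bytes:
    h = hashlib.sha256()
    for message in messages:
        h.update(message)
    return h.digest()


def as_int(data: bytes, n: int) -> int:
    return int.from_bytes(data, "big") % n


def master_secret(r1: bytes, r2: bytes, r3: int) -> bytes:
    # Assumption: a plain hash stands in for the standard's key derivation.
    return hashlib.sha256(r1 + r2 + r3.to_bytes(2, "big")).digest()


def handshake() -> tuple[bytes, bytes]:
    # SSL-Handshake-1: Server 1 sends R1.
    r1 = secrets.token_bytes(16)
    # SSL-Handshake-2: Server 2 replies with R2 (plus Cert2 and the request for Cert1).
    r2 = secrets.token_bytes(16)
    # SSL-Handshake-3: Server 1 encrypts R3 to Server 2 and signs the running hash.
    r3 = 2 + secrets.randbelow(3000)  # kept small so it fits under the toy modulus
    encrypted_r3 = pow(r3, KEY2.e, KEY2.n)
    transcript = running_hash(r1, r2, encrypted_r3.to_bytes(2, "big"))
    signature = pow(as_int(transcript, KEY1.n), KEY1.d, KEY1.n)
    secret1 = master_secret(r1, r2, r3)
    # SSL-Handshake-4: Server 2 recovers R3, verifies the signature, derives the secret.
    recovered_r3 = pow(encrypted_r3, KEY2.d, KEY2.n)
    if pow(signature, KEY1.e, KEY1.n) != as_int(transcript, KEY1.n):
        raise ValueError("Server 1 failed authentication")
    secret2 = master_secret(r1, r2, recovered_r3)
    finished = hmac.new(secret2, b"Done", hashlib.sha256).digest()
    # Server 1 checks the final message against its own copy of the secret.
    expected = hmac.new(secret1, b"Done", hashlib.sha256).digest()
    if not hmac.compare_digest(finished, expected):
        raise ValueError("Server 2 failed authentication")
    return secret1, secret2
```

Calling `handshake()` returns the master_secret as computed by each side; the two values agree only because Server 2 could recover R3 with its private key, which is what authenticates it to Server 1.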
What we have described is the handshake with mutual authentication, which assumes both parties have certificates. Often one side, typically a user at a browser, will not have a certificate, but the other side, e.g. the USPTO web site, will have a certificate. In this case the web site will not request mutual authentication, and the browser will not sign the running_hash. Otherwise the rest of the protocol remains the same. While such one-sided authentication has some value, the MITM protection only comes into play when mutual authentication is used. This is why phishing has been widespread in spite of SSL being deployed widely.
In the event that two entities have previously exchanged a master_secret, which they have retained, the protocol provides a way for them to resume communications over a new transport, using the existing parameters. In this “abbreviated handshake”:
        The first handshake message from the first entity to the second entity contains the SessionID of the previous session.
        If the second entity is willing and able to resume the previous session, the reply contains the same SessionID, and a message encrypted with the previous master_secret.
        If the first entity successfully decrypts the message then it in effect authenticates the second entity. It then responds with its own message encrypted with the master_secret. The second entity can decrypt this message thus authenticating the first entity.
This allows the two entities to resume the session without having to perform any operations involving public key cryptography (which is resource intensive).
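The abbreviated handshake can be sketched as a lookup of the retained master_secret followed by an exchange of messages that only a holder of that secret could produce. In the sketch below an HMAC stands in for "a message encrypted with the master_secret"; the `SessionCache` class, the `resume` function and the message labels are hypothetical names chosen for the example.

```python
import hashlib
import hmac


class SessionCache:
    """Maps a SessionID to the master_secret retained from an earlier handshake."""

    def __init__(self) -> None:
        self._sessions = {}

    def store(self, session_id: bytes, master_secret: bytes) -> None:
        self._sessions[session_id] = master_secret

    def lookup(self, session_id: bytes):
        return self._sessions.get(session_id)


def proof(master_secret: bytes, label: bytes) -> bytes:
    """Stand-in for a message protected with the master_secret."""
    return hmac.new(master_secret, label, hashlib.sha256).digest()


def resume(first: SessionCache, second: SessionCache, session_id: bytes) -> bool:
    # Message 1: the first entity sends the SessionID of the previous session.
    second_secret = second.lookup(session_id)
    if second_secret is None:
        return False  # the second entity cannot resume; a full handshake is needed
    # Message 2: the second entity replies with a message under the old secret.
    second_message = proof(second_secret, b"second entity finished")
    first_secret = first.lookup(session_id)
    if first_secret is None or not hmac.compare_digest(
        second_message, proof(first_secret, b"second entity finished")
    ):
        return False  # the first entity could not authenticate the second
    # Message 3: the first entity answers with its own message under the secret.
    first_message = proof(first_secret, b"first entity finished")
    return hmac.compare_digest(
        first_message, proof(second_secret, b"first entity finished")
    )
```

Note that no signature or public key operation appears anywhere in `resume`; possession of the retained master_secret is the only thing being checked.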