This disclosure relates generally to computer and network security. Computers which are interconnected via communications networks are now widely used in commerce and government operations. Although networked computers enable performance of a wide variety of useful tasks, they also create vulnerability to remote attacks from other computers. This is problematic because the computers and networks operated by organizations such as businesses and government may store sensitive data and include software which can be used to control operation of physical devices and financial transactions. Consequently, it is desirable to repel attempts by attackers to gain access to computers and computer networks.
A widely used technique for preventing attackers from gaining access to computers and computer networks is password authentication. Password authentication is a process by which a login attempt to a computer or network is authenticated by determining whether a password and username provided by the computer attempting to log in are valid. For example, a user attempting to gain access to a host site may be prompted to send their username and password to the host site. The host site then determines whether the username and password match a username and corresponding password previously stored by the host site. If a matching username and corresponding password are found then access to the host site is granted. If a matching username and corresponding password are not found then access to the host site is denied.
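By way of illustration only, the matching step described above can be sketched as follows. The in-memory table `stored_credentials`, the sample usernames and passwords, and the function name `authenticate` are hypothetical and are not taken from any particular host site; an actual host site would typically consult a database.

```python
import hmac

# Hypothetical credential table: username -> password stored in the clear.
stored_credentials = {"alice": "correct horse", "bob": "hunter2"}


def authenticate(username: str, password: str) -> bool:
    """Return True only if the username exists and the password matches."""
    stored = stored_credentials.get(username)
    if stored is None:
        # No matching username: access is denied.
        return False
    # Constant-time comparison of the stored and received passwords.
    return hmac.compare_digest(stored.encode("utf-8"), password.encode("utf-8"))
```

In this sketch, `authenticate("alice", "correct horse")` grants access, while a wrong password or an unknown username results in denial.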
Password authentication has some vulnerabilities which may enable an attacker to gain access by posing as a legitimate user. For example, weak usernames and passwords may be susceptible to being guessed by an attacker. Human beings tend to select weak usernames and passwords that can be easily committed to memory, such as words or combinations of words in human language. This renders password authentication vulnerable to so-called “dictionary attacks” where an attacker uses a database, such as an English language dictionary, to repeatedly attempt to gain access to a host site using different combinations of database entries until a valid username and corresponding password are found. Another vulnerability is that usernames and passwords can be stolen. For example, an attacker may gain access to a host site and obtain the stored usernames and passwords of multiple users. These vulnerabilities may be exploited in combination, e.g., by stealing usernames and guessing the corresponding passwords via a dictionary attack.
Hashing can be used to mitigate some of the vulnerabilities of password authentication by eliminating host site storage of passwords. For the purposes of the present disclosure, a cryptographically secure hash function, sometimes referred to herein simply as a “hash function,” is a one-way function that maps any variable amount of input data into a fixed-size output value referred to herein as a digest. Different input data results, with overwhelming probability, in different digests, and small changes to the input data do not necessarily correspond to small changes in the resulting digest. A host site may hash data such as passwords and store only the resulting digests so that the passwords are not stored in the clear. When the user sends a password to the host site at login, the host site applies the hash function to the received password to generate a new digest. The host site attempts to find a matching digest associated with the username in a set of stored password digests. Thus, an attacker who gains access to the host site may steal password digests, but not passwords stored in the clear. However, while a hash function is considered effectively irreversible, i.e., it is not computationally feasible to compute the original data from only its digest, there are other known ways by which the original data such as a password can be recovered.
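The digest-based login check described above can be sketched as follows. This is a minimal illustration and not a definitive implementation: the choice of SHA-256 as the hash function, the table `stored_digests`, and the names `digest_of` and `authenticate` are assumptions introduced here for illustration.

```python
import hashlib
import hmac


def digest_of(password: str) -> str:
    """Map a password of any length to a fixed-size hex digest (SHA-256 assumed)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


# Hypothetical table: username -> password digest. No password is stored in the clear.
stored_digests = {"alice": digest_of("correct horse")}


def authenticate(username: str, password: str) -> bool:
    """Hash the received password and compare it with the stored digest."""
    stored = stored_digests.get(username)
    if stored is None:
        return False
    new_digest = digest_of(password)
    # Constant-time comparison of the new digest with the stored digest.
    return hmac.compare_digest(stored, new_digest)
```

In this sketch the digest has the same length regardless of the length of the password, and an attacker who copies `stored_digests` obtains digests rather than passwords.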
One limitation with protecting passwords by hashing is that an attacker may steal usernames and password digests to exploit password weakness to compromise user accounts. As explained above, human beings tend to select passwords that can be committed to memory, such as words. Further, multiple users may select the same word or words as their password. This is problematic because those matching passwords result in the same digest. Matching digests may be identified from stolen password digests and the users with matching digests may be targeted by an attacker. For example, the attacker may learn the password of one of the users using other means, such as a “spear phishing attack,” and then use the password to gain access to accounts of the other users who use the same password as indicated by the matching digests.
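The matching-digest weakness described above can be illustrated with a short sketch. The table `stolen`, the sample usernames and passwords, the use of SHA-256, and the function name `users_sharing_digests` are hypothetical and serve only to show how identical unsalted passwords become visible in a set of digests.

```python
import hashlib
from collections import defaultdict


def digest_of(password: str) -> str:
    """Unsalted digest of a password (SHA-256 assumed for illustration)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


# Hypothetical stolen table: username -> unsalted password digest.
stolen = {
    "alice": digest_of("sunshine"),
    "bob": digest_of("x7#Lq9"),
    "carol": digest_of("sunshine"),
}


def users_sharing_digests(table: dict) -> list:
    """Group usernames by digest and return the groups with more than one user."""
    groups = defaultdict(list)
    for username, digest in table.items():
        groups[digest].append(username)
    return [sorted(users) for users in groups.values() if len(users) > 1]
```

In this sketch, "alice" and "carol" are placed in the same group because their passwords hash to the same digest, so learning the password of either user exposes the account of the other.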
One way to improve the strength of passwords selected by human beings is salting. Salting is a technique for randomly or pseudo-randomly modifying an input. In one example, unique random strings referred to as “salts” or “keys” are combined with passwords so that hashing does not generate matching digests even if the passwords selected by multiple users match. In another example, a keyed hash function uses the password and the salt as separate arguments or inputs. The keyed hash function may combine the password and the salt in a way that is more rigorous than appending the salt to the password. Regardless of which salting technique is used, when creating a user account the user may be asked to enter a username and a password. The host site receives the username and password from the user and generates a unique salt which is combined with the password and hashed to generate a digest. The host site stores the username, the salt, and the digest. The host site may discard the password so that it is not vulnerable to being stolen in the clear. When the user later attempts to log in to the host site, e.g., to access a password protected computer system, the user provides their username and password to the host site password authentication system. The password authentication system attempts to authenticate the login attempt by using the received username to retrieve the corresponding salt and digest. The password authentication system combines the retrieved salt with the password received from the user using the same technique as when the password was initially entered. The password authentication system applies the hash function to the salted password to produce a new digest. The password authentication system compares the new digest to the retrieved digest. If the new digest and the retrieved digest match, the login attempt is determined to be valid. If the new digest and the retrieved digest do not match, the login attempt is determined to be invalid.
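The account creation and login steps described above can be sketched as follows. This is a minimal sketch under stated assumptions: the salt is a 16-byte random value prepended to the password, SHA-256 is used as the hash function, and the table `accounts` and the names `salted_digest`, `register`, and `login` are hypothetical. A keyed hash function that takes the salt and the password as separate inputs could be substituted inside `salted_digest` without changing the surrounding steps.

```python
import hashlib
import hmac
import secrets

# Hypothetical account table: username -> (salt, digest). No password is stored.
accounts = {}


def salted_digest(salt: bytes, password: str) -> bytes:
    """Combine the salt with the password and hash the result (SHA-256 assumed)."""
    return hashlib.sha256(salt + password.encode("utf-8")).digest()


def register(username: str, password: str) -> None:
    """Generate a unique salt, then store the username, salt, and digest."""
    salt = secrets.token_bytes(16)
    accounts[username] = (salt, salted_digest(salt, password))
    # The password itself is discarded when this function returns.


def login(username: str, password: str) -> bool:
    """Retrieve the salt and digest, recompute the digest, and compare."""
    record = accounts.get(username)
    if record is None:
        return False
    salt, stored_digest = record
    new_digest = salted_digest(salt, password)
    return hmac.compare_digest(stored_digest, new_digest)
```

In this sketch, two users who register the same password receive different salts, so their stored digests do not match even though their passwords do.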
Password authentication systems which use salting and hashing as described above are still vulnerable to attack. For example, an attacker may gain access to the host site and steal stored usernames, salts and digests. Later, on an attacker computer, the attacker can repeatedly test possible passwords against the stolen usernames, salts and digests in an “offline attack” until the attacker finds at least one salted password that hashes to a stolen digest. The attacker can then use that information to gain access to the host site.
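The reason salting alone does not prevent an offline attack can be seen in the following sketch, which assumes the same hypothetical scheme of prepending the salt to the password and applying SHA-256. The fixed salt, the sample password, the candidate list, and the function name `offline_guess` are illustrative assumptions. Because the salt is stored alongside the digest, each candidate password can be tested without contacting the host site.

```python
import hashlib


def salted_digest(salt: bytes, password: str) -> bytes:
    """Same hypothetical salting scheme as the host site: SHA-256 over salt + password."""
    return hashlib.sha256(salt + password.encode("utf-8")).digest()


def offline_guess(salt: bytes, stolen_digest: bytes, candidates: list):
    """Return the first candidate whose salted digest equals the stolen digest, else None."""
    for candidate in candidates:
        if salted_digest(salt, candidate) == stolen_digest:
            return candidate
    return None


# Illustrative stolen record for one user (fixed salt chosen only for this example).
salt = bytes.fromhex("00112233445566778899aabbccddeeff")
stolen_digest = salted_digest(salt, "sunshine")

# Hypothetical list of commonly chosen passwords.
candidates = ["password", "letmein", "sunshine"]
recovered = offline_guess(salt, stolen_digest, candidates)
```

In this sketch the weak password is recovered because it appears in the candidate list, whereas a password absent from the list is not recovered.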