Computer system user authentication, that is, providing evidence of a user's identity to a computer system and then verification of that evidence by the computer system, has become a familiar task for millions of computer system users. User authentication is a cornerstone of computer system security. The ability for a computer system to authenticate user identity enables the enforcement of computer system security policies from simple yes/no data access to sophisticated resource privilege management.
It is common to classify user authentication into three main types based on how the user provides evidence of their identity to the computer system. In knowledge-based user authentication, the user provides information to the computer system that the user has previously memorized, such as a password. In token-based user authentication, the possession of a particular object, such as a key card incorporating computer-readable media, provides the evidence. In biometric user authentication, the user's biological characteristics, such as the user's fingerprints, are detected by the computer system and compared to the characteristics on record. In addition, combinations of these authentication types are possible. For example, an automated teller machine (ATM) requires a user to provide both a token (an ATM card) and memorized information (a personal identification number or PIN) before it will dispense cash.
Knowledge-based user authentication is popular, at least in part because it is relatively inexpensive and convenient. A user theoretically always has their authentication credentials with them and typically no specialized computer system equipment, such as a card reader or biometric detector, is required to input the authentication information. However, there is a conflict at the heart of conventional knowledge-based user authentication that undermines its effectiveness in practice: authentication information that provides the best evidence of a user's identity, i.e., authentication information in an information space with high entropy, is often difficult to remember.
In the context of knowledge-based user authentication, an information space with high entropy is an information space that contains a relatively high number of possible authentication values, and an information space with low entropy contains a relatively low number of possible values. The entropy of an information space is conveniently measured in bits. For example, if the authentication information to be provided by the user is the result of a previous coin toss, the associated information space contains only ‘heads’ and ‘tails’. In this case the information space has an entropy of 1 bit. If the authentication information to be provided by the user is the result of a previous throw of a die (ideally known only to the user and the computer system), the information space contains six values and so has an entropy of approximately 2.6 bits (log2(6)≈2.6). If the authentication information to be provided by the user is a word randomly selected from a dictionary of one hundred thousand words, the information space (the dictionary) has an entropy of approximately 17 bits (log2(100000)≈17).
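The entropy figures above can be reproduced with a short calculation. The following is a minimal sketch, assuming every value in the information space is equally likely to be chosen; the function name `entropy_bits` is illustrative only and does not appear in the original text.

```python
import math


def entropy_bits(space_size: int) -> float:
    """Entropy, in bits, of one uniformly random choice from space_size values.

    Assumption: all values are equally likely. A non-uniform choice
    would have lower entropy than this figure.
    """
    if space_size < 1:
        raise ValueError("an information space must contain at least one value")
    return math.log2(space_size)


coin_bits = entropy_bits(2)              # coin toss: exactly 1 bit
die_bits = entropy_bits(6)               # six-sided die: roughly 2.6 bits
dictionary_bits = entropy_bits(100_000)  # 100,000-word dictionary: roughly 17 bits
```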
The result of a single coin toss is relatively easy to remember, but it is relatively poor evidence of a computer system user's identity. If the result of a single coin toss is the information that a user must know in order to authenticate, an attacker (someone who is not the user but who is attempting to authenticate as the user) has a 50% chance of authenticating as the user on the first attempt and a 100% chance given two attempts. Knowing a random selection from a one hundred thousand word dictionary provides better evidence of a computer system user's identity. The probability is much lower that an attacker will simply guess the word in the first few attempts. Even given the dictionary and ten attempts, the probability of authenticating as the user is only 1 in 10,000 if the attacker is guessing randomly. However, an attacker with the dictionary and one hundred thousand attempts to guess the word still has a 100% chance of authenticating as that computer system user. There are circumstances in which one hundred thousand guess attempts are well within the capabilities of an attacker equipped with a modern computer system, so that while the dictionary has higher entropy than the single coin toss, its entropy is still insufficient in those circumstances. In addition, even where measures are taken to limit the number of possible guess attempts, a group of users may still be vulnerable, for example, to a “horizontal attack.” That is, if each of the fifty thousand users of an organization selects a password from a one hundred thousand word dictionary, then an attacker with just two random guesses for each user still has a better-than-even chance of guessing the password of at least one of the organization's users.
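The guessing probabilities discussed above can be checked numerically. The sketch below assumes that each user's secret is chosen uniformly and independently from the information space and that the attacker never repeats a guess against the same user; the function names are hypothetical and introduced only for illustration.

```python
def guess_success_probability(space_size: int, attempts: int) -> float:
    """Chance that an attacker making `attempts` distinct guesses hits one
    secret drawn uniformly from `space_size` values."""
    return min(attempts, space_size) / space_size


def horizontal_attack_probability(space_size: int, users: int,
                                  guesses_per_user: int) -> float:
    """Chance of guessing the secret of at least one of `users` users,
    assuming each user's secret is chosen independently."""
    per_user = guess_success_probability(space_size, guesses_per_user)
    return 1.0 - (1.0 - per_user) ** users


coin_two_tries = guess_success_probability(2, 2)              # certain success
word_ten_tries = guess_success_probability(100_000, 10)       # 1 in 10,000
word_exhaustive = guess_success_probability(100_000, 100_000) # certain success
horizontal = horizontal_attack_probability(100_000, 50_000, 2)
```

Under these assumptions the horizontal attack succeeds against at least one user with a probability of roughly 63%, which is consistent with the better-than-even chance stated above.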
A way to make guessing, dictionary and like attacks unprofitable in the context of knowledge-based user authentication is to ensure that the authentication information for each user is selected from a high entropy information space. For example, in the case of the common alphanumeric password, a naïve analysis is as follows: if a 6 character alphanumeric password with 40 bits of entropy (log2(96^6)≈40; assuming 96 alphanumeric characters on a standard English language keyboard) is insufficient, then require of the user an 8 character alphanumeric password with 53 bits of entropy (log2(96^8)≈53). If that is still insufficient, simply require a 10 character alphanumeric password with 66 bits of entropy (log2(96^10)≈66), and so on.
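The naïve analysis can be expressed as a one-line formula: a password of n characters drawn uniformly and independently from an alphabet of k characters has n × log2(k) bits of entropy. The following minimal sketch assumes the 96-character alphabet used in the text and truly random character selection; `password_entropy_bits` is an illustrative name only.

```python
import math


def password_entropy_bits(alphabet_size: int, length: int) -> float:
    """Entropy, in bits, of a password whose characters are each chosen
    uniformly and independently from an alphabet of alphabet_size characters.

    Equivalent to log2(alphabet_size ** length).
    """
    return length * math.log2(alphabet_size)


# Rounded entropy for the three password lengths discussed in the text.
entropy_by_length = {
    length: round(password_entropy_bits(96, length)) for length in (6, 8, 10)
}
```

This figure is an upper bound that holds only for random passwords; it does not describe passwords that users choose for themselves.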
A problem in practice is that a random 8 character alphanumeric password typically looks something like: bX#zOk%h. That is, it looks random (without pattern). If a user is permitted to choose their own password, it is uncommon for their choice to be random, at least in part because random passwords are typically hard to remember. A more common choice is one or more natural language (e.g., English language) words and simple (i.e., low entropy) variations thereon. Even if the user has a vocabulary of one hundred thousand words, the entropy of the password is much reduced compared to the random case and the simple “more characters gives more entropy” analysis does not apply. If a user is not permitted to choose their own password, it is common for the user to write down the password, thus undermining knowledge-based authentication in another, and potentially even more damaging, way. In short, human memory quickly becomes the limiting factor in knowledge-based user authentication.
For at least some people, remembering images is easier than remembering text. Conventional knowledge-based user authentication centered around images is conveniently grouped into two categories: recall-based and recognition-based. In image recall-based user authentication, a user in some way recreates an image that they previously created. In image recognition-based user authentication, a user selects an image or sub-image that they previously selected.
Conventional knowledge-based user authentication centered around images has drawbacks. Image recall-based user authentication typically requires precise recall of images, which is not necessarily an improvement over precise recall of text. Image recognition-based user authentication typically takes a relatively long time compared to entering a text password, especially if it seeks to achieve parity in terms of authentication information entropy by requiring a high number of image recognitions. Each image recognition typically contributes at most a single bit of authentication information entropy, e.g., “recognize” or “don't-recognize” an image in a set of images. Conventional knowledge-based user authentication centered around images also typically employs a graphical input mechanism via a graphical output device, resulting in authentication information being exposed on an output device and thus an additional avenue of compromise by an attacker. In addition, conventional knowledge-based user authentication centered around images is typically incompatible with existing text-based user authentication.
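To illustrate why entropy parity makes image recognition slow, the sketch below estimates how many recognitions are needed to match a random text password. It assumes the at-most-one-bit-per-recognition figure given above and the 96-character alphabet from the earlier password example; `recognitions_for_parity` is a hypothetical helper, not part of any described system.

```python
import math


def recognitions_for_parity(alphabet_size: int, password_length: int,
                            bits_per_recognition: float = 1.0) -> int:
    """Minimum number of image recognitions whose combined entropy matches
    that of a random password of the given length and alphabet.

    Assumption: each recognition contributes bits_per_recognition bits,
    which the text gives as at most one bit.
    """
    target_bits = password_length * math.log2(alphabet_size)
    return math.ceil(target_bits / bits_per_recognition)


parity_with_8_chars = recognitions_for_parity(96, 8)
```

Under these assumptions, matching a random 8 character password takes 53 separate image recognitions.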
There is a need in the art for a system and method of knowledge-based user authentication that helps users select and remember authentication information from high entropy information spaces. Where the system and method employ images as memory aids, they should do so, where possible, without the drawbacks associated with conventional knowledge-based user authentication centered around images. Ideally, such a system and method would complement and enhance conventional text password-based user authentication rather than replacing it, so as to retain, where possible, the benefits of conventional text password-based user authentication that have made it so popular.