1. Field of the Invention
The present invention generally relates to biometric identification, and more particularly to biometric identification of users of a keyboard system, in which a user is identified by characteristics of his/her inputting of data.
2. Background Information
People have long known about muscle-memory, and it is known that people have unique “typing” styles. In World War II, the recognized sending style of a telegrapher was called the “Fist of the Sender.” Experienced Morse code operators could recognize each other by their unique styles and this was exploited to ensure message authenticity. Muscle-memory and unique typing patterns are real.
For more than twenty years, various people have tried to develop a way to recognize these unique patterns in an effort to apply them to computer security. Dr. James Young and Robert Hammon of SRI International conducted significant research, and were granted U.S. Pat. No. 4,805,222 in 1989. The technology of that patent has not been implemented. The inventors believe that the reason is that people only have these unique typing patterns under certain, well-defined circumstances. In order to find these patterns, one must understand the circumstances under which they occur. Dr. Young and Mr. Hammon looked for “global” patterns. In other words, they expected that if people simply “typed” there would be distinct and consistent patterns. The present inventors assert that these do not exist. Hence, these algorithms, while finding some effects, miss the main point and fail to provide an accurate and useable metric. Exemplary is the concept of “continuous verification.” Here, Dr. Young and Mr. Hammon track a person's every keystroke (and presumptively mouse-movement) all the time. Then, each keystroke and movement is compared to the stored “template” or signature to “continuously assure” the subject's identity. In order to function at all, there must be a pattern to check. Patterns are not present in random keyboard use. These unique and predictable patterns only develop with repetition and the development of memory. Further, certain patterns are more “learned” and reliable than others are. Rather than people having a “pattern,” people have a series of mini-patterns that vary in quality interspersed among various quantities of “noise.” The Young patent also includes evaluating the pressure applied to a key, which is not ascertainable from a standard computer keyboard.
Many verification technologies, indeed all prior art attempting to utilize keystroke information, make the assumption that people “have” patterns and that it is just a question of looking for them somehow. There have been numerous different methods of searching for these patterns proposed, from statistics to “neural networks.” Generally, a subject is asked to type, or key, a certain phrase or key sequences into a system some number of times. Then, using these samples, the prior art “looks for” or “learns” the pattern, based on whatever data was in the samples given. Nowhere is there an understanding of what constitutes a “good” sample. This is a key flaw. The present inventors assert that patterns only develop over time as people commit the sequence to memory and develop stable “muscle-memories.”
In order to search for, and find, good patterns to “learn,” you must know in advance what “good” patterns look like. In other words, an advance metric of quality is required. This advance knowledge is lacking in the prior art. The result is a situation where a subject can enter the key sequences using any combination of patterns, fast to slow, smooth to jumpy, and the system will “accept” these as the user's “pattern.” Clearly, this is not really a stable pattern and any conclusions drawn from it will be faulty. Indeed, it is possible to introduce so much variation into the samples that no valid conclusions can be drawn, and a signature/template drawn from such a widely-varied input will have little-to-no ability to actually discriminate between people.
The present inventors correct this fatal flaw by defining, in advance, measures for pattern “goodness” or quality. Subjects are required to enter sample information repeatedly, until they exhibit sound, solid, and well-learned patterns. In the present invention, these patterns are represented by mini-rhythms, which are developed by the subject as they “learn” to type or key their information. These mini-rhythms are very stable, and are evidence that the subject has successfully “learned.” Note, in the present invention it is the subject who learns, not the system. In all prior art, it was the system that “learned.” In the present invention, the system establishes metrics for successful learning and then causes the subject to meet these standards. These standards can be set at various levels, from low-to-high, depending on the security need. In this manner, the learning effort required of the subject is fully commensurate with the required security of the system.
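The enrollment idea described above can be sketched as follows. This is a minimal illustration only: the text does not specify which statistical measure defines pattern quality, so the coefficient of variation, the `max_cv` threshold, and the function names `stable_positions` and `enrollment_complete` are all assumptions introduced for the example.

```python
from statistics import mean, stdev

def stable_positions(samples, max_cv=0.10):
    """Return indices of timing positions that are consistent across samples.

    `samples` is a list of repeated entries of the same phrase; each entry is
    a list of timing values (ms), one per position. A position counts as
    stable when its coefficient of variation is at most `max_cv`.
    The metric and the threshold are hypothetical, not taken from the text.
    """
    positions = []
    for index, column in enumerate(zip(*samples)):
        average = mean(column)
        if average > 0 and stdev(column) / average <= max_cv:
            positions.append(index)
    return positions

def enrollment_complete(samples, required, max_cv=0.10):
    """True once enough stable positions exist for the chosen security level."""
    if len(samples) < 2:  # spread cannot be measured from a single entry
        return False
    return len(stable_positions(samples, max_cv)) >= required

# Three entries of a four-interval phrase; positions 0 and 2 are consistent.
entries = [
    [100, 200, 150, 80],
    [102, 260, 149, 120],
    [98, 150, 151, 60],
]
print(stable_positions(entries))        # positions judged stable
print(enrollment_complete(entries, 2))  # low-security requirement
print(enrollment_complete(entries, 3))  # higher requirement: keep practicing
```

Raising `required` models a higher security setting: the subject must keep entering samples until more positions become consistent.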
Many verification technologies use data created by entering a string of characters. They utilize the entire string thus created. The technology of the present invention does not use the entire string of information, and this is a fundamental concept of the present invention. The technology of the invention teaches that there are only certain small areas in the keystroke pattern that are reliable enough for biometric testing, especially if the use is to protect with any certainty in a high-security use. The fact is that most of the keystroke timing data is too noisy and volatile for use. Most of the time people do NOT have unique rhythms.
Therefore, any testing for these rhythms, no matter how statistically sophisticated, will fall short in several material ways.
U.S. Pat. No. 4,621,334 to Garcia exemplifies the prior art approach to keyboard style recognition and security. First, Garcia has no concept of what constitutes a good pattern in the first place. Garcia simply asks a subject to type a number of samples and from that derives a pattern, whether a pattern truly exists or not. Garcia just uses any samples presented. In measuring keystroke data, Garcia uses flight time only, not the dwell time. The “database” or “electronic signature” is recorded by typing the individual's name a number of times. The time delays between each successive keystroke are recorded. These time delays include the spacing between every letter and between every space in the name of the individual. This is different from the present system, which may also utilize dwell time, and utilizes only the most statistically significant portion of the entered information. Further, Garcia will create this electronic signature from samples that are complete noise. The present invention avoids these pitfalls.
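For readers unfamiliar with the two timing measures contrasted above, the following sketch shows one common way to derive them from raw key events. The event format (key, press time, release time) and the function names are assumptions made for illustration; flight time is taken here as the gap from one key's release to the next key's press, although other definitions (press to press) are also in use.

```python
from typing import List, Tuple

# Hypothetical record format: (key, press_time_ms, release_time_ms)
KeyEvent = Tuple[str, float, float]

def dwell_times(events: List[KeyEvent]) -> List[float]:
    """Dwell time: how long each key is held down."""
    return [release - press for _, press, release in events]

def flight_times(events: List[KeyEvent]) -> List[float]:
    """Flight time: gap between releasing one key and pressing the next."""
    return [nxt[1] - cur[2] for cur, nxt in zip(events, events[1:])]

sample = [("p", 0.0, 95.0), ("i", 140.0, 220.0), ("n", 300.0, 390.0)]
print(dwell_times(sample))   # [95.0, 80.0, 90.0]
print(flight_times(sample))  # [45.0, 80.0]
```

A phrase of n keystrokes yields n dwell values and n - 1 flight values, so using both roughly doubles the number of timing variables available for analysis.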
Garcia utilizes a subject's name as the test phrase. Names can be any number of characters, so this feature of Garcia indicates that phrase length is not an issue. Garcia states “(i)n practice, it has been found that the best data is derived when an individual types his own name. Apparently, the degree of familiarity and the emotional involvement of the input contribute to the stability and uniqueness of the electronic signature.” The present inventors assert that rhythms develop only after successful learning has taken place. This is the explanation for why Garcia found better results using the subject's name; it is familiar and often typed and therefore often more learned. However, using the subject's name is no guarantee of a good signature. Some subjects have not typed their name often. In addition, some subjects may actively cheat or introduce purposeful variability. Using a familiar phrase will improve results in the Garcia method if the subject cooperates and if the subject has already typed his/her name often enough to generate a reasonably consistent pattern. However, this is no substitute for establishing standards of signature quality in the first place, as is done in the present system. In addition, the present inventors assert that the phrase length is a major issue. In the present invention, a longer phrase length results in a greater number of qualified mini-rhythms when successful learning has taken place. The greater the number of qualified mini-rhythms present, the greater the system's ability to discriminate between the “real” subject and an imposter. Additionally, the greater the number of qualified mini-rhythms present, the easier it is for the real user to meet the enrollment standards (measures of signature goodness) required when security standards are set at a high level. 
In the present system, a subject could use a phrase as short as four characters, like a PIN, and the system would provide significant discrimination for a low-value transaction. In a high-value or mission-critical high-security environment, a phrase length of 15-20 characters or more might be selected. The longer phrase would result in greater discrimination, and much less chance of an imposter successfully penetrating the system.
In Garcia, a single number test is utilized. After repeated entries, a mean time is found for each flight time variable. The resulting means, or averages, are then themselves averaged. The result is a single deterministic number. This is used as the pass/fail threshold. Quoting from Garcia, “(i)f an entry by an authorized individual has a Mahalanobis distance function value of 50 or less, he can be immediately authorized. In contrast, if the Mahalanobis distance function value is greater than 100, he should be rejected.” This means that the entire sequence from the first letter to last is being used. Unfortunately, there is no consistent pattern across the entire phrase. Most keystroke typing patterns in the sequence are too variable for practical use. Therefore, Garcia teaches away from the selective use of only the most statistically relevant values, which is one of the core tenets of the technology of the invention. Further, the concept of a “single number,” “pass/fail” system is inherently weak as it introduces the certainty of mistakes. A real subject will often be “rejected” while an imposter might often be “accepted.” When you consider the averaging of averages of data that is largely invalid in the first place, it is clear errors are the norm under Garcia's method.
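To make the single-number test concrete, the sketch below computes a distance over every timing variable in the phrase and applies the two thresholds quoted above. It is not Garcia's implementation: the diagonal-covariance simplification of the Mahalanobis distance, the function names, and the label used for values between 50 and 100 (which the quoted passage does not address) are assumptions for illustration.

```python
from math import sqrt

def mahalanobis_diagonal(entry, means, variances):
    """Mahalanobis distance assuming uncorrelated variables (diagonal covariance).

    Every timing variable in the phrase contributes, stable or not.
    """
    return sqrt(sum((x - m) ** 2 / v for x, m, v in zip(entry, means, variances)))

def threshold_decision(distance, accept_at=50.0, reject_above=100.0):
    """Apply the quoted thresholds; the middle label is a placeholder."""
    if distance <= accept_at:
        return "authorize"
    if distance > reject_above:
        return "reject"
    return "undetermined"

d = mahalanobis_diagonal([110.0, 190.0], [100.0, 200.0], [25.0, 100.0])
print(round(d, 3))             # 2.236
print(threshold_decision(d))   # authorize
```

Because all variables are folded into one number, a few well-matched stable intervals can be swamped by many noisy ones, which is the weakness the text attributes to this approach.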
By contrast, the present invention 1) only allows good samples; 2) identifies the stable mini-rhythm portions of the samples; and 3) returns a “risk measure” via a number on a granular scale based on the number of mini-rhythms in the sample. This risk measure is more sensitive and useful than the Pass/Fail measure of Garcia, and other similar prior art. The present technology eliminates the fundamental errors found in Garcia and the prior art by accurately measuring just the “real” pattern against a granular scale. In addition, the present invention provides more, and more useful, information than is contained in a single Pass/Fail metric. The present invention identifies “transactions at risk” and provides detail on the “degree” of risk. These metrics may trigger alerts, silent or overt, or trigger other events like additional system challenges or phone calls on a selective basis depending on the “degree.” For instance, in the present invention, a subject may have twenty mini-rhythms detected and recorded for a particular phrase. A subject may, under normal conditions, be expected to exhibit 18-20 of his/her mini-rhythms on any given verification attempt. An administrator using the present system might set different actions to trigger depending on the range: 0-10, 11-15, 16-17, and 18-20 for instance. This granular “risk” output is unique to the present system and adds significant value.
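The range-based triggering in the example above might be configured along the following lines. This is a sketch under stated assumptions: the action names are hypothetical, a count of exactly 18 is assigned to the top (normal) band, and the mapping from bands to actions is an illustrative administrator choice rather than anything prescribed by the text.

```python
from bisect import bisect_right

# Lower bound of each band of matched mini-rhythms (out of 20 enrolled).
BAND_STARTS = [0, 11, 16, 18]
# Hypothetical actions, one per band, from highest risk to normal.
BAND_ACTIONS = ["block_and_alert", "additional_challenge", "silent_alert", "allow"]

def risk_action(matched: int, total: int = 20) -> str:
    """Map the number of matched mini-rhythms to an administrator-defined action."""
    if not 0 <= matched <= total:
        raise ValueError("matched must be between 0 and total")
    # bisect_right finds the band whose lower bound is the largest one <= matched.
    return BAND_ACTIONS[bisect_right(BAND_STARTS, matched) - 1]

for count in (5, 13, 17, 19):
    print(count, risk_action(count))
```

Keeping the band boundaries in a table lets an administrator tighten or relax the response without changing the matching logic.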
Garcia asserts that it is possible, using his technique, to recognize random typing patterns. This is impossible. This is because the “patterns” he is looking for do not in fact exist. This is an insurmountable problem that is fundamental to Garcia and other prior art. Garcia believes that humans have patterns that exist in all typing and all one has to do is find them statistically. It has been assumed that the pattern recognition database of Garcia will be generated from multiple entries of a unique password, and access to the system is obtained by entering the same password. This approach can be referred to as simple discrimination.
A more intricate approach can be implemented and is characterized as complex discrimination, or complex signature. Complex discrimination is based not on a typing pattern derived from a specific message, but on a mathematical model that predicts a priori a person's typing pattern for any given message, even if it never has been typed before. In order to utilize the complex signature, it would be necessary for the individual to type out at least 1,000 of the most common words in the English language. The words would be presented in a series of constrained phrases, typically generated randomly by a parser program that assures that verbs, nouns, adjectives, and prepositions are in correct, grammatical order. Over the years, people have tried this method, and have found it to be unworkable. Some correlations may be found which are statistically valid. However, these do not work out in practice as real users have far too many variations in typing timing. In other words, the variations are so numerous that they really have no patterns in the first place, except in certain non-typical circumstances. The result is frustration for the real user from frequent rejection, and poor security value as the imposter is often accepted. People do not have “universal,” “all-the-time” patterns. Rather people have “some” patterns, “some” of the time. The present invention illuminates what these patterns are, specifies when they exist, and recognizes them when it sees them. The present invention is the first system that formally does any of these things.
Smith, U.S. Pat. No. 5,721,765, is for a personal identification number system. Although Smith uses keystrokes and groups the keystrokes together, Smith is not similar to the technology of the present invention. What Smith does is 1) break the PIN up into “bank-assigned” groups; and 2) tell the subject to enter the numbers in these groups. Essentially this is making the “group pauses” a part of the password. Thus, the true subject must have “secret knowledge” and his style of entering the numbers is not a biometric per se. A password is a form of secret knowledge in which no one but the subject supposedly knows the password. The disadvantages of using a password security system are the reason a security system based on a biometric is desirable. A password can be stolen, copied, guessed, or coerced. A true biometric resides within the individual and cannot be lost, copied, or stolen. Clearly, anyone knowing where the “pauses” go could successfully masquerade as the real user. In essence, Smith has described a variation on a password system and does not have a biometric at all.
Brown et al., U.S. Pat. No. 5,557,686, is a method and apparatus for verification of a computer user's identification based on keystroke characteristics. This patent is very similar to the Smith patent. The “purifying” mentioned is to eliminate “outliers” during sample collection, with the idea being that you only want a “signature” based on good samples. This is a good idea. You do want to have a good signature. However, Brown does not define what constitutes “good.” Brown takes all the samples given, with whatever variation is present, and tries to find the patterns whether there or not. The concept of “outliers” therefore just means samples that are “really” different from some overall averages. Again the entire sample set may be bogus and lacking in real patterns either because the real user has not yet developed solid patterns, or because of intentional sabotage. Brown's system is still “garbage in/garbage out” and, though improved via the provision for outliers, still devoid of the concept of quantifiable standards of sample quality in the development of signatures. Brown assumes people “have” patterns when they do not. People develop patterns. Brown does not understand this, and like all similar prior art, is fatally flawed.
In addition, Brown looks at the entire “signal.” This means from the first key-down to the last key-up, as a gestalt sample. A neural network then looks for patterns in the signal. No matter how well a neural network, or in fact any recognition technology, works, it is fundamentally dependent on having good start patterns. Brown does not specify what good is and therefore cannot recognize it, regardless of the sophistication of his techniques. This fatal flaw renders the system practically useless as a human recognition system, and certainly useless in any high-security environment.
The technology of the present invention also eliminates outliers, but only for the subject's convenience. Outliers (atypical samples) “mask” the mini-rhythms sought by the present technology, which may otherwise be developing nicely, and cause the subject to enter more samples than would be necessary if the outliers were eliminated. The present technology defines “goodness” based on defined statistical qualities, not overall signal matches. No amount of filtering by the neural net of Brown can compensate for the problem of perceiving good samples within a large volume of bad samples, and Brown actually teaches away from identifying the mini-rhythms from within the sample. You must have good samples to start with, or everything done thereafter is useless. Brown's neural network looks for patterns in the noise, whether there are real patterns present or not. The present invention insists on good start samples, articulates where the real patterns are, and finds them.
Brown discusses a “threshold of similarity,” but again Brown considers the entire signal as an entity. This includes much noise, to the point of making the technology useless for any high security application. For any application where the real user is routinely able to gain access, the informed intruder would also get in easily. In other words, a determined, skilled, and informed imposter will be able to defeat or “spoof” the security system of Brown.
This is not so with the mini-rhythms approach of the present invention. The mini-rhythms are highly reliable because the real user almost always does them, whether being tested or not. The mini-rhythms are scattered throughout the selected sample phrase. Even the real user does not know where they are. A determined, skilled, and informed imposter will have to hit every variable exactly (something even the real user cannot do) to be sure of hitting the mini-rhythms buried in there somewhere. This is another fundamental difference between Brown and the present invention. The present invention overcomes the fatal problems present in Brown's strategy and represents a fundamentally different approach, and yields a correct solution 100% of the time.
Primeaux et al., U.S. Pat. No. 6,334,121, is for a “usage pattern based user authenticator.” Primeaux strictly looks at usage patterns, which is similar to what banks and credit card companies have been doing for years. If a centenarian widow goes to a stereo store to buy a $5,000 stereo, she will get a call to verify that it was really her using her card, because she does not usually buy $5,000 stereos, but credit card thieves do.
Cho et al., U.S. Pat. No. 6,151,593, is for an apparatus for authenticating an individual based on a typing pattern and using a neural network system to analyze the typing pattern. Cho does not use the mini-rhythm concept. Instead, Cho uses the neural net to look at the gestalt pattern of the typing sample. This approach is quite a different concept from the mini-rhythms technology. Essentially, the neural network is a “pattern finder” and a method of discerning patterns within a noisy input set. As such, it depends on the assumption that there are patterns in the first place. Unfortunately, this is not always the case. Indeed, it is not even usually the case. The present inventors teach that patterns only develop after successful learning has taken place. People do not have “native” patterns. Further, people learn at different rates. Cho simply accepts whatever samples are given without discrimination, relying on the neural network to sort out the “noise.” This will only work in those situations where 1) the subject has already learned stable patterns; 2) the subject is actively cooperating; and 3) the subject displays these patterns in a consistent enough manner to “break through” the noise that is also present. This results in a system that will somewhat work, some of the time. In contrast, the present invention overcomes these fatal and systemic problems and generates signatures and recognition systems that work 100% of the time.
Cho is also a Pass/Fail system. The technology of the present invention looks at mini-rhythms as being “warning bells.” If there are ten mini-rhythms in the sample text, for example, it might be “normal” for the real user to miss 0, 1 or 2. That level of success might be called “Green.” As more mini-rhythms are missed, and more “bells” go off, the certainty develops that there is a “real-time” problem. The “real-time” is important. Actually it is more important (in many cases) to “catch” the intruder than to “stop” them, which a system running mini-rhythms can accomplish.
Cho et al. is subject to all the criticisms given earlier with respect to the similar systems of Brown and Garcia. The concept of “garbage in/garbage out” still applies. The flaws are fundamental and fatal. The solutions to these problems are significant components of the present invention.
Kroll, U.S. Pat. No. 6,062,174, is an “ATM signature security system.” Kroll is a patent directed strictly to ATMs. Like other prior art, Kroll looks at the entire sample in combination with the type of ATM to determine acceptability. It does not require the subject to develop mini-rhythms, and does not analyze the sample for qualified mini-rhythms.
Kroll, U.S. Pat. No. 6,405,922, is for a “Keyboard signature security system.” It is very similar to the Kroll '174 patent, and similarly does not use mini-rhythms. Kroll '922 discusses subject ATM usage patterns that are germane to ATM devices only and are not a part of keystroke recognition at all. Kroll does mention both Flight Time and Dwell, where other prior art generally looks only at Flight Time. However, it does not utilize mini-rhythms. Also absent are measures of signature quality. Kroll retains all the problems of the prior art mentioned with respect to Garcia, Brown, Cho, and others. Kroll does add the idea of keeping track of “which” keyboard the subject is using and “adjusting” for differences between keyboards. While this is perhaps useful information for ATM devices in particular, it does not advance the state of the art in keyboard recognition via typing patterns. Kroll is based on fundamentally flawed techniques for keystroke recognition. The other additions made by the Kroll method, such as “device location,” target ATM devices specifically, and do not address the typing recognition problem at all.