User authentication is one of the most important topics in computer and network security research. Modern security mechanisms that aim to prevent system misuse and data leakage, and to enforce security policies, are fundamentally built upon a system of trust: trust that the individuals who supplied the credentials are the same ones to whom those credentials were assigned, and that the users are not overstepping the bounds of the roles their credentials represent. These are the fundamental problems of insider threat detection.
While user management is a well-understood discipline, it nevertheless remains the main vector of actual system penetrations observed in practice. System-level attack vectors such as software exploits, while harder to prevent, are comparatively rare; attacking the vulnerabilities inherent in the human element of the network has proven to be a more reliable method of entry. These attacks do not leverage network policy shortcomings, nor do they exploit unknown software vulnerabilities; they target the lowest-hanging fruit of the network, which is often the non-security-conscious user. Spear-phishing attacks, such as those used in the recent New York Times hacking incident, remain a serious threat. Further, attackers do not always originate from external sources; insider threats, where co-workers steal or otherwise misuse each other's credentials to obtain information that they should not have access to, are equally troubling.
Typically, once a user's credentials have been compromised, the incident evolves beyond the stage of defense penetration into one of persistent threat, which is arguably much more difficult to detect and contain. It is within this stage that the most damage is often done to the system: backdoors are installed, files are exfiltrated, additional user accounts are compromised, and more system resources are hijacked for use in additional exploits. This stage is often very dynamic and can last anywhere from minutes to years. How far the attacker reaches in this stage often determines whether the incident is a benign lesson in security discipline or a catastrophic loss to the enterprise, costing the target entity considerable financial and reputational damage.
Behavioral biometrics can authenticate users in a manner that is non-intrusive to them. Several approaches have been proposed to authenticate users at the beginning of a user session. Most were based on modeling mouse or keystroke dynamics, either alone [1], [2], [3], [4] or in conjunction with another authentication mechanism [5]. Other work on behavioral biometrics investigated the attribution of a user session to a given user once the session is completed. Goldring modeled process information extracted from the process table and successive window titles to profile user behavior during an entire user session [6]. Several studies modeled sequences of user command data [7], [8], [9].
However, little work has focused on behavioral biometrics as active authentication mechanisms throughout the entire user session. Keystroke and mouse dynamics have emerged as the main continuous authentication approaches, as they do not require any specialized equipment or additional hardware sensors. They verify computer users periodically on the basis of typing or mouse use styles. Various modeling approaches have been proposed with varying accuracy results.
Messerman et al. presented a non-intrusive continuous authentication mechanism based on free-text keystroke dynamics [10]. They used two-class modeling for profiling user keystroke dynamics behavior. Authentication of the target user is performed by scoring the user's activity against a constant number of users, but not the entire user space, to improve performance. The experiments and results were limited to the use of one application only, namely Webmail. Most users multi-task, and therefore it is important to model mouse and keystroke dynamics across various applications.
In [11], Shen, Cai and Guan proposed a continuous authentication mechanism based solely on user mouse dynamics patterns. They distinguished between two types of mouse behavior: frequent segments of mouse dynamics, which they referred to as patterns, and the less frequent segments, referred to as holistic behavior. Patterns are classified as “micro-habitual” or “task-intended”. The former are patterns that characterize a user's unconscious habits, such as repeatedly refreshing a screen with no real need or purpose. The task-intended patterns describe user mouse actions that are dependent on the application being used, such as opening a document from that application. They found that “patterns” are more descriptive of user behavior, as they are stable features across user sessions. The same patterns emerged as discriminative features. All one-class classifiers trained using mouse activity patterns as features performed better than classifiers modeling the user's holistic behavior.
Pusara and Brodley built C5.0 decision trees on the basis of users' mouse movements within a time window of configurable size, and used the models to re-authenticate users [12]. The data was collected in a free environment, i.e., from the users' own computers, but the user sample was too limited to report generalizable results. The mouse-movement models, which were trained using data from all 11 users, achieved an average false-acceptance rate (FAR) of 1.75% and an average false-rejection rate (FRR) of 0.43%, but verification took up to 15 minutes depending on the window size.
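For readers unfamiliar with the two error rates quoted above, the following minimal sketch shows how FAR and FRR are conventionally computed from verification scores. It is an illustration only, not the evaluation code of [12]: the function name `far_frr`, the sample scores, the threshold, and the rule "accept when the score is at or above the threshold" are all assumptions made for this example.

```python
from typing import Sequence, Tuple


def far_frr(
    genuine_scores: Sequence[float],
    impostor_scores: Sequence[float],
    threshold: float,
) -> Tuple[float, float]:
    """Return (FAR, FRR) for an accept-if-score >= threshold decision rule.

    FAR: fraction of impostor attempts that are wrongly accepted.
    FRR: fraction of genuine attempts that are wrongly rejected.
    """
    if not genuine_scores or not impostor_scores:
        raise ValueError("both score lists must be non-empty")

    # An impostor is falsely accepted when its score reaches the threshold.
    false_accepts = sum(1 for s in impostor_scores if s >= threshold)
    # A genuine user is falsely rejected when the score falls below it.
    false_rejects = sum(1 for s in genuine_scores if s < threshold)

    far = false_accepts / len(impostor_scores)
    frr = false_rejects / len(genuine_scores)
    return far, frr


# Hypothetical similarity scores (higher means "more like the enrolled user").
genuine = [0.90, 0.80, 0.75, 0.40]
impostor = [0.10, 0.30, 0.60, 0.20, 0.05]

far, frr = far_frr(genuine, impostor, threshold=0.5)
print(f"FAR = {far:.2%}, FRR = {frr:.2%}")
```

Raising the threshold in this sketch lowers FAR at the cost of a higher FRR, and lowering it does the reverse, which is why the two rates are normally reported together.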
Jagadeesan and Hsiao reported that combining keystroke and mouse dynamics reduced accuracy compared with using only one of the two approaches for continuous authentication [13]. Their experiments, however, involved only a limited set of users (five users in each experiment).
Although some work has reported promising results, authentication using mouse dynamics and free-text keystroke dynamics remains immature. These approaches were tested within limited and pre-defined settings (working on a specific task or interacting with one application), and therefore have not dealt with the intrinsic behavioral variability that arises as the user interacts with various applications or multi-tasks. We do not run our experiments in a controlled environment dependent on a specific software application or hardware device. Instead, we monitor users' high-level actions as they interact with their own computers and perform their daily business activities. Furthermore, our approach is less vulnerable to changes in user behavior due to physiological factors such as pain or injury, which might affect keyboard or mouse dynamics.