Embodiments of the present invention relate to computer security, and more particularly relate to techniques for fraud monitoring and detection using application fingerprinting.
With the growth of the Internet, an ever-increasing number of businesses and individuals are conducting transactions online. Many transactions, such as banking-related transactions, shopping-related transactions, and the like, require sensitive information (e.g., authentication information, financial information, etc.) to be transmitted between, and stored on, end-user and service provider computers. This widespread use of sensitive information for online transactions has led to the proliferation of identity theft, various types of hacking attacks, and online fraud.
In a typical transaction between a user and an online service provider, the user submits one or more requests comprising user-entered data to the service provider's application/system. If the transaction is in a “pre-authentication” phase (i.e., the user has not yet been authenticated), the one or more requests may include authentication information, such as a username, password, personal identification number (PIN), and/or the like, that is used by the service provider to verify the user's identity. If the transaction is in a “post-authentication” phase (i.e., the user has already been authenticated), the one or more requests may include transaction information, such as a credit card number, address, and/or other data, that is used to carry out the transaction. FIG. 1 illustrates an exemplary system 10 that may be used for submitting user requests to a service provider application. In this example, the user of system 10 enters authentication information (e.g., a user ID and password) via keyboard 12 into a user interface 18. User interface 18 is displayed on a display 16 of user computer device 14.
Prior art systems implement a number of safeguards for protecting the information that is transmitted from a user computer to a service provider application (typically running on a remote server). For example, security protocols such as the Secure Sockets Layer (SSL) protocol are commonly layered on top of the widely used TCP/IP communication protocols to allow secure data transfer using encrypted data streams. SSL offers encryption, source authentication, and data integrity as a means for protecting information exchanged over insecure, public networks. Accordingly, many service provider servers and applications use SSL, or similar security protocols, to exchange data between remote servers and local user systems.
Despite these known precautions, a user's sensitive information (e.g., authentication information, transaction information, etc.) remains vulnerable between its entry by the user and its encryption prior to remote transmission. In addition, sensitive information sent from a service provider application is vulnerable during the period after its decryption at a user's computer and until its display. This information can be surreptitiously captured by fraudsters/hackers in a number of ways. For example, cookie hijackers may be used to copy sensitive information from web browser cookies. Further, keyboard loggers and mouse click loggers may be used to intercept and copy mouse clicks and/or depressed keys after user entry but before processing by a web browser or other software.
Even graphical user interfaces that represent on-screen keypads and keyboards with selectable graphics for user entry (instead of, or in addition to, providing fields for text entry) are vulnerable to mouse click loggers, screen capture loggers, and the like. FIGS. 1-3 illustrate prior art examples of such interfaces. As shown, each alphanumeric character in the graphical interfaces is represented by a unique graphical image (e.g., the pixels forming the number “1”). Screen capture loggers can use optical character recognition (OCR) technology to decipher characters selected by mouse clicks and the corresponding alphanumeric graphics in order to ascertain the actual alphanumeric text characters of a user's ID and/or password. In addition, sophisticated screen capture loggers can use checksum and/or size characteristics of the graphic images in order to ascertain the data item corresponding to a particular graphic image selected by a user's mouse click during data entry. In these ways, screen capture loggers can acquire the sensitive information even when the graphical user interface has, for example, rearranged the order of alphanumeric characters on the graphical keypad or keyboard.
Further, once a fraudster/hacker has successfully stolen the authentication credentials of a legitimate user and logged into a service provider application as the user, the fraudster/hacker is generally free to submit whatever data he/she desires to carry out fraudulent transactions or exploit application-level vulnerabilities in the application. For example, in a common type of online attack known as SQL injection, a fraudster/hacker can exploit faulty error-handling or other flaws in the input field processing of a web-based application to submit application data that contains malicious SQL statements. These embedded SQL statements are executed by the application's backend database, enabling the fraudster/hacker to modify, delete, and/or view any data in the system.
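By way of illustration only, the following minimal sketch shows how a flaw in input field processing may allow an embedded SQL fragment to change the meaning of a query. The table name, column names, function names, and sample data are hypothetical and are not drawn from any particular service provider application; an in-memory SQLite database stands in for the backend database.

```python
import sqlite3


def build_demo_db():
    """Create a throwaway in-memory database with hypothetical account data."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (owner TEXT, balance INTEGER)")
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)",
        [("alice", 100), ("bob", 250)],
    )
    return conn


def lookup_vulnerable(conn, owner):
    """Hypothetical flawed lookup: user input is concatenated into the SQL text."""
    query = "SELECT owner, balance FROM accounts WHERE owner = '" + owner + "'"
    return conn.execute(query).fetchall()


conn = build_demo_db()

# A benign submission returns only the matching row.
benign_rows = lookup_vulnerable(conn, "alice")

# A malicious submission closes the string literal and appends a condition
# that is always true, so every row in the table is returned.
malicious = "nobody' OR '1'='1"
leaked_rows = lookup_vulnerable(conn, malicious)

print(benign_rows)
print(leaked_rows)
```

In this sketch the malicious value only exposes data; comparable input against a different flawed query could modify or delete data instead.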
One known method for preventing SQL injection and other such data-driven attacks involves the use of predefined signatures. In this method, a system administrator or other entity will identify a list of data strings that are known to be malicious (such as SQL commands) and store signatures of those data strings in the system. In addition, a process will monitor the network streams between user computers and the service provider application (e.g., HTTP traffic) and compare the data transferred via those streams with the predefined signatures. If a particular piece of data originating from a user computer matches one of the predefined signatures, the submission is determined to be malicious or fraudulent.
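The signature-based method described above may be sketched as follows. This is a simplified illustration under stated assumptions, not the implementation of any particular prior art product: the signatures are assumed to be stored as regular expressions, the three patterns shown are hypothetical examples, and the monitored request is assumed to have already been parsed into a dictionary of field names and submitted values.

```python
import re

# Hypothetical predefined signatures maintained by an administrator.
SIGNATURES = [
    re.compile(r"\bunion\s+select\b", re.IGNORECASE),
    re.compile(r"\bdrop\s+table\b", re.IGNORECASE),
    re.compile(r"'\s*or\s*'1'\s*=\s*'1", re.IGNORECASE),
]


def matches_signature(submitted_value):
    """Return True if the value matches any predefined signature."""
    return any(sig.search(submitted_value) for sig in SIGNATURES)


def scan_request(fields):
    """Return the names of the request fields whose values match a signature.

    `fields` is assumed to map field names to the strings submitted by the user.
    """
    return [name for name, value in fields.items() if matches_signature(value)]


request = {"user": "alice", "comment": "x'; DROP TABLE accounts; --"}
flagged = scan_request(request)
print(flagged)
```

A submission would be treated as malicious or fraudulent whenever `scan_request` returns a non-empty list.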
Unfortunately, the above approach is problematic for several reasons. First, since this approach can only detect predefined data strings, it is likely that this approach will be unable to detect all of the different permutations of data that may be used in a data-driven attack. Second, the predefined list must be updated manually as new types of malicious data are discovered, leading to increased operation and maintenance costs. Third, this approach is unable to properly classify data that may or may not be fraudulent or malicious based on the context in which it is submitted. For example, a request to wire transfer a dollar amount in excess of $10,000 may be fraudulent if submitted from a user account that does not have a history of transferring such large amounts, but may not be fraudulent if submitted from a user account that is regularly used to transfer $10,000 or more.
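The first and third shortcomings may be illustrated with a short sketch. The single signature, the sample strings, and the `looks_unusual` helper are all hypothetical; in particular, the helper is only a toy comparison against an account's past transfer amounts, included to show what a context-free signature cannot express, and is not a description of any particular fraud detection technique.

```python
import re

# Hypothetical predefined signature for one known-malicious data string.
SIGNATURE = re.compile(r"\bdrop\s+table\b", re.IGNORECASE)


def signature_flags(value):
    """Return True if the submitted value matches the predefined signature."""
    return bool(SIGNATURE.search(value))


# Shortcoming 1: a permutation of the same attack evades the signature.
# Many SQL dialects treat an inline comment as a token separator, so the
# second string may be equivalent to the first when executed.
plain = "x'; DROP TABLE accounts; --"
obfuscated = "x'; DROP/**/TABLE accounts; --"
print(signature_flags(plain), signature_flags(obfuscated))


def looks_unusual(amount, past_amounts):
    """Toy context check: the amount exceeds every amount previously transferred."""
    return not past_amounts or amount > max(past_amounts)


# Shortcoming 3: the same submitted amount is suspicious for one account and
# routine for another, yet it matches no signature in either case.
amount = 10500
small_history = [200, 350]
large_history = [12000, 15000, 10000]
print(signature_flags(str(amount)))
print(looks_unusual(amount, small_history), looks_unusual(amount, large_history))
```

Because the signature examines only the submitted string, it yields the same result for both accounts, whereas the account history distinguishes them.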