There are many different entities (financial, business, government, charity, educational, individual, etc.) that may choose to have online presences implemented by computer systems coupled to a network, or by computer program code running on systems of other entities that are connected to the network. Since these online systems can be used to provide information, accept and forward information, facilitate transactions, and/or allow access to online resources, those entities have an interest in securing those systems so that authorized activities are allowed while unauthorized activities are prevented. The Internet and other online facilities are commonly used for financial, business, private, and other transactions that are preferably kept secure.
In a simple example, a bank may choose to provide its customers with online access to banking details and a facility to initiate transactions, such as funds transfers. Some of the illegitimate actions that unauthorized individuals or computer systems may attempt can be anticipated, such as improperly accessing the banking details, initiating unauthorized transactions, or modifying online resources for their own goals rather than those of the operator of the resources, for example by defacing an online presence, stealing money, goods, or information, or committing sabotage. Other illegitimate actions might be unexpected.
As explained herein, a common approach to providing this online presence is via a “website”. While users may consider a website a “place”, it is often a logical place only, in that it is referenced by a URI, while its actual physical location is not important and may indeed be distributed over multiple data centers or even virtual data centers in computing clouds. More precisely, a website typically comprises the user-interface aspects of an entity's network presence.
For example, a retailer might set up a server that has thereon software that can receive requests from a network and respond to those requests by returning content, accepting inputs and/or performing some actions in response to requests. Some of that content returned can be in the form of web pages viewable by client devices in response to requests for those web pages from those client devices. Client devices might include computers, telephones, smart handheld devices, other computing devices, etc. These client devices might be used by the retailer's customers, potential customers, visitors, suppliers, or partners.
Some web pages are static and generated in advance of any request, such as a page explaining a company's history, while others are dynamic and generated on the fly, such as a web page showing a user's current shopping cart contents or a page generated for a product that a user just requested. Thus, the server might have access to data systems usable for generating web pages and other content (video, music, etc.). The server might comprise multiple machines at different locations on the network, which may or may not serve different sets of pages. Thus, the term “website” can refer to the client-side view of a collection of servers, content, operations and the like, while end users might view a website as a collection of pages that is operated by an entity with a consistent approach and that can be viewed in various ways. As used herein, “website” might refer to the content, the servers, the operators of the servers, and/or the interaction with client devices, etc., depending on context.
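By way of illustration only, the distinction between static and dynamic pages might be sketched as follows. The names `STATIC_PAGES`, `render_cart`, and `respond`, the paths, and the page markup are hypothetical and are not drawn from any particular server implementation.

```python
from html import escape
from typing import List

# Hypothetical pre-generated content, stored before any request arrives.
STATIC_PAGES = {
    "/history": "<html><body><h1>Our history</h1></body></html>",
}


def render_cart(items: List[str]) -> str:
    """Build a shopping-cart page on the fly from the user's current items."""
    rows = "".join(f"<li>{escape(item)}</li>" for item in items)
    return f"<html><body><ul>{rows}</ul></body></html>"


def respond(path: str, cart_items: List[str]) -> str:
    """Return a static page if one exists, else generate a dynamic one."""
    if path in STATIC_PAGES:
        return STATIC_PAGES[path]       # static: returned unchanged
    if path == "/cart":
        return render_cart(cart_items)  # dynamic: depends on per-user data
    return "<html><body>Not found</body></html>"
```

In this sketch the static page is identical for every requester, whereas the cart page depends on data that exists only at request time.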
As website developers have devised defensive methods to detect and thwart attacks, the attackers have in turn devised ways around those defenses, in a co-evolving cycle of increasing sophistication.
Many methods have been devised to steal legitimate users' identities for website abuses. A common method is called “phishing”, wherein an email sent under the guise of a trustworthy entity elicits personal information from unwitting recipients, typically by luring potential victims to a fraudulent website that requests identifying personal information such as usernames, passwords, account numbers, ATM PINs, etc. This stolen information is then used by impostors, either manually or robotically, to log in to the victims' accounts on the genuine websites in order to steal money, send forged emails, or perpetrate other illicit activity.
To combat such impostors, many website operators have developed more-sophisticated access-control methods that require secondary authentication information that simple phishing schemes cannot easily obtain. For example, when a website suspects that an account is being used by a third party, the website may verify that the user is indeed the owner of the account by demanding randomly chosen additional access credentials such as place of birth, mother's maiden name, or the answer to one of a set of questions preselected by the legitimate account-owner.
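As a minimal sketch of such a secondary challenge, assuming the account owner's preselected questions and answers are available as a simple mapping, a website might select one question at random and compare the response. The function names and the normalization rule (trimming whitespace and ignoring case) are assumptions made for illustration; a production system would typically store only hashed answers.

```python
import hmac
import secrets
from typing import Dict


def pick_challenge(answers: Dict[str, str]) -> str:
    """Choose one of the preselected questions unpredictably."""
    return secrets.choice(sorted(answers))


def _normalize(text: str) -> bytes:
    # Assumed rule: ignore surrounding whitespace and letter case.
    return text.strip().casefold().encode("utf-8")


def verify_answer(expected: str, supplied: str) -> bool:
    """Compare answers in constant time to avoid leaking partial matches."""
    return hmac.compare_digest(_normalize(expected), _normalize(supplied))
```

Because the question is chosen at random for each challenge, a phisher who has captured only a username and password cannot know in advance which additional credential will be demanded.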
In response to the deployment of secondary authentication techniques, fraudsters have developed what is called a “man-in-the-middle attack”, in which a phisher lures a victim to a counterfeit website mimicking the appearance and behavior of the target site, on the one hand intercepting the victim's input and relaying it to the real website, while on the other hand intercepting the real website's output and relaying it back to the user through the bogus site. Man-in-the-middle attacks therefore permit fraudsters to gain entry into privileged sites by duping authorized users of the site into responding to all authorization challenges posed by the privileged sites, thereby evading all direct authorization protocols. Despite the name “man in the middle”, the entire process, including any illicit activity perpetrated from within the burgled account, may be performed fully automatically, without the need for human intervention.
To combat man-in-the-middle attacks, many websites are programmed to look at structural identifying information, such as the users' Internet Protocol addresses and inferred geographic locations, “cookies” (site-generated tokens passed back and forth between site and client), user-agent identifiers, and request timestamps—information over which the fraudster ordinarily has no direct control. This ancillary information allows a website to detect suspicious users who, despite meeting all explicit authorization challenges, are evidently not using the same browsers on the same computers in the same locations as they usually do, indicating that they may be victims of man-in-the-middle attacks.
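The following sketch illustrates one way such a structural check might be expressed, assuming the website keeps a record of each user's usual session attributes. The `SessionInfo` fields, the signal names, and the threshold of two mismatches are hypothetical choices made for illustration; request timestamps are omitted for brevity.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class SessionInfo:
    ip_prefix: str          # e.g. the leading octets of the client address
    country: str            # location inferred from the address
    user_agent: str         # browser identifier sent with each request
    has_known_cookie: bool  # whether a previously issued cookie was returned


def mismatch_signals(usual: SessionInfo, current: SessionInfo) -> List[str]:
    """List the structural attributes that differ from the user's usual ones."""
    signals = []
    if current.ip_prefix != usual.ip_prefix:
        signals.append("ip")
    if current.country != usual.country:
        signals.append("country")
    if current.user_agent != usual.user_agent:
        signals.append("user_agent")
    if not current.has_known_cookie:
        signals.append("cookie")
    return signals


def is_suspicious(usual: SessionInfo, current: SessionInfo,
                  threshold: int = 2) -> bool:
    """Flag the session when enough attributes differ (assumed threshold)."""
    return len(mismatch_signals(usual, current)) >= threshold
```

A session relayed through a counterfeit site would ordinarily present the fraudster's address, location, and browser rather than the victim's, and so would trip several of these signals even though every explicit challenge was answered correctly.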
Now that websites are examining structural session information to distinguish impostors from legitimate users, fraudsters have developed an even more sophisticated method of assault, called a “man-in-the-browser attack”, using malicious software surreptitiously installed on potential victims' own computers. Many mechanisms have been devised for getting the malware installed, including attachments to phishing emails, downloads from phishing sites, and self-propagating viruses and worms, any of which may be disguised within Trojan horses that apparently or actually perform desirable functions, or may be downloaded afterwards through a back door via a bootstrapping mechanism.
This malware, typically in the form of a browser plug-in (hence the name), lurks in the background until it recognizes that the potential victim has successfully signed in to a targeted website, thus eluding all direct authorization protocols. It then uses the victim's own browser on the victim's own computer in accordance with the user's own schedule to perpetrate fraud while the victim is also interacting with the website, thereby also eluding all structural authentication clues. Again, although some implementations provide for real-time human intervention, the entire process, including any illicit activity perpetrated from within the hijacked account, may be performed fully automatically, despite the name “man in the browser”. The malware can elude detection by the user by performing its transactions invisibly, for example in an offscreen window, or, as in a man-in-the-middle attack, by intercepting the communications between the real user and the website and spoofing the view presented to the user.
Since man-in-the-browser attacks, like man-in-the-middle attacks and other phishing attacks, cause substantial harm to websites and to the websites' legitimate users through direct financial and material theft as well as through sabotage, defamation, and other forms of damage, it is crucial for websites to have an effective means for detecting such attacks in order to take remedial actions against them.
At present, however, no methods exist for websites to detect man-in-the-browser attacks.
Many websites outsource some of their services to third-party websites specializing in those services, such as advertising, news, mapping, searching, indexing, categorization, tagging, ratings, reviews, email, chat, social networking, forums, social games, collaborative editing, questionnaires, polls, media hosting, special deals and promotions, purchasing, bill-paying, banking, wire transfers, and identity verification. Although these third-party services may be tailored, customized, and integrated so as to appear to be offered directly by the primary website, clients using these services are actually diverted to the corresponding partner websites, bypassing the web servers of the primary website. As a result, the primary website loses track of clients while they are dealing with the third parties, leaving it susceptible to attack through a partner website or through a combination of partner sites and the primary site. The primary website thus has to depend on its partner websites to monitor its clients in its stead. However, the monitoring information provided by third-party services, typically in the form of daily, weekly, or monthly logs or digests, is generally inadequate and untimely. Online criminals have been quick to take advantage of this weakness, so that many websites now incur their greatest losses indirectly, through third-party services, and urgently need an effective means for tracking users across third-party websites as well as on their own websites.