“Phishing” refers to attempting to fraudulently acquire sensitive information, such as passwords and credit card details, by spoofing web pages of legitimate organizations. Phishing usually involves reproducing the “look and feel” of a legitimate web page, such as that of a bank, on a server under the control of an attacker. The victim enters personal information on the web page, believing it to be trusted, and that information is harvested by the attacker for fraudulent use. Financial organizations and their current or potential customers may lose billions of dollars as phishers perfect schemes for obtaining personal information. Some of these schemes include internationalized domain name spoofing, open URL redirectors, and embedded frames, all of which can make a phishing website visually indistinguishable from a legitimate one. While phishing communications can be sent indiscriminately in the hope that a recipient will respond, more recent and sophisticated phishing techniques target individual recipients; for example, sending customers of a particular bank a web page that looks like the web page of the bank that they use. This targeted technique is known as “spear phishing” and is generally much more effective than indiscriminate phishing.
Current phishing prevention methods are neither complete nor sufficiently responsive. Many phishing websites go unnoticed entirely, or at least long enough to do considerable damage. A major weakness in current phishing prevention methods is their dependence on user reports and manual updates. Many methods also require client-side software, which implicitly assumes, usually incorrectly, that end-users are aware of the dangers of phishing.
Many schemes to detect and prevent phishing have been proposed over the past few years, yet phishing-related losses incurred by victim organizations and their customers or potential customers continue to mount. A recent report estimates over 2.8 billion dollars of losses from phishing attacks in 2006 alone. The rise in the frequency and financial cost of phishing attacks results from both the increasing cleverness of phishers (e.g., targeting more organizations and using sophisticated techniques to make the phishing website appear legitimate) and the ignorance of the typical Internet user concerning the dangers of phishing. Phishers are able to reach a larger population of end-users through spam email, instant messages, messages purporting to be from friends (“spear phishing”), etc., and many of these users may not even be aware of the threat of phishing.
Current proposals to detect and mitigate phishing fall into three classes: 1) client-side solutions; 2) client-server solutions; and 3) server-side solutions. “Client-side solutions” are the most common among anti-phishing tactics. Many of these solutions are available as Web browser “plug-ins”. Current methods include a) prominently alerting a user that a URL domain is fraudulent, b) performing various checks on the URL and the content of a Web page and raising alarms on suspicious web accesses, and c) modifying a password that is submitted to a site as a function of the site's domain name so that the password submitted to a phishing site is not the same as the one submitted to the legitimate site. Tools that analyze email and detect potential phishing URLs (e.g., by raising an alert for HTML email if the text and the URL for a hyperlink do not match) are also client-side solutions.
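The hyperlink check mentioned above, in which an alert is raised when the visible text of a link and its actual URL do not match, can be illustrated with a minimal Python sketch. The names `LinkCollector`, `host_of`, and `suspicious_links` are hypothetical and are not taken from any particular tool; the sketch assumes that a mismatch means the link text names one host while the `href` attribute points to another, and it ignores links whose text is not URL-like.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class LinkCollector(HTMLParser):
    """Collect (href, visible text) pairs for every <a> element."""

    def __init__(self):
        super().__init__()
        self.links = []    # finished (href, text) pairs
        self._href = None  # href of the <a> currently open, if any
        self._text = []    # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def host_of(value):
    """Return the lowercase host if value looks like a URL, else None."""
    if not value or any(ch.isspace() for ch in value):
        return None  # ordinary link text such as "Click here"
    candidate = value if "://" in value else "http://" + value
    try:
        host = urlsplit(candidate).hostname
    except ValueError:
        return None  # malformed input is treated as "not a URL"
    return host.lower() if host and "." in host else None


def suspicious_links(html):
    """Return links whose visible text names a different host than the href."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    flagged = []
    for href, text in parser.links:
        shown, actual = host_of(text), host_of(href)
        if shown and actual and shown != actual:
            flagged.append((href, text))
    return flagged
```

A check of this kind runs entirely on the client and therefore shares the limitation of client-side solutions in general: it only helps users who have installed it and who act on its alerts.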
“Client-server solutions”, which include all techniques that involve a database back-end, such as a URL blacklist, are also quite common. This solution type also includes “blackbox” security appliances sold by vendors that are used to filter URLs. Additionally, external reputation sources, such as search engines from which information about URLs is retrieved, also belong to this category. Studies show that although many blacklist-based schemes have high accuracy rates, they are neither complete in their listing of phishing URLs nor sufficiently responsive in detecting new phishing URLs in a timely manner.
“Server-side solutions” include attempts by the targeted organization to increase user awareness of phishing, such as placing a visual cue system on valid Internet web pages. Such methods often fail as users are less likely to notice the absence of a visual indicator than its presence.
Some limitations of the prior art can be seen by examining the construction of “blacklists”, a common anti-phishing technique. URL blacklists are commonly constructed from user reports; the approach is therefore inherently reactive. Dependence on user reports makes a timely response to phishing attacks challenging, if not impossible. Before a phishing URL is reported, verified, and listed, many users may have already fallen prey to the attack. Many of these blacklists also list the complete URL of the phishing site, so that even minor variants of the same phishing URL escape listing. Additionally, phishers could poison the blacklist by submitting valid domain names to it. Worse yet, even if current techniques could be improved, they are inherently client-based. They require users to install toolbars and other client software and to act appropriately in response to warnings. Unfortunately, most users do not realize the need for phishing protection and do not install client-based anti-phishing tools.
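The way a complete-URL blacklist lets minor variants escape can be shown with a short Python sketch. The function names and the example URLs are hypothetical. The host-level comparison is included only for contrast; it is an assumption made for illustration, not a technique described here, and it would over-block on hosts that serve both legitimate and phishing pages.

```python
from urllib.parse import urlsplit


def exact_match(url, blacklist):
    """Complete-URL lookup: only the listed string itself matches."""
    return url in blacklist


def host_match(url, blacklist):
    """Host-level lookup: any URL on a listed host matches."""
    listed_hosts = {urlsplit(entry).hostname for entry in blacklist}
    listed_hosts.discard(None)  # ignore entries without a parsable host
    host = urlsplit(url).hostname
    return host is not None and host in listed_hosts


# Hypothetical blacklist holding one complete phishing URL.
blacklist = {"http://phish.example.net/bank/login.html"}

# A trivial variant of the same page, differing only in its query string.
variant = "http://phish.example.net/bank/login.html?session=42"

print(exact_match(variant, blacklist))  # the variant escapes the exact listing
print(host_match(variant, blacklist))   # the variant is caught at host level
```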
By analyzing the current phishing prevention landscape, at least four limitations may be identified. These limitations are: 1) a lack of responsiveness, 2) a lack of completeness, 3) a need for end-user cooperation to install and use the anti-phishing technique, and 4) a reaction-based approach primarily utilizing historical data to fight phishing attacks.
Lack of responsiveness comes from the slow detection of new phishing URLs. Historical data has shown that the half-life of a phishing site is on the order of a week or less. Therefore, the inherent delay between when a phishing website is established and when sufficient user feedback is received to justify adding the site to a blacklist of known phishing URLs makes it difficult to properly guard against phishing attacks.
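The effect of listing delay relative to site half-life can be made concrete with a small Python sketch. It rests on an assumption that is not stated in the historical data referred to above: that live phishing sites disappear according to exponential decay, so that the share of sites still up after t days is 0.5 raised to the power t divided by the half-life. The function name and parameters are hypothetical.

```python
def unlisted_exposure_fraction(listing_delay_days, half_life_days):
    """Fraction of total expected site uptime that elapses before listing.

    Assumes exponential decay of live sites: the share still up after
    t days is 0.5 ** (t / half_life_days). Under that assumption, the
    uptime accumulated before the listing delay, divided by the total
    expected uptime, equals 1 minus the share still up at the delay.
    """
    if half_life_days <= 0 or listing_delay_days < 0:
        raise ValueError("half-life must be positive and delay non-negative")
    return 1.0 - 0.5 ** (listing_delay_days / half_life_days)
```

Under this model, a listing delay equal to the half-life means that half of all phishing-site uptime occurs before the blacklist can offer any protection, which illustrates why a reactive listing process struggles against short-lived sites.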
Lack of blacklist completeness comes from a lack of end-user feedback regarding phishing attacks, since blacklists are generated from such feedback. The lack of feedback is in turn due to a lack of end-user awareness or appreciation of the phishing problem.
Lack of end-user awareness also leads to a lack of end-user participation in utilizing anti-phishing techniques. End-users tend to be slow to install anti-phishing software on their computers and often do not pay proper attention to anti-phishing warnings.
Lack of a proactive approach and reliance on historical data to fight phishing attacks lead to significant delays between when a phishing website is established and when it is placed on a blacklist. This delay gives phishers ample time to cause significant damage.
Accordingly, what is needed is a method that addresses these limitations in protecting against phishing.