Generally, a phishing website has the following features. 1. Luring a user with fake winnings (winnings from QQ, winnings from microblogging, hitting a golden egg, the Avenue of Stars, etc.). 2. Luring a user with a low price (of an airline ticket, merchandise on Taobao, etc.). 3. A low production cost. Phishing websites can be produced in batches and can use free subdomain names, so the cost of a phishing website is negligible compared with that of producing and spreading a virus. 4. Serious consequences. A phishing website mostly tricks the user into purchasing relatively expensive merchandise (e.g., an airline ticket or a single-lens reflex camera), and some phishing websites steal the user's Alipay and bank accounts, causing a great loss to the user.
The existing techniques for identifying a phishing website mainly include the following. 1. String matching on the key content of a webpage, for example, detecting whether words such as ‘Taobao’ or ‘winnings’ appear in the title and keywords of the webpage. 2. Image recognition. Some phishing websites imitate the official websites of brands, for example those of airline companies and Taobao, so that their pages look just like the official ones. 3. Domain name information. A phishing website usually uses a recently registered domain name and often uses a free subdomain name. A phishing website is recognized by combining the methods mentioned above, and a blacklist is eventually formed. In addition to the blacklist, a whitelist mechanism is provided to avoid false alarms: a website that has been accessed in large numbers and has previously triggered a false alarm is added to the whitelist.
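As an illustration only, the combination of string matching, domain name information, a blacklist and a whitelist described above might be sketched as follows. The names `Page`, `classify`, the 30-day threshold, the free-subdomain suffix and the rule that both a keyword hit and a domain hit are required are all assumptions made for this sketch and are not taken from any existing product; image recognition is omitted.

```python
from dataclasses import dataclass

# Example words taken from the text; a real list would be much longer.
SUSPICIOUS_WORDS = ("taobao", "winnings")
# Hypothetical free-subdomain provider suffix, used only for illustration.
FREE_SUBDOMAIN_SUFFIXES = (".free-host.example",)
# Hypothetical threshold for a "recently registered" domain name.
RECENT_DAYS = 30


@dataclass
class Page:
    host: str             # host name of the accessed website
    title: str            # content of the page title
    keywords: str         # content of the page keywords
    domain_age_days: int  # days since the domain name was registered


def keyword_hit(page: Page) -> bool:
    """String matching on the title and keywords of the webpage."""
    text = f"{page.title} {page.keywords}".lower()
    return any(word in text for word in SUSPICIOUS_WORDS)


def domain_hit(page: Page) -> bool:
    """Recently registered domain name or free subdomain name."""
    return (page.domain_age_days < RECENT_DAYS
            or page.host.endswith(FREE_SUBDOMAIN_SUFFIXES))


def classify(page: Page, blacklist: set, whitelist: set) -> str:
    """Return 'block' or 'allow'; newly recognized hosts join the blacklist."""
    if page.host in whitelist:   # whitelist overrides everything else
        return "allow"
    if page.host in blacklist:
        return "block"
    # Assumed combination rule: both signals must fire.
    if keyword_hit(page) and domain_hit(page):
        blacklist.add(page.host)
        return "block"
    return "allow"
```

In this sketch the whitelist is consulted first, so an official website whose title legitimately contains ‘Taobao’ is not blocked once it has been whitelisted.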
Existing antivirus software recognizes a phishing website at a server: when a client accesses a website, the antivirus software simultaneously sends a request to the server to inquire whether the website is a phishing website. If the website is a phishing website, access is intercepted; otherwise access is allowed. This technical solution has two obvious drawbacks. First, a newly generated phishing website has not yet been recorded by the server, so every client inquiry about it returns an unknown result. Second, when a technical failure or the like occurs at the server and lengthens the inquiry time, a phishing website may go unreported. With regard to intercepting phishing websites, the former drawback is very difficult to avoid: because the production cost of a phishing website is very low and a free subdomain name is often used, after several users have been tricked the original domain name is abandoned, content such as the title is amended, and a new domain name is applied for in order to continue tricking others.
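The two drawbacks can be made concrete with a minimal sketch of the server-side inquiry flow described above. The names `Verdict`, `decide`, `make_server` and `failing_server` are hypothetical, and treating a timed-out inquiry as "allow" is an assumed reading of the missed report mentioned in the text, not a documented behavior of any particular antivirus software.

```python
from enum import Enum
from typing import Callable, Dict


class Verdict(Enum):
    PHISHING = "phishing"
    SAFE = "safe"
    UNKNOWN = "unknown"   # the server has no record of the website


def decide(url: str, query_server: Callable[[str], Verdict]) -> str:
    """Client-side decision that relies only on the server's answer."""
    try:
        verdict = query_server(url)
    except TimeoutError:
        # Drawback 2: a server failure or slow inquiry means no verdict
        # arrives in time, so the website is allowed (assumed behavior).
        return "allow"
    if verdict is Verdict.PHISHING:
        return "intercept"
    # Drawback 1: UNKNOWN is handled like SAFE, so a newly generated
    # phishing website that is not yet recorded passes through.
    return "allow"


def make_server(records: Dict[str, Verdict]) -> Callable[[str], Verdict]:
    """Hypothetical stand-in for the server's record of known websites."""
    def query(url: str) -> Verdict:
        return records.get(url, Verdict.UNKNOWN)
    return query


def failing_server(url: str) -> Verdict:
    """Hypothetical stand-in for a server that does not answer in time."""
    raise TimeoutError("server did not answer in time")
```

With a server that records only `http://old-phish.example` as phishing, `decide` intercepts that URL but allows an unrecorded `http://new-phish.example`, and it allows both when `failing_server` is used instead.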
One of the problems to be solved urgently by the present disclosure is to provide a method and an apparatus for determining a phishing website, so that when the server cannot determine whether the website accessed by a client is a phishing website, or when the server fails, a newly generated phishing website can be intercepted at the client at the first opportunity and most phishing websites can be intercepted, thereby ensuring network security.