A green Internet access service, as the name implies, provides a clean, moral, secure, and civilized network environment and content for users who subscribe to the service. In this service, content filtering or a content control technology forbids or restricts a user from accessing illegal Internet content such as pornographic, violent, and reactionary content. The essence of green Internet access is to classify web pages and to control a user's access to different types of web pages. Currently, there are mainly two solutions: one performs access control through a client or a browser, and the other controls the user's network access through a network device. Both solutions adopt a technology similar to a blacklist or a whitelist, as described below.
A basic principle of the blacklist technology is as follows: A device provider provides a blacklist in advance. The blacklist includes not only illegal uniform/universal resource locators (Uniform/Universal Resource Locator, URL) but also illegal keywords. If a URL or a keyword included in an access request message of a user is in the blacklist, the access is forbidden. A specific processing procedure is as follows: after the access request message of the user is obtained, the URL and the keyword in the access request message are extracted, the blacklist is queried according to the extracted URL and keyword, and whether the user may perform the access is determined according to a query result. The classification accuracy of the URL blacklist and the keyword blacklist, which are obtained through a classification technology, and the performance of querying the blacklist are key points of this solution, and both currently pose technical difficulties. The URL blacklist is generally provided by a device provider or a security vendor. The vendor that maintains the URL blacklist is not a professional content provider and cannot classify legal URLs and illegal URLs with high accuracy. Therefore, a risk of a check error or a check omission exists. In addition, because matching of the URL and the keyword needs to be performed in real time, analysis and matching against the blacklist consume many processing resources of an access control device. If a large number of users request to access a network in a short period of time, performing the access control reduces the users' access speed and affects their Internet access experience.
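The blacklist processing procedure described above may be sketched as follows. This is a minimal illustration only, not the implementation of any particular access control device. The blacklist entries, the function names, and the assumption that keywords are carried in the query string of the requested URL are all hypothetical.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical blacklist contents, for illustration only.
URL_BLACKLIST = {"bad.example.com"}
KEYWORD_BLACKLIST = {"forbiddenword"}


def extract_url_and_keywords(request_url):
    """Extract the host and the query-string keywords from a request URL.

    Assumption: keywords are taken from the query-string values; a real
    device might inspect other parts of the access request message.
    """
    parts = urlsplit(request_url)
    host = (parts.hostname or "").lower()
    keywords = set()
    for values in parse_qs(parts.query).values():
        for value in values:
            keywords.update(word.lower() for word in value.split())
    return host, keywords


def blacklist_allows(request_url):
    """Return False if the extracted URL or any keyword is in the blacklist."""
    host, keywords = extract_url_and_keywords(request_url)
    if host in URL_BLACKLIST:
        return False
    if keywords & KEYWORD_BLACKLIST:
        return False
    return True
```

Every access request triggers an extraction and two lookups in this sketch, which reflects why real-time matching against a large blacklist consumes processing resources when many users request access at the same time.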
In view of the defect that a check error may occur in the blacklist technology, the prior art provides a whitelist technology. A basic principle of the whitelist technology is as follows: An access control device provider pre-specifies a range of network resources that may be accessed by users, namely, a whitelist; if a URL that a user requests to access is included in the whitelist, the user is allowed to perform the access; if the URL is not included in the whitelist, the access is forbidden. As in the blacklist technology, each request must be matched against the whitelist, which likewise consumes substantial processing resources of the access control device.
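The whitelist principle may be sketched in the same manner. The whitelist entries and the function name are hypothetical, and matching on the host name only is an assumption made for brevity.

```python
from urllib.parse import urlsplit

# Hypothetical whitelist contents, for illustration only.
URL_WHITELIST = {"kids.example.org", "school.example.edu"}


def whitelist_allows(request_url):
    """Allow the access only if the requested host is in the whitelist."""
    host = (urlsplit(request_url).hostname or "").lower()
    return host in URL_WHITELIST
```

In this sketch, any URL that is not pre-specified is forbidden by default, which is the opposite default from the blacklist technology.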
During a process of implementing the present invention, the inventor finds that the prior art has at least the following problems:
Due to limitations of the processing capability and the storage capability of the access control device, access control technologies including the blacklist technology and the whitelist technology generally implement only web site-level control. In the blacklist technology, if a certain part of a URL extracted from an access request message of a user is consistent with a URL of an illegal web site in the URL blacklist, the access is forbidden; in the whitelist technology, if a certain part of a URL extracted from an access request message of a user is consistent with a URL of a legal web site in the URL whitelist, the access is allowed. When a web site includes web pages of multiple control levels, for example, both legal web pages and illegal web pages, it is impractical to store the URLs of multiple web pages on the same web site in a blacklist or a whitelist, and therefore web page-level access control cannot be implemented. Consequently, the prior art is limited in the granularity of access control, and it is difficult to implement fine-granularity access control.
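The granularity problem may be illustrated with a short sketch, assuming that site-level control compares only the host part of the requested URL with the list. The site name, the page URLs, and the function names are hypothetical.

```python
from urllib.parse import urlsplit

# Two hypothetical pages on the same web site: one legal, one illegal.
LEGAL_PAGE = "http://mixed.example.com/news/today.html"
ILLEGAL_PAGE = "http://mixed.example.com/restricted/page1.html"


def site_level_decision(request_url, site_blacklist):
    """Decide using only the host part of the URL (web site-level control)."""
    host = (urlsplit(request_url).hostname or "").lower()
    return "forbidden" if host in site_blacklist else "allowed"


def decisions(site_blacklist):
    """Return the decisions for the legal page and the illegal page."""
    return (
        site_level_decision(LEGAL_PAGE, site_blacklist),
        site_level_decision(ILLEGAL_PAGE, site_blacklist),
    )
```

Whether or not the site is listed, both pages receive the same decision: listing the site forbids the legal page as well, and not listing it allows the illegal page. Site-level matching therefore cannot distinguish web pages of different control levels on the same web site.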