Every day, millions of computer users rely on computer networks such as the Internet for important information and for entertainment. Using the Internet is not without risk, however. For example, protecting personal computers against a never-ending onslaught of “pestware” such as viruses, Trojan horses, spyware, adware, and downloaders has become vitally important to computer users. To many parents, the proliferation of Internet pornography has become a grave concern.
One solution to the problem of harmful or undesirable network content is content filtering. Content filtering typically involves identifying network addresses associated with suspect network destinations in real time and warning a user of the possible threat or blocking the suspect network destinations before the harmful or undesirable content is accessed. Such content filtering may be implemented, for example, at the network level in an Internet gateway or in a client application such as a Web browser.
One significant challenge in performing content filtering is that the Uniform Resource Locators (URLs) associated with suspect network destinations tend to be changed frequently. In many cases, the name of a particular file (e.g., a pestware executable) on the Internet remains the same, but the server portion of the path, which contains the primary domain and any subdomains, is changed or its elements are rearranged. The dynamic nature of these URLs renders ineffective a content filtering methodology that relies on exact URL string comparisons.
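By way of illustration only, the following minimal Python sketch shows how an exact URL string comparison can miss a destination whose subdomains have merely been rearranged. The URLs, the blocklist, and the helper names (exact_match, file_name, host_labels) are hypothetical and are introduced solely for this example; the sketch demonstrates the weakness described above and does not represent any particular filtering product or method.

```python
from urllib.parse import urlsplit
import posixpath

# Hypothetical blocklist entry and a later variant of the same destination
# in which only the order of the subdomains has changed.
KNOWN_BAD = "http://ads.downloads.example.com/files/installer.exe"
VARIANT = "http://downloads.ads.example.com/files/installer.exe"


def exact_match(url, blocklist):
    """Exact string comparison against a set of known suspect URLs."""
    return url in blocklist


def file_name(url):
    """Return the last segment of the URL path (the file name)."""
    return posixpath.basename(urlsplit(url).path)


def host_labels(url):
    """Return the dot-separated labels of the host name, in order."""
    return (urlsplit(url).hostname or "").split(".")


blocklist = {KNOWN_BAD}

# The variant escapes the exact comparison...
print(exact_match(VARIANT, blocklist))  # False
# ...even though the file name is unchanged...
print(file_name(VARIANT) == file_name(KNOWN_BAD))  # True
# ...and the host is built from the same labels in a different order.
print(sorted(host_labels(VARIANT)) == sorted(host_labels(KNOWN_BAD)))  # True
```

In this sketch the two URLs differ only in the order of the labels "ads" and "downloads", yet the exact comparison treats the variant as an unknown destination.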
It is thus apparent that there is a need in the art for an improved method and system for identifying network addresses associated with suspect network destinations.