Protecting communication privacy is an important issue for all types of electronic communication, especially when the communication data are sent over a large network, such as the Internet, where an adverse party can easily intercept the communication data. The recent rise of the World Wide Web on the Internet has triggered serious concerns about the possible threats to privacy associated with Web browsing. The browsing user's location or other types of personal information may be inadvertently disclosed if the communication data traffic is intercepted by an adverse observer. Even partial revelation of such information can cause embarrassment or financial detriment, or even compromise the user's safety.
One particular type of browsing-related sensitive information to be protected is the fact that the browsing user is accessing a particular Web site or Web page. For instance, a user found to be browsing Web pages containing certain types of medical or financial information may inadvertently reveal, through implied interest in that information, embarrassing or confidential personal information about himself. As another example, a user may reveal that he is out of town, thereby making his home vulnerable to burglary, simply by accessing a private home security Web server from abroad. An adverse observer need only notice that the home security Web server is being accessed, and that the originating IP address of the HTTP request is not in the same locale as the home/server, which one skilled in sniffing Internet traffic can usually determine easily. An inference can thus be made that the resident (the most likely browsing user of the private Web server) will not return home soon.
To protect the privacy of Web browsing, a considerable amount of research has been directed at developing techniques for “anonymizing” Web browsing traffic so as to hide the connection between a particular user and the Web pages he or she is accessing. Conventionally, most proposed measures for protecting Web traffic anonymity have focused on two main tools: data encryption and the use of one or more intermediate proxies. Data encryption is applied to communication data to hide information that might reveal either the identity of the user or the content of the Web page. Intermediate proxies are used to hide the connection between the browsing user's network address and the Web site's address from any particular routing node or eavesdropper on the network.
Even with the combination of data encryption and intermediate proxies, Web traffic anonymity is still not guaranteed. Generally, even when multiple proxies are used, the first link on the routing chain (i.e., the link between the user and the first proxy) is the most vulnerable to attack, since an attacker (which may be the first proxy itself, the user's ISP, or perhaps an eavesdropper, especially on a wireless link) can immediately determine the user's network address. To prevent privacy attacks in such a case, data encryption is essential.
A critical question, however, is how effective the encryption of Web traffic is for hiding the source (e.g., a Web site) of the traffic from the attacker. Prior to the present invention, there has been no meaningful way to evaluate whether the encrypted Web traffic is vulnerable to privacy attacks that attempt to identify the source of the Web traffic. A related question, which can only be answered based on an understanding of the answer to the first question, is which countermeasures may be effectively used to make it more difficult for an adverse party to reliably identify the source of the encrypted Web traffic. These questions remained largely unanswered until the present invention.