The present invention is directed to a system and method for discriminating the actual origination of communication signals transmitted through a network interconnection. More specifically, the system and method are directed to the processing of communication signals received at a local, or reference, site through a network interconnection to determine and/or uniquely characterize the remote site origination of such communication signals. The system and method provide for this determination and/or unique characterization in a manner that is signal payload-agnostic, or data content-agnostic. They do so by, among other things, ascertaining the envelope characteristics of the communication signals in question and classifying, based thereon, the remote site from which the signals originated.
In certain embodiments and applications, the system and method provide for such classification of remote site origin in a data content-agnostic manner for communication signals transmitted from certain websites remotely accessed by a local site through the internet, namely the world wide web. In these embodiments and applications, the system and method exploit the fact that certain widely employed communication protocols transmit signals in packetized data segments. Various envelope characteristics are defined by the sequence(s) of packetized data segments transmitted to the local site during particular interconnected sessions. One or more characteristic signatures may be obtained according to these envelope characteristics, so as to uniquely characterize the particular remote site originating the data segments.
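By way of illustration only, the following minimal Python sketch shows one way a signature might be derived from envelope characteristics without reading any payload. Everything specific in it is an assumption made for the example and is not taken from the disclosure: the `Segment` record, the `SIZE_BINS` edges, the choice of features (a normalized histogram of inbound segment sizes plus the inbound share of total bytes), and the nearest-reference comparison by Euclidean distance.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Dict, List, Sequence


@dataclass(frozen=True)
class Segment:
    """One observed packetized data segment; the payload is never inspected."""
    inbound: bool  # True if sent from the remote site to the local site
    size: int      # segment length in bytes


# Hypothetical lower edges (in bytes) of the size bins used for the histogram.
SIZE_BINS = (0, 128, 512, 1024, 1500)


def envelope_signature(segments: Sequence[Segment]) -> List[float]:
    """Return a normalized inbound-size histogram followed by the inbound byte fraction."""
    counts = [0] * len(SIZE_BINS)
    inbound_bytes = 0
    total_bytes = 0
    for seg in segments:
        total_bytes += seg.size
        if not seg.inbound:
            continue
        inbound_bytes += seg.size
        # Index of the last bin edge that does not exceed the segment size.
        idx = max(i for i, edge in enumerate(SIZE_BINS) if seg.size >= edge)
        counts[idx] += 1
    n_inbound = sum(counts) or 1  # avoid division by zero for empty sessions
    histogram = [c / n_inbound for c in counts]
    fraction = inbound_bytes / total_bytes if total_bytes else 0.0
    return histogram + [fraction]


def classify(signature: Sequence[float], references: Dict[str, List[float]]) -> str:
    """Return the name of the reference signature closest in Euclidean distance."""
    def distance(a: Sequence[float], b: Sequence[float]) -> float:
        return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(references, key=lambda name: distance(signature, references[name]))


# Made-up sessions for two hypothetical remote sites.
site_a = [Segment(False, 90), Segment(True, 1460), Segment(True, 1460), Segment(True, 700)]
site_b = [Segment(False, 300), Segment(True, 100), Segment(True, 110)]
references = {
    "site_a": envelope_signature(site_a),
    "site_b": envelope_signature(site_b),
}

# A newly observed session whose origin is to be classified.
observed = [Segment(False, 95), Segment(True, 1460), Segment(True, 1400), Segment(True, 650)]
best_match = classify(envelope_signature(observed), references)
print(best_match)
```

Only segment sizes and directions enter the signature in this sketch, so it applies equally to encrypted and clear-channel sessions. A practical embodiment would likely use richer envelope features, such as segment ordering or timing, and a more robust classifier than a single nearest-reference comparison.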
There are many instances where it is desirable to know what particular website or web-service originated certain communication signals that have been received by remote access over a network, even when the address of that website or service is dynamic or is obscured, for example, by NAT (network address translation) or proxy forwarding. In some instances it is desirable to recognize more specifically when a particular type of session is occurring over a network, such as the use of a particular web form, the transfer of data from a particular application, or the occurrence of malicious software activities over the network. Where a site in question is uncooperative with the monitoring measures in place, is deliberately evasive to such monitoring, or is particularly sensitive to privacy concerns, the site may employ encryption measures in the given communication channel to make it difficult or impractical to determine its identity based on the payload of data. Even in cases where clear-channel data is readily accessible, it may become computationally challenging to store and process the necessary data signatures when there are potentially many cases of interest. Thus, there is a need for a compact, fast, and minimally invasive approach to identifying remote sites such as websites or web-services accessed through a network.
Applications of such an approach include detection for security purposes. These include monitoring user activities for consistency with a business or government purpose without violating the privacy of their encrypted data. It is desirable to detect instances when users are redirected to malicious websites masquerading as legitimate commercial counterparts. It is also desirable to discover cases in which malicious software on a computer communicates with outside entities, in particular where such communications may be masked as benign web traffic such as web browsing, and where such outside servers may operate behind changing apparent web addresses in order to evade detection.