In communication systems where the user terminals and/or users are mobile, preventing the unauthorised tracking of users and equipment is important for privacy and possibly legal reasons. The main challenge in preventing tracking is to avoid the use of long-term or easy-to-correlate information (such as identifiers, data or other values) that constitutes explicit “identifiers” or otherwise allows users to be identified, and that makes it possible to follow the same entity as it moves from one place to another (where the “place” may be geographical, i.e. physical, or logical, e.g. a network address). The term “identifier” as used below encompasses all of these possibilities. Some telecommunications mechanisms take this into account, and can use frequently and/or randomly changing identifiers. In GSM, the so-called TMSI (Temporary Mobile Subscriber Identity) is used to hide the true IMSI (International Mobile Subscriber Identity). However, in general such techniques are not useful unless they are enforced throughout the protocol stack. For instance, while wireless LAN authentication mechanisms can employ “pseudonyms” [EAP-SIM, IETF draft-haverinen-pppext-eap-sim-14.txt; and EAP-AKA, IETF draft-arkko-pppext-eap-aka-14.txt] or even completely hide the authentication exchange from others [PEAP, IETF draft-josefsson-pppext-eap-tls-eap-10.txt], this is of little value as long as fixed link layer identifiers (e.g. MAC addresses) are used at a lower layer.
The problem exists in many forms. A particularly visible example is the transmission of cleartext, human-readable user identities such as NAIs [IETF RFC 2486]. Similar problems appear for the transmission of stable but “meaningless” identifiers such as IP addresses [PRIVACYADDR; IETF RFC 3041]. A lesser-known problem is that even data that is completely independent of any real “identifier” can be used to track users. For instance, an IPSec SPI [IPSEC, IETF RFC 2401] can reveal that a node in one place is the same node as one that appears later in another location, if the SPI value has not changed even though the IP addresses are no longer the same. With a 32-bit SPI, for example, the chance that two identical SPIs belong to different users is only about 1 in 4 billion. (IP addresses can change if NAT-T or MOBIKE is used.) This is particularly problematic for IKE SPIs, as there is no possibility of efficiently renegotiating IKE SPIs without revealing the previous SPIs in the process. For IPSec SPIs this is less of a problem, as the SPIs can be renegotiated within the protection of the IKE SA, hence hiding the change from outsiders. Nonetheless, the problem remains that privacy-enhancing measures can sometimes be defeated by unexpected factors.
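The SPI-based correlation described above can be illustrated with a minimal sketch. It assumes a hypothetical passive observer who records (source address, SPI) pairs; the function name, the addresses and the SPI values are invented for illustration and are not taken from any cited specification.

```python
from collections import defaultdict

SPI_BITS = 32
# Chance that two independently chosen 32-bit SPIs coincide: about 1 in 4 billion.
CHANCE_OF_ACCIDENTAL_MATCH = 1 / 2**SPI_BITS


def addresses_per_spi(observations):
    """Return the SPIs that were seen under more than one source address."""
    seen = defaultdict(set)
    for src_ip, spi in observations:
        seen[spi].add(src_ip)
    return {spi: ips for spi, ips in seen.items() if len(ips) > 1}


# Hypothetical captures: the first and third packets carry the same SPI
# although the source address has changed (e.g. after the node has moved).
observations = [
    ("192.0.2.10", 0x1A2B3C4D),
    ("198.51.100.7", 0x0BADF00D),
    ("203.0.113.55", 0x1A2B3C4D),
]
linked = addresses_per_spi(observations)
```

Here `linked` contains the single SPI that appeared under two different addresses, which is all the observer needs to conclude that the two sightings very probably belong to the same node.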
The same problem arises in certain authentication mechanisms. For authentication purposes, two popular techniques are the use of public key cryptography and so-called hash chains. The problem with public keys is that the key, even if not tied to an identity, leaves “traces” of the user, since anybody can verify authenticity using the public key. Similarly, a hash chain is easily linkable in the forward direction: an observer who sees one value of the chain can apply the hash function to it and check whether the result matches another observed value.
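The linkability of hash chain values can be sketched as follows. This is an illustrative example only: it assumes SHA-256 as the hash function and the common convention that chain values are disclosed in reverse order of generation, neither of which is specified in the text above. All names are hypothetical.

```python
import hashlib


def sha256(data):
    return hashlib.sha256(data).digest()


def build_chain(seed, length):
    """Return [seed, H(seed), H(H(seed)), ...] with length + 1 elements."""
    chain = [seed]
    for _ in range(length):
        chain.append(sha256(chain[-1]))
    return chain


def links_to(later_value, earlier_value, max_steps=16):
    """Hash later_value forward and report whether it reaches earlier_value."""
    value = later_value
    for _ in range(max_steps):
        value = sha256(value)
        if value == earlier_value:
            return True
    return False


chain = build_chain(b"hypothetical secret seed", 5)
first_disclosed = chain[5]   # assumed to be revealed first, in one place
second_disclosed = chain[4]  # assumed to be revealed later, in another place
unrelated = sha256(b"value from some other node")
```

Without knowing the seed, an observer can still evaluate `links_to(second_disclosed, first_disclosed)` and learn that both values came from the same chain, whereas the unrelated value does not link.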
Even data that changes for every packet can be used to track users. For instance, TCP or IPSec sequence numbers may in some cases be sufficient for the identification of equipment even if no other stable identifiers are present. As long as the sequence number space is sufficiently large and the nodes are sufficiently spread across it, a node that presents a sequence number N in one place and N+1 (or something close to it) in another place shortly thereafter is likely to be the same node.
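A minimal sketch of such a sequence number correlation is given below. The thresholds `max_gap` and `max_delay_s`, the 32-bit sequence space and the `Observation` record are assumptions chosen for illustration; a real observer would tune them to the protocol and traffic in question.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    place: str     # where the packet was seen (physical or logical)
    time_s: float  # capture time in seconds
    seq: int       # sequence number carried by the packet


def likely_same_node(a, b, max_gap=64, max_delay_s=5.0, modulus=2**32):
    """Heuristic: b follows a closely in both time and sequence number."""
    gap = (b.seq - a.seq) % modulus  # modular difference handles wrap-around
    delay = b.time_s - a.time_s
    return 0 < gap <= max_gap and 0 <= delay <= max_delay_s


def false_match_probability(max_gap=64, modulus=2**32):
    """Chance that an unrelated, uniformly distributed value falls in the window."""
    return max_gap / modulus


seen_at_a = Observation("place A", 0.0, 1000)
seen_at_b = Observation("place B", 1.0, 1001)
other_node = Observation("place B", 1.0, 5_000_000)
```

With these assumed values, `likely_same_node(seen_at_a, seen_at_b)` holds while the comparison with `other_node` fails, and `false_match_probability()` shows how small the chance of an accidental match becomes when the sequence number space is large.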
Existing techniques to deal with these problems include:

Hiding identifiers and other communications inside a protected tunnel or tunnels, such as TLS or IPSec. The drawback of this solution is that often other identifiers still remain visible outside the “tunnel”.

Using “pseudonyms”, as is done in GSM and some EAP methods. In this technique, an identifier is used for login to a service, and the service returns an encrypted token that the client can decrypt and use as the identifier for logging into the service the next time. A drawback of this scheme is that the new pseudonym has to be returned, which adds to the amount of signalling necessary. In any case, this solution may not be possible in all situations. For instance, the protection of sequence numbers in this manner would be possible in TCP as there are ACKs, but would be hard in IPSec because there may not be traffic in the return direction before a new packet needs to be sent. In any case, waiting for the new pseudonym before a second packet can be sent is inefficient.

Removing sequence numbers (and thereby linkability) may be considered where these are conventionally used. However, with present art this is not a universally viable option, as it creates a sender/receiver synchronisation problem, at least when used with unreliable data transport mechanisms such as IP.

For public keys and hash chains, an available method to improve privacy is to frequently generate new public keys/hash chains. However, this is computationally quite expensive.
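The pseudonym technique listed above, and its dependence on a return message, can be sketched as follows. This is a simplified illustration and not the GSM or EAP procedure: the class and method names are hypothetical, each pseudonym is assumed to be valid for a single login, and the encryption of the returned token is omitted.

```python
import secrets


class PseudonymServer:
    """Toy service that maps one-time pseudonyms to real identities."""

    def __init__(self):
        self._current = {}  # pseudonym -> real identity

    def enroll(self, real_identity):
        pseudonym = secrets.token_hex(8)
        self._current[pseudonym] = real_identity
        return pseudonym

    def login(self, pseudonym):
        # Raises KeyError if the pseudonym is unknown or has already been used.
        real_identity = self._current.pop(pseudonym)
        # The next pseudonym must travel back to the client; in a real
        # scheme it would be encrypted (omitted in this sketch).
        next_pseudonym = secrets.token_hex(8)
        self._current[next_pseudonym] = real_identity
        return real_identity, next_pseudonym


server = PseudonymServer()
p0 = server.enroll("user@example.org")
identity, p1 = server.login(p0)   # client must wait for p1 ...
identity2, p2 = server.login(p1)  # ... before it can log in again
```

The sketch makes the drawback visible: the client cannot present a fresh identifier until the reply carrying the next pseudonym has arrived, which is the extra signalling and waiting described above.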
To summarise the problem, metadata descriptive of the processing of data packets, e.g. security processing, may be used to attack privacy.