Global communications networks such as the Internet have evolved from an early research-based system with limited access to a truly worldwide network with millions of users. The original Internet Protocol (IP) was designed on the basis that system users would connect to the network for strictly legitimate purposes. As a consequence, no particular consideration was given to security issues. In recent years, however, the incidence of malicious attacks on the Internet has grown to alarming proportions. These attacks, which take a variety of forms, often lead to a complete disruption of service for a targeted victim.
A Denial of Service (DoS) attack involves blocking a user's ability to use a given service on the network. DoS attacks are common across the Internet, with many being launched daily at various targets. One such attack is based on flooding a victim with so much traffic that the victim's server cannot cope; another sends highly effective malicious packets at lower rates.
Since identification of the source relies on information provided by the sender itself, IP makes it extremely difficult to identify the real source of any given datagram, and thus of any given flow, if the source wishes to remain unknown. This peculiarity is often exploited during a malicious Denial of Service (DoS) attack to hide the source of the attack. Thus, if an attacker uses a spoofed source address (i.e., replaces its legitimate address with a different, illegitimate one), it is very difficult to trace the real source of the attack. It is expected that if attackers were open to identification, the incidence of DoS attacks would decrease significantly. Mechanisms for tracing back anonymous network flows in autonomous systems are described in co-pending application filed Aug. 7, 2003 under Ser. No. 10/635,602 and entitled “Mechanism for Tracing Back Anonymous Network Flows” (Jones et al.). The contents of the earlier application are incorporated herein by reference.
The present application contemplates the use of covert channels to implement trace back functionality.
Covert channels are defined as “channels that use entities not normally viewed as data objects to transfer information from one subject to another.” Although a covert channel is generally regarded as a breach in the security of a system, it is possible to isolate certain applications in which covert channels can be used to the advantage of a network system. In the general arena of computer networks, or more specifically in the case of the “Internet”, a covert channel could provide, amongst other features, an efficient way to “mark” packets for a trace back solution.
Trace back is defined as the process by which a flow of packets is bound to its source, regardless of possible misleading efforts by the source to hide its location. Due to their stateless nature, packet switched networks, including the “Internet”, do not easily accommodate tracing or recording the path of a flow across the network. The source field contained in each packet is meant to provide this information. In reality, spoofing the information contained in this field is generally a trivial operation and a very common practice among malicious users.
Covert channels in telecommunications, and specifically in computer networks, are a well-known topic, while automated trace back techniques in packet switched networks are a more recent one.
Some prior art trace back solutions make implicit use of covert channels within the IP header to mark packets. These techniques include Probabilistic Packet Marking and the Algebraic Approach, together with all their variations. An article by D. X. Song and A. Perrig entitled “Advanced and Authenticated Marking Schemes for IP Traceback”, IEEE Infocom 2001, provides further detail on these schemes.
In general, regardless of the trace back mechanism adopted, the following marking schemes have been proposed using the IP header as the marking medium: i) a dedicated IP Option appended “in flight” (not a real covert channel) or ii) a semantic re-assignment of the 16-bit IP Identification field.
In the case of IP Options the main problem is that every marked packet will have its length increased during its journey. Packets already close in size to the Maximum Transmission Unit (MTU) of any given link on their path are likely to be fragmented if one or more IP Options fields are added to them. On top of this, appending an IP Options field to an IP packet is a very expensive operation for a modern router; it usually cannot be carried out while the IP packet is in the “fast path” of the processing router, but requires the packet to be set aside and manipulated with special resources available only in the control plane (“slow path”). The packet will also generally be placed in the slow path of every subsequent router from this point on, since it carries an IP Option.
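The length problem can be illustrated with a minimal sketch. The 6-byte marking option and the 1500-byte MTU used below are hypothetical values chosen for illustration only; the sketch relies solely on the fact that the IPv4 header length is counted in 32-bit words, so options are padded to a multiple of 4 bytes.

```python
def padded_option_len(option_len: int) -> int:
    """Round an option length up to a multiple of 4 bytes.

    The IPv4 header length field counts 32-bit words, so any option
    appended to the header is padded to a 4-byte boundary.
    """
    return (option_len + 3) // 4 * 4


def needs_fragmentation(total_len: int, option_len: int, mtu: int) -> bool:
    """Return True if appending the option pushes the packet over the link MTU.

    total_len  -- current size of the IP packet in bytes (header + payload)
    option_len -- size of the marking option in bytes (hypothetical value)
    mtu        -- MTU of the next link in bytes
    """
    return total_len + padded_option_len(option_len) > mtu


# A packet that fit the link before marking no longer fits afterwards.
print(needs_fragmentation(total_len=1496, option_len=6, mtu=1500))
# A smaller packet has enough headroom for the option.
print(needs_fragmentation(total_len=1400, option_len=6, mtu=1500))
```

The sketch ignores the 60-byte ceiling on the IPv4 header and the cost of the slow-path processing itself; it only shows why packets near the MTU are the ones at risk.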
In the case of the IP Identification field the major problem is the semantic infringement, which leads to backward compatibility issues. The IP Identification field is used as a means to differentiate IP fragments that belong to different IP packets. If a packet is fragmented, its identification field is replicated into each fragment so that the receiver can reassemble all the fragments into the original packet. Re-using the IP Identification field leads to two potentially dangerous scenarios, described below, and can result in the total loss of communication between any two legitimate hosts relying on this field.
Solutions based on assigning a new meaning to the 16-bit IP Identification field come in several flavors and often involve the use of hashing functions to overload the 16 available bits with more information at the cost of some conflicts.
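The trade-off between packing more information into 16 bits and accepting conflicts can be sketched as follows. This is not the scheme of any cited work: the choice of SHA-256, the truncation to the first two bytes of the digest, and the function names `mark16` and `first_collision` are assumptions made purely for illustration.

```python
import hashlib
import ipaddress


def mark16(router_ip: str) -> int:
    """Compress a 32-bit router address into a 16-bit marking value.

    Assumption for illustration: the mark is the first two bytes of the
    SHA-256 digest of the packed address.
    """
    digest = hashlib.sha256(ipaddress.IPv4Address(router_ip).packed).digest()
    return int.from_bytes(digest[:2], "big")


def first_collision(start: str = "10.0.0.0"):
    """Scan consecutive addresses until two of them share a 16-bit mark.

    With only 2**16 possible marks, a repeat is guaranteed within
    2**16 + 1 addresses (pigeonhole principle), and usually appears
    far sooner.
    """
    seen = {}
    base = int(ipaddress.IPv4Address(start))
    for offset in range(2**16 + 1):
        ip = str(ipaddress.IPv4Address(base + offset))
        mark = mark16(ip)
        if mark in seen:
            return seen[mark], ip, mark
        seen[mark] = ip
    raise AssertionError("unreachable: a collision must occur")


first_ip, second_ip, shared_mark = first_collision()
print(first_ip, second_ip, hex(shared_mark))
```

Two distinct routers that hash to the same mark cannot be told apart by the victim from that mark alone, which is the kind of conflict such schemes accept in exchange for fitting into the available bits.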
The first scenario occurs when a packet is fragmented before it reaches any marking router. In this case, there is a chance that at every subsequent hop any of its fragments may be re-marked with trace back information, so that the fragments no longer share a common identification value and the receiver fails to reassemble the original packet. The liabilities in this case are a waste of bandwidth and network resources (spent transmitting all fragments of an IP packet that will never be reassembled), a waste of buffering resources in the receiver, which will hold all the fragments in vain, and a potential loss of connectivity.
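This first scenario can be reproduced with a simplified model of fragmentation and reassembly. The `Fragment` type and the helper functions are hypothetical; for brevity the model uses byte offsets rather than the 8-byte units of real IPv4, and groups fragments by identification value only, whereas a real receiver also keys on source, destination and protocol.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Fragment:
    ident: int      # value of the IP Identification field
    offset: int     # byte offset of this fragment in the original payload
    more: bool      # True if further fragments follow (More Fragments flag)
    payload: bytes


def fragment(ident: int, payload: bytes, size: int) -> list:
    """Split a payload into fragments that all carry the same identification."""
    chunks = [payload[i:i + size] for i in range(0, len(payload), size)]
    return [Fragment(ident, i * size, i < len(chunks) - 1, chunk)
            for i, chunk in enumerate(chunks)]


def reassemble(frags: list) -> dict:
    """Return {ident: payload} for every datagram whose fragments are all present."""
    groups = {}
    for frag in frags:
        groups.setdefault(frag.ident, []).append(frag)
    done = {}
    for ident, group in groups.items():
        group.sort(key=lambda f: f.offset)
        expected = 0
        contiguous = True
        for frag in group:
            if frag.offset != expected:   # a hole: some fragment is missing
                contiguous = False
                break
            expected += len(frag.payload)
        if contiguous and not group[-1].more:
            done[ident] = b"".join(f.payload for f in group)
    return done


frags = fragment(0x1234, b"A" * 24, 8)
print(reassemble(frags))  # all three fragments share one identification value

# A marking router overwrites the identification field of the middle fragment.
marked = [frags[0], replace(frags[1], ident=0xBEEF), frags[2]]
print(reassemble(marked))  # neither group is complete, so nothing is delivered
```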
The second scenario occurs when an IP packet is fragmented after it has been marked. In this case the trace back marking already present in the packet will be copied into all the fragments as a valid identification tag. A router is very likely to always mark packets with the same marking value. Since a receiver will buffer fragments of an IP packet until the whole IP packet is reconstructed, or until a certain interval expires, any fragment that fails to reach the receiver may be replaced by a fragment from a subsequent IP packet that was marked by the same router and fragmented later, so that the receiver reassembles a corrupted packet.
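The following sketch illustrates how such a substitution could arise. It is a simplified model under stated assumptions: the marking value `MARK`, the helper names, the byte-based offsets, and the "first arrival wins" policy for duplicate offsets are all hypothetical, and real reassembly implementations differ in these details.

```python
MARK = 0x00AA  # hypothetical value that one router writes into every packet


def fragments(ident: int, payload: bytes, size: int) -> list:
    """Return (ident, offset, more_fragments, chunk) tuples for a payload."""
    chunks = [payload[i:i + size] for i in range(0, len(payload), size)]
    return [(ident, i * size, i < len(chunks) - 1, chunk)
            for i, chunk in enumerate(chunks)]


def receive(buffer: dict, frag: tuple):
    """Store a fragment; return the reassembled payload once it is complete."""
    ident, offset, more, chunk = frag
    slots = buffer.setdefault(ident, {})
    slots.setdefault(offset, (more, chunk))  # assumption: first arrival wins
    expected = 0
    while expected in slots:
        more_flag, data = slots[expected]
        expected += len(data)
        if not more_flag:  # last fragment reached with no holes before it
            payload = b"".join(slots[o][1] for o in sorted(slots) if o < expected)
            del buffer[ident]
            return payload
    return None


def simulate_loss() -> bytes:
    """Packet A loses its second fragment; packet B carries the same mark."""
    packet_a = fragments(MARK, b"AAAAAAAA", 4)
    packet_b = fragments(MARK, b"BBBBBBBB", 4)
    buffer = {}
    result = None
    # packet_a[1] is lost in transit and never arrives.
    for frag in (packet_a[0], packet_b[0], packet_b[1]):
        result = receive(buffer, frag) or result
    return result


print(simulate_loss())  # first half from packet A, second half from packet B
```

In this model the receiver has no way of telling the two packets apart, because the identification value that would normally distinguish them has been overwritten with the same marking.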
There is, therefore, a need to improve trace back efficiency in communication systems such as the Internet.