This document contains a microfiche appendix consisting of 2 sheets of microfiche and a total of 161 frames.
A portion of the disclosure of this patent document contains material that is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent disclosure, as it appears in the Patent and Trademark Office patent files or records, but otherwise reserves all copyright rights whatsoever.
1. Field of the Invention
The present invention relates in general to packet switched telecommunications networks and more particularly to a system for allowing user-space modification of packets transmitted through a network.
2. Description of the Related Art
In a packet switched network, a message to be sent is divided into blocks, or data packets, of fixed or variable length. The packets are then sent individually over the network through multiple switches or nodes and then reassembled at a final destination before being delivered to a target device or end user. To ensure proper transmission and re-assembly of the blocks of data at the receiving end, various control data, such as sequence and verification information, is typically appended to each packet in the form of a packet header. At the receiving end, the packets are then reassembled and the message is transmitted to the end user in a format compatible with the user's equipment.
As is well known in the art, most packet switched networks operate according to a set of established protocol layers, collectively defining a protocol stack. Each layer of the protocol stack exists to perform a specific function, such as addressing, routing, framing and physical transmission of packets. When a data packet is to be transmitted over a network from a source machine to a destination machine, the packet will pass in a downward direction through layers of the protocol stack on the source machine, and in an upward direction through corresponding layers of the protocol stack on the destination machine.
Each layer of the protocol stack in the transmitting process may add a respective header to the packet, which provides information to the corresponding layer in a receiving process. Thus, as a packet passes down through the protocol stack on a transmitting machine, the packet may gain an additional header at each layer. At the bottom of the stack, the transmitting process may then frame the data and physically transmit it over the network toward its destination. When the packet reaches its destination, the packet will then pass up through the protocol stack. Each layer of the stack in the receiving process may obtain useful information from its associated header and will strip its header from the packet before passing the packet up to the next layer for processing. At the top of the stack, the packet may then be processed by an application or user program.
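The header handling described above can be sketched in a few lines of Python. This is a toy model only: the layer names and the bracketed header bytes are assumptions chosen for illustration and do not correspond to any real protocol format.

```python
# Toy model of header encapsulation. Layer names and header contents are
# illustrative assumptions, not real protocol headers.
LAYERS = ["transport", "network", "link"]  # order traversed on the way down


def send_down(payload: bytes) -> bytes:
    """Prepend one header per layer as the packet moves down the stack."""
    packet = payload
    for layer in LAYERS:
        header = f"[{layer}]".encode()
        packet = header + packet
    return packet


def receive_up(frame: bytes) -> bytes:
    """Strip headers in reverse order as the packet moves up the stack."""
    packet = frame
    for layer in reversed(LAYERS):
        header = f"[{layer}]".encode()
        if not packet.startswith(header):
            raise ValueError(f"missing {layer} header")
        packet = packet[len(header):]
    return packet
```

In this model the outermost header is always the one added last on the transmitting side, which is why the receiving side removes the link header first and the transport header last.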
The layers at the top of a protocol stack are typically designed to provide end-to-end communication between source and destination machines on the network. For instance, the top layers may provide packet sequence information and encryption. Lower layers of the stack, on the other hand, often provide information to facilitate communication between immediately adjacent machines in the network. For instance, the lower layers in the stack may provide network address information, for use in routing packets through the network.
A variety of packet switching protocols are known. These protocols include, for instance, TCP/IP, Novell's SPX/IPX, Apple Computer's Appletalk, and Microsoft's NetBEUI. Of these protocols, the best known is the TCP/IP suite, which is used to manage transmission of packets throughout the Internet and other IP networks. For purposes of illustration, but without limitation, the present invention will be described with reference to the TCP/IP suite.
The TCP/IP protocol stack includes, from lowest to highest, a link layer, a network or "IP" layer, a transport layer and an application layer. The link layer includes network interface card drivers to connect the machine to the physical network, such as an Ethernet network. The IP layer provides addressing information to facilitate independent routing of packets within or between networks and also includes other control layers, such as an "ICMP" (Internet Control Message Protocol) layer and an "ARP" (Address Resolution Protocol) layer. The transport layer allows source and destination machines to carry on a conversation with each other and includes a connection-oriented "TCP" (Transmission Control Protocol) layer and a connectionless "UDP" (User Datagram Protocol) layer. Finally, the application layer includes application programs that carry out the functionality of a network device and interface with a user.
In general, the machines that implement the protocol stack in a packet switched network (including, without limitation, source machines, destination machines, packet switches and routers) are computers. Each of these computers includes a processor, a memory, and an input/output port, and is managed by an operating system.
As is known in the art, the operating system of a computer typically distinguishes between two types of code: kernel code, and application code. Kernel code is the core of the operating system, handling matters such as process scheduling, memory management, hardware communication and network traffic processing. Application code, on the other hand, is the code used by applications, such as word processors, spreadsheets, games and compilers. In operation, kernel code and application code are stored in separate portions of memory and are each executed by the computer processor (or multiple processors). Thus, kernel code is said to be running in "kernel space," and application code is said to be running in "user space." Applications may, however, use the kernel to access system resources and hardware through system calls, and are therefore thought of as running above, or on top of, the kernel.
In a typical network-capable computer, part of the protocol stack is implemented in kernel space and part is implemented in user space. For reference, the part that is implemented in kernel space may be referred to as the "kernel stack" (carried out by "kernel stack code"), and the part that is implemented in user space may be referred to as the "application stack" (carried out by "application stack code"). Considering the TCP/IP protocol suite, for instance, the link, network and transport layers are each implemented by kernel stack code running in kernel space, and the application layer is implemented by application stack code running in user space. FIG. 1 illustrates this arrangement by way of example.
When a packet passes between the application and transport layers of the TCP/IP protocol stack, the packet moves between user space and kernel space. Since user space and kernel space are separate areas of memory, however, the process of moving a packet typically includes copying the packet to the destination area and then deleting the original. Thus, in practice, once an incoming packet reaches the top of the kernel protocol stack, it is copied to user space to be processed by the application layer of the stack, and it is then deleted from kernel space. Similarly, once an outgoing packet has been processed by the application layer in user space, it is copied to kernel space to be processed by the remainder of the protocol stack, and it is then deleted from user space.
In general, when an incoming packet enters a computer or other hardware device running a protocol stack, the destination of the packet may be some specific code within the kernel, or it may be an application running in the application layer. In any event, the packet will typically be processed by multiple layers of the protocol stack before finally arriving at its destination. Similarly, an outgoing packet will typically be processed by multiple layers of the protocol stack before being transmitted onto the network.
Referring to FIG. 1, for instance, assume that an incoming UDP packet arrives at a destination machine for receipt and processing by an application. When the UDP packet arrives, the Ethernet (link) layer will detect that a type of IP packet has arrived, will strip the link layer header from the packet, and will pass the packet to the IP layer. The IP layer will then determine that the packet is a UDP packet destined for an application on the machine, and will strip the IP header from the packet and pass the packet to the UDP layer. In turn, the UDP layer will determine which application is to receive the packet, will strip the UDP header from the packet, and will pass the packet to that destination application.
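The sequence of steps in this UDP example can be sketched as follows. The sketch is a simplification under stated assumptions: the function names, the addresses, and the `apps` table mapping ports to handlers are hypothetical, checksums are left at zero and not verified, and only Ethernet II framing and IPv4 are handled.

```python
import struct

ETHERTYPE_IPV4 = 0x0800  # EtherType value for IPv4
IPPROTO_UDP = 17         # IP protocol number for UDP


def build_frame(src_port: int, dst_port: int, payload: bytes) -> bytes:
    """Build a test Ethernet/IPv4/UDP frame (checksums left as zero)."""
    udp = struct.pack("!HHHH", src_port, dst_port, 8 + len(payload), 0) + payload
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + len(udp), 0, 0, 64,
                     IPPROTO_UDP, 0, bytes([10, 0, 0, 1]), bytes([10, 0, 0, 2]))
    eth = (b"\x02\x00\x00\x00\x00\x02" + b"\x02\x00\x00\x00\x00\x01"
           + struct.pack("!H", ETHERTYPE_IPV4))
    return eth + ip + udp


def link_layer(frame: bytes) -> bytes:
    """Check that the frame carries IPv4, then strip the 14-byte link header."""
    (ethertype,) = struct.unpack("!H", frame[12:14])
    if ethertype != ETHERTYPE_IPV4:
        raise ValueError("not an IPv4 frame")
    return frame[14:]


def ip_layer(packet: bytes) -> bytes:
    """Check that the packet carries UDP, then strip the IP header."""
    header_len = (packet[0] & 0x0F) * 4  # IHL field, in 32-bit words
    if packet[9] != IPPROTO_UDP:
        raise ValueError("not a UDP packet")
    return packet[header_len:]


def udp_layer(datagram: bytes, apps: dict):
    """Select the application by destination port and pass it the payload."""
    _src, dst, length, _csum = struct.unpack("!HHHH", datagram[:8])
    handler = apps[dst]  # hypothetical table: port number -> application
    return handler(datagram[8:length])
```

A call such as `udp_layer(ip_layer(link_layer(frame)), apps)` then mirrors the upward path described above, with each function removing exactly one header.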
As another example, a similar set of events occurs when an incoming packet arrives for processing within the kernel, such as for routing or echo processing. With reference to FIG. 1 again, consider the "ping" program, for instance. According to the "ping" mechanism, one computer sends an ICMP echo request over the network to another computer, and the receiving computer sends an ICMP echo reply message to the originating machine. When a "ping" packet arrives at a computer, the Ethernet layer will detect that a type of IP packet has arrived and will pass it to the IP layer. The IP layer will then determine that the packet is an ICMP packet and will pass it to the ICMP processing code (part of the IP layer). The ICMP processing code will in turn determine that the packet is an echo request packet and will pass the packet to a kernel code routine or "ping" routine that responds to echo requests.
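The echo handling can be sketched as below. The ICMP type values (8 for echo request, 0 for echo reply) and the one's-complement Internet checksum are standard; the function names are hypothetical, and the sketch operates on the ICMP message alone, leaving out the IP and link headers.

```python
import struct

ICMP_ECHO_REQUEST = 8
ICMP_ECHO_REPLY = 0


def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit words, complemented."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length data with a zero byte
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def build_icmp(icmp_type: int, ident: int, seq: int, payload: bytes) -> bytes:
    """Build an ICMP echo message with a valid checksum."""
    header = struct.pack("!BBHHH", icmp_type, 0, 0, ident, seq)
    csum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", icmp_type, 0, csum, ident, seq) + payload


def echo_reply(request: bytes):
    """Return the echo reply for a valid echo request, otherwise None."""
    icmp_type, code, _csum, ident, seq = struct.unpack("!BBHHH", request[:8])
    if icmp_type != ICMP_ECHO_REQUEST or code != 0:
        return None
    if internet_checksum(request) != 0:  # a valid message sums to zero
        return None
    return build_icmp(ICMP_ECHO_REPLY, ident, seq, request[8:])
```

The reply carries the same identifier, sequence number and data as the request, so the originating machine can match it to the request it sent.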
Traditionally, applications running in user space have been able to view and operate on packets only when the packets originate in user space or once the packets pass up through the protocol stack into the application layer. Recently, however, computer programmers and network administrators have seen a need to monitor packet traffic through the network and to analyze, in user space, packets that are being processed by the kernel stack code. For this purpose, many operating systems now provide the ability for user-level processes to "sniff" or "capture" network traffic, by employing "packet taps" in kernel space.
In the existing art, a packet tap is a piece of kernel code that examines each packet passing through a particular point in the protocol stack and sends a copy of certain packets to an application running in user space. Often, a packet tap will apply specified criteria such as a predefined "packet filter" to identify those packets that are to be copied to user space. Alternatively, the packet tap may copy all packets into user space, for processing by an application. Typically, the packet filter will be inserted in the kernel code just above the link layer of the protocol stack, in order to monitor packets flowing to and from the network interface card.
An example of one such packet filter is the "BSD Packet Filter" or "Berkeley Packet Filter" (BPF), as described in McCanne et al., "The BSD Packet Filter: A New Architecture for User-Level Packet Capture," Proceedings of the 1993 Winter USENIX Technical Conference (Jan. 1993), which is available on most BSD-derived systems. Another example is the "Data Link Provider Interface" (DLPI), as described in Data Link Provider Interface Specification, Unix International (Aug. 1991), which is available on Solaris, HP-UX and SCO Unix platforms. Still other examples exist, such as the "Linux Socket Filter" (LSF) mechanism, which is BPF-like kernel space filtering code provided in kernel 2.1.75 and higher of the increasingly popular Linux operating system.
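A conventional copying tap of this kind can be modeled as follows. This is a user-level simulation only, not kernel code; the class name `CopyingTap` and its methods are hypothetical and do not reflect the interface of BPF, DLPI or any other existing facility.

```python
from typing import Callable, List, Optional

PacketFilter = Callable[[bytes], bool]


class CopyingTap:
    """Simulated tap that copies matching packets and never alters the flow."""

    def __init__(self, packet_filter: Optional[PacketFilter] = None) -> None:
        self.packet_filter = packet_filter  # None means "copy every packet"
        self.captured: List[bytes] = []     # stands in for the user-space buffer

    def observe(self, packet: bytes) -> bytes:
        """Copy the packet if it matches, then return the original unchanged."""
        if self.packet_filter is None or self.packet_filter(packet):
            self.captured.append(bytes(packet))
        return packet  # the original continues through the stack regardless
```

Because `observe` always returns the original packet, nothing done with the captured copies can affect what the rest of the stack processes.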
Existing packet taps provide a useful mechanism for user-space monitoring of network packet traffic. However, these packet taps do not provide means for altering the packets in user space or for changing the flow of packets. In particular, while packet taps may advantageously identify a packet in the kernel and copy the packet to user space, the original packet (from which the copy was made) will continue to be processed through the protocol stack and on to its destination. In many cases, by the time the application sees the packet, the packet will have already been fully processed by the kernel stack code.
Those skilled in the art may also be familiar with "raw sockets," which allow a user-level process to read and write certain IP packets with an IP header that is not processed by the kernel. A "raw socket" thus effectively allows a user-level process to receive copies of certain incoming packets with their TCP/IP headers intact, and to send into the kernel IP packets with user-specified TCP/IP headers. Raw sockets, however, are inherently limited. For instance, a raw socket cannot be used to receive certain types of IP packets, such as TCP or UDP packets, in user space. Further, a raw socket cannot copy an outgoing packet to user space after modification by the stack. For example, if a user wishes to specify the IP header of a TCP/IP packet using a raw socket, the user must specify both the TCP header and the IP header before sending the packet to the raw socket. No mechanism exists in a raw socket to allow the protocol stack to fill in the TCP header and then allow the user to specify the IP header. Additionally, a raw socket will not allow a user-level process to write packets into the kernel stack in an incoming (upward) direction.
In view of the deficiencies in the art, a need therefore exists to provide a system by which a user space application can modify or otherwise manipulate packet traffic over a network.
The present invention provides a method and apparatus for enabling user-space modification of packets. According to a principal aspect of the invention, an improved packet tap is placed at a designated spot in the kernel stack code and employs a packet filter. Upon detection of a packet that matches the filter, the tap intercepts the packet and moves it from kernel space into user space, at least temporarily preventing the kernel from continuing to process the packet. Upon receipt of the packet in user space, an application may then operate on the packet as desired. For instance, (i) the application may modify the packet in some way, (ii) the application may delete the packet from memory (thereby eliminating the packet), or (iii) the application may do nothing to the packet (thereby effectively performing a null operation on the packet). In turn, assuming the packet has not been deleted, the application may inject the packet back into the protocol stack in the kernel for further conventional processing.
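The intercept, operate and re-inject sequence can be sketched as a user-level simulation. The class name `DivertingTap`, the convention that a handler returns `None` to delete a packet, and the `next_layer` callback standing in for the remainder of the kernel stack are all assumptions made for illustration; they are not the interface of the described system.

```python
from typing import Callable, Optional

PacketFilter = Callable[[bytes], bool]
Handler = Callable[[bytes], Optional[bytes]]  # returns None to delete the packet


class DivertingTap:
    """Simulated tap that diverts matching packets instead of copying them."""

    def __init__(self, packet_filter: PacketFilter, handler: Handler,
                 next_layer: Callable[[bytes], None]) -> None:
        self.packet_filter = packet_filter
        self.handler = handler        # stands in for the user-space application
        self.next_layer = next_layer  # stands in for the rest of the stack

    def process(self, packet: bytes) -> None:
        if not self.packet_filter(packet):
            self.next_layer(packet)   # no match: normal processing continues
            return
        # Match: the stack does not see the packet until it is re-injected.
        result = self.handler(packet)
        if result is not None:
            self.next_layer(result)   # re-inject (modified or unchanged)
        # result is None: the packet was deleted and is never re-injected
```

A handler that returns its argument unchanged corresponds to the null operation, one that returns different bytes corresponds to modification, and one that returns `None` corresponds to deletion.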
FIG. 2 illustrates schematically how packets can be intercepted from the protocol stack, examined or modified in user-space, and then returned to the protocol stack according to the invention. Although the figure illustrates such a mechanism between the Ethernet driver, IP layer and TCP layer, the present invention contemplates placing the tap anywhere in the protocol stack.
Advantageously, the present invention may be implemented with a lightweight modification to kernel code, and an associated application programming interface (API). Provided with the collective ability to divert packets from the kernel stack to user space and inject packets back into the kernel stack from user space, a program running in user space may then examine and manipulate packets on their way through the kernel stack. In addition, provided with the ability to divert packets from the kernel stack to user space, a program running in user space may effectively "drop" packets from the kernel stack.
The functionality of the present invention will prove useful for a variety of tasks. As an example, if a programmer wishes to develop a new protocol in TCP/IP that resides at the same level as IP, one method of implementing the protocol would have been to add a specialized protocol handler in the kernel, similar to the "ping" and "IP router" handlers shown in FIG. 1. As is known in the art, however, the process of writing, debugging and implementing such kernel code can be time consuming and difficult. For instance, a simple error in kernel code (such as an illegal reference to memory) can cause the kernel to crash, requiring the user to reboot the computer system. With the present invention, however, a programmer can advantageously implement the new IP protocol as a user-level application.
Programming code for execution in user space is easier than programming code for execution in kernel space, because applications can be written using a wide range of pre-existing user libraries. Further, debugging user-level code is easier than debugging kernel code, due to the availability of powerful user-level debugging programs. Still further, in most cases, an error in application code will cause only that application to crash, rather than causing the kernel to crash, thereby minimizing the need to reboot the system.
As another example, the present invention may allow a programmer to test existing or new protocols. For instance, assume a programmer wishes to examine the effect of packet loss during a file transfer using ftp over TCP/IP. By applying the present invention, the programmer can intercept all ftp packets from just above the network interface card driver in the kernel stack and can divert those packets to an application in user space. The user-level application may then drop certain packets and inject others back into the kernel stack to continue processing up the protocol stack from the point where they were intercepted.
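A loss-testing application of this kind might be sketched as follows. This is a simulation under several assumptions: packets are modeled as a simple `Packet` record, ftp traffic is identified only by the well-known control port 21 (data connections are ignored), and the names `make_lossy_handler` and `run_tap`, the loss rate and the seed are hypothetical.

```python
import random
from dataclasses import dataclass
from typing import Callable, Iterable, Optional

FTP_CONTROL_PORT = 21  # well-known ftp control port; data ports are ignored here


@dataclass
class Packet:
    dst_port: int
    payload: bytes


def make_lossy_handler(loss_rate: float,
                       seed: int) -> Callable[[Packet], Optional[Packet]]:
    """Return a handler that drops each diverted packet with the given probability."""
    rng = random.Random(seed)  # seeded so that a test run is repeatable

    def handler(packet: Packet) -> Optional[Packet]:
        if rng.random() < loss_rate:
            return None        # simulate loss: never re-injected
        return packet          # re-inject unchanged

    return handler


def run_tap(packets: Iterable[Packet],
            handler: Callable[[Packet], Optional[Packet]],
            inject: Callable[[Packet], None]) -> None:
    """Divert ftp packets to the handler; all other packets pass straight through."""
    for packet in packets:
        if packet.dst_port != FTP_CONTROL_PORT:
            inject(packet)
            continue
        verdict = handler(packet)
        if verdict is not None:
            inject(verdict)
```

Seeding the random number generator is a design choice that lets the same pattern of dropped packets be replayed when comparing protocol behavior across runs.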