Portions of this patent application contain materials that are subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure, as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
The ubiquity of network applications on wide area networks such as the Internet is driven in part by the robustness of communication protocols that control the location and exchange of data over the network. In these networks, information passing from one system to another is transmitted from layer to layer using one or more predefined protocols. Within each layer, predetermined services or operations are performed, and the layered approach allows a given layer to offer selected services to other layers through a standardized interface while shielding those layers from its implementation details.
For example, in one model called an open systems interconnection (OSI) reference model specified by the International Organization for Standardization (ISO), seven layers known as "physical", "data link", "network", "transport", "session", "presentation" and "application" layers are specified.
Processes carried out in the physical layer are concerned with the transmission of raw data bits over a communication channel. Processes carried out in the data link layer manipulate the raw data bit stream and transform it into a data stream that appears free of transmission errors. The latter task is accomplished by breaking the transmitted data into data frames and transmitting the frames sequentially, accompanied by mechanisms for detecting or correcting errors.
Network layer processes determine how data packets are routed from a data source to a data destination by selecting one of many alternative paths through the network. The function of the transport layer processes is to accept a data stream from a session layer, split it up into smaller units (if necessary), pass these smaller units to the network layer, and to provide appropriate mechanisms to ensure that the units all arrive correctly at the destination, with no sequencing errors, duplicates or missing data.
Session layer processes allow users on different machines to establish "sessions" or "dialogues" between themselves. A session allows ordinary data transport between the communicating nodes, but also provides enhanced services in some applications, such as dialogue control, token management and synchronization. Presentation layer processes perform certain common functions that are requested sufficiently often to warrant finding a general solution for them, for example, encoding data into a standard format, performing encryption and decryption and other functions.
Finally, the application protocol layer processes handle a variety of protocols that are commonly needed, such as database access and file transfer, among others. The layers are arranged in order to form a protocol "stack" for each node, and the stacks are connected together at the physical level end. Thus, data transmission through the network consists of passing information down through one stack in a source system, across the physical communication link to another protocol stack, and up that other stack to a target system.
In the protocol suite called the Transmission Control Protocol/Internet Protocol (TCP/IP), TCP is a reliable, connection-oriented protocol provided over (or encapsulated within) IP. TCP guarantees the delivery of packets, ensures proper sequencing of the data, and provides a checksum feature that validates both the packet header and its data for accuracy. The protocol specifies how TCP software distinguishes among multiple destinations on a given system, and how communicating systems recover from errors (such as lost or duplicated packets). IP, by contrast, provides a best-effort, connectionless delivery system for computer data.
TCP/IP is a four-layered system with an application layer, a transport layer, a network layer, and a link layer. In TCP/IP, each layer has one or more protocols for communicating with its peer at the same layer. For example, two TCP layers running on a network can communicate with each other. The hardware details associated with physically interfacing with a cable or other media are defined at the link layer. The network layer, sometimes called the Internet layer, handles the movement of packets around the network. The Internet Protocol provides the network layer in the TCP/IP protocol suite. The transport layer provides a flow of data between two hosts for the application layer above. The application layer handles the details of the particular application. Examples of application layer protocols include remote login applications such as telnet, the file transfer protocol (FTP), and the simple mail transfer protocol (SMTP) for electronic mail, among others.
When an application sends data using TCP, the data is sent down a protocol stack through each layer, until the data reaches the physical layer where it is sent as a stream of bits across a network. Each layer adds information to the data it receives by prepending a header and/or appending a trailer. The data that TCP sends to IP is called a TCP segment, while the data that IP sends to the network interface is called an IP datagram. The stream of bytes that flows across the network is called a frame. When the frame is received at a destination host, the frame works its way up the protocol stack. Along the way, the headers or trailers are removed by the appropriate protocol handlers, which select the next handler by examining identifiers in the headers or trailers (demultiplexing). In this manner, packets can be streamed one at a time from a source to a destination.
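The encapsulation and demultiplexing described above can be illustrated with a minimal Python sketch. The header layouts here are deliberately simplified assumptions (real TCP, IP and link headers carry many more fields), and the function names `encapsulate` and `decapsulate` are hypothetical. The only real-world values used are the EtherType 0x0800 for IPv4 and the IP protocol number 6 for TCP.

```python
import struct

ETHERTYPE_IPV4 = 0x0800  # real EtherType value for IPv4
IPPROTO_TCP = 6          # real IP protocol number for TCP


def encapsulate(payload: bytes, src_port: int, dst_port: int) -> bytes:
    """Wrap application data in toy transport, network and link headers."""
    # Toy "TCP segment": source port, destination port, then the data.
    segment = struct.pack("!HH", src_port, dst_port) + payload
    # Toy "IP datagram": protocol identifier and segment length.
    datagram = struct.pack("!BH", IPPROTO_TCP, len(segment)) + segment
    # Toy "frame": a type identifier naming the network-layer handler.
    return struct.pack("!H", ETHERTYPE_IPV4) + datagram


def decapsulate(frame: bytes):
    """Strip headers layer by layer, checking each identifier on the way up."""
    (ether_type,) = struct.unpack_from("!H", frame, 0)
    if ether_type != ETHERTYPE_IPV4:
        raise ValueError("no handler for this frame type")
    protocol, length = struct.unpack_from("!BH", frame, 2)
    if protocol != IPPROTO_TCP:
        raise ValueError("no handler for this protocol")
    segment = frame[5:5 + length]  # 2-byte link header + 3-byte network header
    src_port, dst_port = struct.unpack_from("!HH", segment, 0)
    return src_port, dst_port, segment[4:]
```

Each layer in the sketch looks only at its own header, which is the property that lets a receiving host hand the remaining bytes to the next handler without understanding them.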
The transmission of individual packets over the network adds communication overhead to the protocol handlers, such as the cost of issuing a network send/receive/select system call to read, write, or wait for each network packet. Network packet aggregation can be used to reduce the communication overhead in a client/server configuration. This technique typically uses a shared buffer that aggregates the packets and adds various headers/trailers to the aggregated packet prior to transmission. However, for systems that transmit small network packets, such as those in a client-server database, the per-packet processing cost to send or receive a network packet (for example, the cost of handling headers and trailers) becomes high relative to the payload's per-byte processing cost.
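The trade-off between per-packet and per-byte cost can be made concrete with a toy cost model. This is only an illustrative sketch: the function `send_cost`, its parameters, and the cost units are assumptions introduced here and are not measurements from any real system.

```python
def send_cost(num_packets, payload_bytes, per_call_cost, per_byte_cost,
              packets_per_call=1):
    """Total cost in arbitrary units for sending num_packets equal-size packets.

    per_call_cost models the fixed overhead of one send system call, and
    per_byte_cost models the work that scales with payload size. Both are
    hypothetical quantities chosen by the caller.
    """
    # Ceiling division: number of system calls when packets are batched.
    calls = -(-num_packets // packets_per_call)
    return calls * per_call_cost + num_packets * payload_bytes * per_byte_cost


# With small payloads the fixed per-call term dominates, so batching
# ten packets per call removes most of the modeled cost.
unbatched = send_cost(100, 64, per_call_cost=1000, per_byte_cost=1)
batched = send_cost(100, 64, per_call_cost=1000, per_byte_cost=1,
                    packets_per_call=10)
```

Under these assumed numbers the payload term is identical in both cases; only the number of fixed-cost calls changes, which is why aggregation matters most when packets are small.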
A system aggregates data packets communicated between one or more sessions on a source system and one or more sessions on a target system by: collecting one or more session packets from the one or more source system sessions; multiplexing the session data packets into an aggregated packet; sending the aggregated packet from the source system to the target system; and demultiplexing each aggregated packet into corresponding session packets for delivery to the sessions on the target system.
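The collect, multiplex, send and demultiplex steps above can be sketched as follows. The wire layout is an assumption made for illustration (a 2-byte packet count, then one session identifier and length entry per packet, then the concatenated payloads); the source does not specify a format, and the names `multiplex` and `demultiplex` are hypothetical.

```python
import struct

COUNT = struct.Struct("!H")   # assumed: number of session packets
ENTRY = struct.Struct("!HI")  # assumed: session id, payload length


def multiplex(session_packets):
    """Combine (session_id, payload) pairs into one aggregated packet."""
    header = COUNT.pack(len(session_packets)) + b"".join(
        ENTRY.pack(session_id, len(payload))
        for session_id, payload in session_packets
    )
    return header + b"".join(payload for _, payload in session_packets)


def demultiplex(aggregated):
    """Split an aggregated packet back into (session_id, payload) pairs."""
    (count,) = COUNT.unpack_from(aggregated, 0)
    offset = COUNT.size
    entries = []
    for _ in range(count):
        entries.append(ENTRY.unpack_from(aggregated, offset))
        offset += ENTRY.size
    packets = []
    for session_id, length in entries:
        packets.append((session_id, aggregated[offset:offset + length]))
        offset += length
    return packets
```

A single send of `multiplex(...)` output then replaces one send per session packet, and the target system calls `demultiplex` once to recover the per-session packets in their original order.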
Implementations of the invention include one or more of the following. The method includes reading an aggregated packet header for each aggregated packet; allocating one or more network buffers based on the aggregated packet header; and scattering the session packets into the one or more network buffers. The method also includes generating a packet header based on the one or more session packets; and sending the packet header and the session packets as the aggregated packet on a data transport. The method can also include combining multiple packet headers into an aggregated packet header; sending the aggregated packet header over one or more control transports; combining multiple session packets into an aggregated data packet; and sending the aggregated data packet over one or more data transports. Additionally, the method can receive the aggregated packet header from the control transport; and for each packet header received from the aggregated packet header, allocate one or more network buffers based on the aggregated packet header; receive the aggregated data packet; and scatter the aggregated data packet into the one or more network buffers based on information from the aggregated packet header. The packet aggregation can also be cascaded, that is, a cascaded session can aggregate data from a previously aggregated session and forward the aggregated packet to a concentrator downstream which in turn performs additional aggregation on the aggregated packets in a cascaded fashion.
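The variant with separate control and data transports, in which the receiver allocates buffers from the aggregated packet header and scatters the data into them, might look like the sketch below. The entry layout, the function names, and the use of an in-memory `io.BytesIO` object as a stand-in for the data transport are all assumptions for illustration; a real implementation would read from a network connection.

```python
import io
import struct

ENTRY = struct.Struct("!HI")  # assumed header entry: session id, payload length


def build_control_and_data(session_packets):
    """Return (aggregated header, aggregated data) for the two transports."""
    control = b"".join(ENTRY.pack(sid, len(p)) for sid, p in session_packets)
    data = b"".join(p for _, p in session_packets)
    return control, data


def receive_scattered(control, data_transport):
    """Allocate one buffer per header entry, then fill each from the data stream."""
    # Step 1: size the buffers using only the aggregated packet header.
    buffers = [(sid, bytearray(length))
               for sid, length in ENTRY.iter_unpack(control)]
    # Step 2: read the data stream directly into each session's buffer,
    # so no intermediate staging buffer is needed in this sketch.
    for _, buf in buffers:
        view = memoryview(buf)
        filled = 0
        while filled < len(buf):
            n = data_transport.readinto(view[filled:])
            if not n:
                raise EOFError("data transport closed before all data arrived")
            filled += n
    return buffers
```

For example, `receive_scattered(control, io.BytesIO(data))` with the output of `build_control_and_data` returns one filled buffer per session packet. Reading straight into per-session buffers is one way to avoid an extra copy after receipt, though whether this matches the buffer management of the described system is not stated in the source.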
In another aspect of the invention, a method aggregates packets for transmission from one or more sessions on a source system to one or more sessions on a target system, each packet having a data portion and a header portion. This is done by collecting one or more packets from the source system; aggregating the header portion of each packet and one or more data portions into an aggregated packet; transmitting the aggregated packet to the target system; extracting the header portion from the aggregated packet; and demultiplexing the one or more data portions into corresponding one or more sessions on the target system based on the header portion.
Implementations of this aspect include one or more of the following. The aggregated header is transmitted together with one or more data portions. Multiple packet headers can be combined into an aggregated packet header. The aggregated header is transmitted over one or more control channels. The data portions are transmitted over one or more data channels. The demultiplexing executes one or more operating system input/output calls. The collecting step performs user input/output multiplexing. Moreover, each session is a database session, an Internet Protocol (IP) session, or an operating system stream.
Advantages of the invention include one or more of the following. The invention improves the utilization of the network bandwidth by aggregating network packets across unrelated client sessions/users, reducing both the communication latency and the resources required to handle small network packets. Additionally, the invention obviates the data buffer copy operations that would otherwise be necessary to implement network packet aggregation across client sessions at the application layer. Further, the invention avoids resource-expensive input/output multiplexing operating system calls such as the Select( ), Poll( ), and IOCompletionPort( ) system calls traditionally used by server processes to concurrently service multiple network clients.
System performance is enhanced since the number of operating system calls needed to send and receive data over the network is reduced. Processor loading is reduced due to fewer I/O operations. Network packets can be aggregated even in relatively small configurations since aggregation reduces network latency. The aggregation of the network packets, along with the minimization of expensive I/O multiplexing calls, reduces the processing resource requirements on the server. By aggregating network packets, the invention amortizes and reduces the network communication overhead, such as the overhead associated with issuing network send/receive/select system calls. Moreover, the I/O multiplexing is performed at the user level, which eliminates expensive OS kernel calls to poll the transports to detect arriving packets.
Overall system complexity is reduced by avoiding complicated schemes such as buffer sharing. The invention provides high performance without requiring additional processor and memory resources for copying data. Hence, the scalability of the server is improved.
Other features and advantages will become apparent from the following description, including the drawings and the claims.