The rapid growth of computer networks in the past decade has brought, in addition to well-known advantages, dislocations and bottlenecks in the use of conventional network devices. For example, a CPU of a computer connected to a network may spend an increasing proportion of its time processing network communications, leaving less time available for other work. In particular, file data exchanges between the network and a storage unit of the computer, such as a disk drive, are performed by dividing the data into packets for transportation over the network. Each packet is encapsulated in layers of control information that are processed one layer at a time by the receiving computer CPU. Although the speed of CPUs has constantly increased, this type of protocol processing can consume most of the available processing power of the fastest commercially available CPU. A rough estimate indicates that in a Transmission Control Protocol (TCP)/Internet Protocol (IP) network, one currently needs one hertz of CPU processing speed to process one bit per second of network data. Furthermore, evolving technologies such as IP storage, streaming video and audio, online content, virtual private networks (VPN) and e-commerce require data security and privacy protocols, such as IP Security (IPSec), Secure Sockets Layer (SSL) and Transport Layer Security (TLS), that further increase the computing demands on the CPU. Thus, the network traffic bottleneck has shifted from the physical network to the host CPU.
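The rough estimate above (one hertz of CPU speed per bit per second of TCP/IP traffic) can be turned into a small calculation. The following is only an illustrative sketch: the function names, the 1 Gbit/s link and the 2 GHz CPU are assumptions chosen for the example and do not come from any measured system.

```python
# Rule of thumb from the text: roughly 1 Hz of CPU clock per 1 bit/s of traffic.
# All names and example figures below are hypothetical illustrations.
HZ_PER_BIT_PER_SECOND = 1.0


def required_cpu_hz(link_bits_per_second: float) -> float:
    """Estimate the CPU clock rate consumed by protocol processing."""
    return link_bits_per_second * HZ_PER_BIT_PER_SECOND


def cpu_fraction(link_bits_per_second: float, cpu_hz: float) -> float:
    """Estimate the share of one CPU spent on protocol processing."""
    return required_cpu_hz(link_bits_per_second) / cpu_hz


if __name__ == "__main__":
    link = 1e9  # assumed 1 Gbit/s link running at full rate
    cpu = 2e9   # assumed 2 GHz CPU
    print(f"estimated CPU share: {cpu_fraction(link, cpu):.0%}")
```

Under this rule of thumb, a link whose bit rate approaches the CPU clock rate leaves almost no processing power for other work, which is the bottleneck the paragraph describes.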
Most network computer communication is accomplished with the aid of a layered software architecture for moving information between host computers connected to the network. The general functions of each layer are normally based on an international standard defined by the International Organization for Standardization (ISO), named the Open Systems Interconnection (OSI) network model. The OSI model sets forth seven processing layers through which information received by a host passes before being made presentable to an end user. Similarly, those seven processing layers may be passed in reverse order during transmission of information from a host to the network.
It is well known that networks may include, for instance, a high-speed bus such as an Ethernet connection or an internet connection between disparate local area networks (LANs), each of which includes multiple hosts or any of a variety of other known means for data transfer between hosts. According to the OSI standard, Physical layers are connected to the network at respective hosts, providing transmission and receipt of raw data bits via the network. A Data Link layer is serviced by the Physical layer of each host, the Data Link layers providing frame division and error correction to the data received from the Physical layers, as well as processing acknowledgment frames sent by the receiving host. A Network layer of each host, used primarily for controlling size and coordination of subnets of packets of data, is serviced by respective Data Link layers. A Transport layer is serviced by each Network layer, and a Session layer is serviced by each Transport layer within each host. Transport layers accept data from their respective Session layers, and split the data into smaller units for transmission to Transport layers of other hosts, each such Transport layer concatenating the data for presentation to respective Presentation layers. Session layers allow for enhanced communication control between the hosts. Presentation layers are serviced by their respective Session layers, the Presentation layers translating between data semantics and syntax which may be peculiar to each host and standardized structures of data representation. Compression and/or encryption of data may also be accomplished at the Presentation level. Application layers are serviced by respective Presentation layers, the Application layers translating between programs particular to individual hosts and standardized programs for presentation to either an application or an end user.
The rules and conventions for each layer are called the protocol of that layer, and since the protocols and general functions of each layer are roughly equivalent in various hosts, it is useful to think of communication occurring directly between identical layers of different hosts, even though these peer layers do not directly communicate without information transferring sequentially through each layer below. Each lower layer performs a service for the layer immediately above it to help with processing the communicated information. Each layer saves the information for processing and service to the next layer. Due to the multiplicity of hardware and software architectures, devices, and programs commonly employed, each layer is necessary to ensure that the data can make it to the intended destination in the appropriate form, regardless of variations in hardware and software that may intervene.
In preparing data for transmission from a first to a second host, some control data is added at each layer of the first host regarding the protocol of that layer, the control data being indistinguishable from the original (payload) data for all lower layers of that host. Thus an Application layer attaches an application header to the payload data, and sends the combined data to the Presentation layer of the sending host, which receives the combined data, operates on it, and adds a presentation header to the data, resulting in another combined data packet. The data resulting from combination of payload data, application header and presentation header is then passed to the Session layer, which performs required operations including attaching a session header to the data, and presenting the resulting combination of data to the transport layer. This process continues as the information moves to lower layers, with a transport header, network header and data link header and trailer attached to the data at each of those layers, with each step typically including data moving and copying, before sending the data as bit packets, over the network, to the second host.
The receiving host generally performs the reverse of the above-described process, beginning with receiving the bits from the network, as headers are removed and data processed in order from the lowest (Physical) layer to the highest (Application) layer before transmission to a destination of the receiving host. Each layer of the receiving host recognizes and manipulates only the headers associated with that layer, since, for that layer, the higher layer control data is included with and indistinguishable from the payload data. Multiple interrupts, valuable CPU processing time and repeated data copies may also be necessary for the receiving host to place the data in an appropriate form at its intended destination.
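The header handling described in the two preceding paragraphs can be sketched in a few lines. This is a toy model only: the textual header markers, the single Data Link trailer and the function names are assumptions made for illustration, and real protocol headers are binary structures defined by each protocol.

```python
# Toy model of layered encapsulation. Header contents are hypothetical markers.
# Layers are listed from highest to lowest; the Physical layer only moves bits.
LAYERS = ["application", "presentation", "session",
          "transport", "network", "datalink"]
TRAILER = b"</datalink>"  # the Data Link layer adds a trailer as well as a header


def header(layer: str) -> bytes:
    """Return the (hypothetical) header a layer prepends to its payload."""
    return f"<{layer}>".encode("ascii")


def encapsulate(payload: bytes) -> bytes:
    """Sending host: each layer wraps whatever it received from the layer above."""
    data = payload
    for layer in LAYERS:
        # A lower layer treats higher-layer headers as opaque payload.
        data = header(layer) + data  # each step copies the data
    return data + TRAILER


def decapsulate(frame: bytes) -> bytes:
    """Receiving host: strip headers from the lowest layer to the highest."""
    if not frame.endswith(TRAILER):
        raise ValueError("missing Data Link trailer")
    data = frame[:-len(TRAILER)]
    for layer in reversed(LAYERS):
        expected = header(layer)
        if not data.startswith(expected):
            raise ValueError(f"missing {layer} header")
        data = data[len(expected):]  # another copy per layer
    return data


if __name__ == "__main__":
    frame = encapsulate(b"hello")
    print(frame)
    print(decapsulate(frame))
```

Each layer in the sketch creates a new byte string, which mirrors the repeated data moving and copying that the text identifies as a cost on both the sending and the receiving host.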
A fuller description of layered protocol processing may be found in textbooks such as “Computer Networks”, Third Edition (1996) by Andrew S. Tanenbaum, which is incorporated herein by reference. As defined therein, a computer network is an interconnected collection of autonomous computers, such as internet and intranet devices, including local area networks (LANs), wide area networks (WANs), asynchronous transfer mode (ATM), ring or token ring, wired, wireless, satellite or other means for providing communication capability between separate processors. A computer is defined herein to include a device having both logic and memory functions for processing data, while computers or hosts connected to a network are said to be heterogeneous if they function according to different operating devices or communicate via different architectures.
As networks grow increasingly popular and the information communicated thereby becomes increasingly complex and copious, the need for such protocol processing has increased. It is estimated that a large fraction of the processing power of a host CPU may be devoted to controlling protocol processes, diminishing the ability of that CPU to perform other tasks. Network interface cards (NICs) have been developed to help with the lowest layers, such as the Physical and Data Link layers. It is also possible to increase protocol processing speed by simply adding more processing power or CPUs according to conventional arrangements. This solution, however, is both awkward and expensive. The complexities presented by various networks, protocols, architectures, operating devices and applications generally require extensive processing to afford communication capability between various network hosts.
The seven-layer OSI model is described schematically in FIG. 1. The seven layers are divided into two main groups: Lower Layers (Transport 106, Network 108, Data Link 110 and Physical 112) and Upper Layers (Application 100, Presentation 102 and Session 104). The initials in the parentheses of blocks 106, 108 and 110 are examples of protocols implemented in some systems in each particular layer. At present, the main protocols implemented in Network layer 108 are IP, Address Resolution Protocol (ARP) and Internet Control Message Protocol (ICMP). The main protocols implemented in Transport layer 106 are TCP and User Datagram Protocol (UDP). These protocols are referred to hereinafter by the common name of “TCP/IP” protocols. TCP is described in RFCs 793 and 1122, UDP is described in RFCs 768 and 1122, IP is described in RFCs 791 and 1122, ARP is described in RFCs 826 and 1042, and ICMP is described in RFCs 792 and 1122. These protocols were intended for use over low-bandwidth, low-reliability network connections, and they were designed to increase the reliability of the network traffic, guaranteeing delivery and correct sequencing of the data being sent by an application implemented above them.
There are several known initiatives to implement the Network and the Transport layer protocols (especially the TCP/IP protocols) in hardware. For simplicity, implementation of a layer protocol will be referred to hereafter as “implementation of a layer”. Two such initiatives are described in U.S. Pat. No. 6,434,620 “TCP/IP offload network interface device” and U.S. Pat. No. 6,591,302 “Fast-path apparatus for receiving data corresponding to a TCP connection”, both to Alacritech Inc. Both implementations make use of two data paths from the network to the application, a “slow path” and a “fast path”. These two paths cross two different implementations of the Network and Transport layers, as described in FIG. 11 of U.S. Pat. No. 6,434,620. The two implementations therein use respectively numbers 370 and 358 for the Transport layer, and 366 and 355 for the Network layer. However, the OSI model permits only one implementation of each Transport layer protocol (TCP in our case), because at the interface level between the Session layer and the Transport layer, the data received in the Transport layer from the Session layer includes an indication that specifies only the type of protocol. The Transport layer thus knows only the protocol type (TCP in our case) and lacks the information (found only in the Network layer) required to choose one of the two implementations of the protocol. Thus, the system described in the Alacritech patents is in conflict with the standard OSI model, and requires major changes in a system built based on the OSI model.
A typical implementation of the OSI model comprises hardware and software implemented protocols. FIG. 2 shows one such implementation schematically, again as a layer model. In FIG. 2, the seven layers marked 200-212 mirror the 100-112 marking of the same layers in FIG. 1. Protocols in layers 200-208 are implemented in a software section 250 and protocols in layers 210-212 are implemented in a hardware section 270. The hardware section comprises two NICs 260a and 260b. Each NIC comprises a hardware implemented Data Link layer (210a and 210b) and Physical layer (212a and 212b). Both NICs have the same functionality but can differ in the exact implementation. Two software drivers 214a and 214b couple between the software and the hardware sections.
FIG. 3 describes a common hardware implementation of the layer model described in FIG. 2. A CPU 350 performs the tasks of software section 250, i.e. the processing of the Lower Layers (Network and Transport) protocols and of the Upper Layers (Application, Presentation, and Session) protocols, as well as the function of drivers 214 (see FIGS. 1 and 2). Two NICs 360a and 360b perform the tasks of NICs 260a and 260b in FIG. 2, i.e. the processing of the Data Link layer and Physical layer protocols. A host bus 322 and a host bus bridge 320 are used to provide the connectivity between NICs 360a, 360b and CPU 350. The host bus may be any known bus, for example a PCI local bus as defined by the PCI-SIG group (http://www.pcisig.com). Each NIC is connected to a respective Ethernet network (324a, 324b) and allows communication between that network and all other elements of the system.
The implementation described in FIGS. 2 and 3 divides the protocol processing load between CPU 350 and NICs 360a and 360b. The processing power required to process the Network 208 and Transport 206 layer protocols is high and proportional to the network throughput, limiting the available processing power left in the CPU for the Upper Layers (200, 202 and 204 in FIG. 2) protocols, especially for the Application layer ones. This is an unacceptable disadvantage.
FIG. 4 represents a typical hardware implementation of a TCP/IP protocol. The seven layers are marked 400-412, mirroring the 100-112 and 200-212 marking of the same layers in, respectively, FIGS. 1 and 2. In this implementation, all Lower Layer protocols (layers 406-412), given a common number 401a, are implemented in hardware, and all Upper Layer protocols (400-404), given a common number 401b, are implemented in software. A software driver 440 (implemented at a different layer than 214 of FIG. 2) provides the connectivity between the software and the hardware implemented layer protocols. There are two data paths: a first, “transmit” or TX, data path 420 starting in Application layer 400 and passing through Upper Layer protocols 401b, a first connection 422, driver 440, a second connection 424, Transport layer 406, Network layer 408, and Data Link layer 410 to Physical layer 412; and a second, “receive” or RX, data path 430 starting in Physical layer 412 and passing through Data Link layer 410, Network layer 408, Transport layer 406, a third connection 434, driver 440, a fourth connection 436, and through Upper Layer protocols 401b, ending in Application layer 400. This implementation suffers from disadvantages described with reference to FIG. 5 below.
FIG. 5 illustrates the problem arising from having one system that comprises an Upper Layers software implementation section 500, and both a hardware implementation 502a of Transport and Network layer protocols, and a software implementation 502b of the same Transport and Network layer protocols. Hardware implementation 502a and software implementation 502b are connected to a first Physical layer 512a and a second Physical layer 512b through, respectively, a first Data Link layer 510a and a second Data Link layer 510b. Hardware implementation 502a, first Data Link layer 510a and first Physical layer 512a comprise a first hardware block 530. Second Data Link layer 510b and second Physical layer 512b comprise a second hardware block 560. A first driver 540 couples between first hardware block 530 and software implementation section 500. A second driver 514 couples between second hardware block 560 and software section 500 through the software implementation 502b of the Transport and Network layers. FIG. 5 clearly shows that there are two paths from the Upper Layer protocols (section 500) to the two Physical layers. A left path passes through connection 522a, driver 540 and layers 502a, 510a, and 512a, and a right path passes through connection 522b, software implementation 502b, driver 514 and Data Link layer 510b. However, the OSI model allows only one implementation of the Transport and Network layers, as shown in FIG. 1 (106 and 108) and FIG. 2 (206 and 208), while it is designed to allow implementation of multiple Data Link (210a, b) and Physical (212a, b) layers, as shown in FIG. 2. Thus, the implementation shown in FIG. 5 does not meet the OSI specification, and greatly complicates the system design. For example, this implementation requires a decision to be taken at the Session layer in section 500, choosing either the left path through connection 522a or the right path through connection 522b for data traffic towards a Physical layer. The Session layer does not have the information needed to make this decision since, according to the OSI model, such information is stored at the Network layer level. Also, having two separate implementations of the Transport and Network layers requires permanent synchronization of the databases of those two layers, in order to keep each implementation aware of decisions made by the other.
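The path-selection problem described for FIG. 5 can be illustrated with a minimal sketch. Everything in it is hypothetical: the routing table, the address prefixes, the path names and the function names are assumptions introduced only to show why a protocol type alone cannot select a path, while destination information held at the Network layer can.

```python
from ipaddress import ip_address, ip_network

# Hypothetical Network-layer routing table: destination prefix -> path name.
# The prefixes and their assignment to paths are invented for illustration.
ROUTES = {
    ip_network("10.0.0.0/8"): "left_path",       # e.g. through connection 522a
    ip_network("192.168.0.0/16"): "right_path",  # e.g. through connection 522b
}


def candidate_paths_at_session_layer(protocol: str) -> set[str]:
    """With only the protocol type known, every TCP path remains a candidate."""
    return set(ROUTES.values()) if protocol == "TCP" else set()


def path_at_network_layer(destination: str) -> str:
    """With the destination address known, exactly one path can be chosen."""
    address = ip_address(destination)
    for prefix, path in ROUTES.items():
        if address in prefix:
            return path
    raise LookupError(f"no route to {destination}")


if __name__ == "__main__":
    print(candidate_paths_at_session_layer("TCP"))  # both paths: ambiguous
    print(path_at_network_layer("10.1.2.3"))        # a single path
```

In the sketch, the ambiguity is resolved only by consulting the routing table, which is the kind of information the text places at the Network layer rather than at the Session layer.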
FIG. 6 describes in detail a flow chart of a hardware system 600 implementation of block 530 (minus the Physical layer) of FIG. 5. FIG. 6 clearly shows that there is only one transmit (TX) path 620 from the Session layer (not shown) to the Physical layer (not shown), which enters block 600 through a first connection 624, and passes through a Transport layer 606, a Network layer 608 and an internal (to block 600) Data Link layer 610, exiting block 600 through a second connection 629. FIG. 6 also clearly shows that there is only one receive (RX) path 630 from the Physical layer to the Session layer, which enters block 600 through a third connection 639, and passes through the internal Data Link, Network and Transport layers, exiting block 600 through a fourth connection 634. Packets processed by the Network layer are said to be sourced from (in the RX path) or directed to (in the TX path) the internal Data Link layer. This hardware configuration does not allow a second (external to block 600) Data Link layer to be connected, since there is only one pair of input/output connections (628/638) between the Network layer and the internal Data Link layer. This limits system 600 to the use of only one (the internal) Data Link layer, limiting the possible number of network connections. A similar problem appears between the Transport and the Network layers. The hardware implementation of the Network layer also lacks the flexibility of a software implementation of the same layer, making the system rigid when a protocol is modified (since the entire protocol is implemented in hardware). With a software implementation, for example, a designer may choose not to implement a specific option of the IP protocol in hardware, leaving this option to be handled by software running on the host CPU (350 in FIG. 3).
In view of the disadvantages of the hardware implementations above, there is a clear need for, and it would be advantageous to have, hardware implemented network acceleration platforms with enhanced functionality and flexibility, allowing adaptation to changes in existing and future protocols.