The present invention relates to offloading network communication functions from a host processor.
OSI Layers
The Open Systems Interconnection (OSI) model describes seven layers for a data communications network. This modularization allows the layers to be independently handled. When messages are sent across a network, headers for each layer encapsulate other layers and their headers. In the transmitting direction, each layer may add its own header. In the receiving direction, the appropriate header can be dealt with, then stripped off by one layer, which passes the remaining message to another layer. FIGS. 1 and 2 illustrate these layers and the protocols and hardware that operate at each layer.
1. Physical layer. This provides for the transmission of data, and handles the electrical and mechanical properties. A repeater functions at this layer.
2. Data Link layer. This layer controls the transmission of blocks of data between network peers over a physical link. A bridge functions at this layer. Ethernet is an example protocol.
3. Network layer. This layer routes data from one network node to others, using routing information and performing fragmentation and reassembly as needed. Routers function at this layer. Protocols include IP, X.25 and Frame Relay.
4. Transport layer. This layer provides flow control and error control. TCP and UDP are example protocols.
5. Session layer. This layer provides for applications to synchronize and manage their dialog and data exchange.
6. Presentation layer. This provides services that interpret the meaning of the information exchanged. An example protocol is XDR (eXternal Data Representation).
7. Application layer. This layer directly serves the end user. It includes applications such as file transfer and database access. Example protocols are FTP (File Transfer Protocol), NFS (Network File System), CIFS (Common Internet File System), HTTP (Hypertext Transfer Protocol), database query, SQL (Structured Query Language), and XML (Extensible Markup Language).
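The encapsulation described above, in which each layer adds its header on transmission and strips it on reception, can be illustrated with a minimal sketch. The bracketed text tags and the function names encapsulate and decapsulate are hypothetical and used only for illustration; actual headers carry addresses, ports, checksums and similar fields.

```python
# Hypothetical layer tags standing in for real protocol headers.
# Listed in the order in which headers are added on transmit.
LAYERS = ["transport", "network", "datalink"]


def encapsulate(payload: bytes) -> bytes:
    """Transmit direction: each layer prepends its own header."""
    frame = payload
    for name in LAYERS:
        header = f"[{name}]".encode("ascii")
        frame = header + frame
    return frame


def decapsulate(frame: bytes) -> bytes:
    """Receive direction: each layer checks its header, strips it,
    and passes the remaining message to the next layer up."""
    for name in reversed(LAYERS):
        header = f"[{name}]".encode("ascii")
        if not frame.startswith(header):
            raise ValueError(f"missing {name} header")
        frame = frame[len(header):]
    return frame
```

Under these assumptions, the outermost header on the wire belongs to the lowest layer shown, and decapsulate returns the original payload only if every expected header is present in order.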
Types of Storage
There are multiple ways that data and files can be accessed over a network. FIG. 3 illustrates some of these.
Direct Attached Storage (DAS). Direct attached storage is the term used to describe a storage device that is directly attached to a host system. The simplest example of DAS is the internal hard drive of a server computer, though storage devices housed in an external box come under this banner as well. Other computers on a network can access the data through communications with the host, which handles the communications in addition to its other processing tasks. For example, a disk drive attached to application server 12 in FIG. 3 would be DAS.
Network Attached Storage (NAS). Network Attached Storage is a server attached to a network and dedicated only to file sharing. NAS storage can be expanded by adding more servers, each attached to the network with its own IP address. NAS 14 is shown attached directly to a network through Ethernet switch 16.
Storage Area Network (SAN). A SAN is a subnetwork of storage devices that are connected to each other and to a server, or cluster of servers, which act as an access point to the SAN for clients on a main network. SAN storage is expanded by adding more disks to the subnetwork behind the same server. Storage switch 18 is an example of an access point to storage devices 20 and 22 on a subnetwork accessed through storage switch 18. For example, switch 18 could include a SAN controller, and storage 20 could be a RAID controller which accesses a group of disk drives.
RAID (Redundant Array of Independent Disks) is a system in which a group of disks is used together, with data being written across them redundantly or with error correction, providing fault tolerance that allows data recovery when one of the disks fails.
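As one illustration of how redundancy permits recovery from a single disk failure, the following sketch assumes a single-parity scheme in which one block per stripe holds the byte-wise XOR of the data blocks. This is only one of several RAID arrangements, and the function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data_blocks):
    """Return the data blocks followed by one parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def recover(stripe, failed_index):
    """Rebuild the block at failed_index from the surviving blocks.

    XOR of all surviving blocks (data and parity) equals the missing
    block, because every other block cancels itself out.
    """
    survivors = [b for i, b in enumerate(stripe) if i != failed_index]
    return xor_blocks(survivors)
```

For example, a stripe built from three data blocks can lose any one of its four blocks and still be reconstructed by calling recover with the index of the failed block.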
Storage Access Protocols
SCSI (Small Computer System Interface) is a parallel interface used for storage. It provides faster transmission rates than standard serial or parallel ports, and is used to connect computers to disk drives and printers. Many devices can be attached to a single SCSI port, so it is really an I/O bus.
There are two main standard protocols for storage access over a network, both of which use SCSI.
Fibre Channel (fiber spelled with an ‘re’) interconnects storage devices, allowing them to communicate at very high speeds and to be connected over much greater distances. SCSI commands are still used for the actual communication to the disk drives by the DAS, NAS or SAN server at the end of the fiber.
iSCSI (internet SCSI) encapsulates SCSI commands in an IP packet, allowing data to be transported to and from storage devices over a standard IP network.
Routing and Storage Access Equipment
Routers have been developed to route messages over a network to the appropriate destination. An example of a router is shown in 3COM U.S. Pat. No. 5,991,299.
Specialized network processors have been developed for the specialized flow control and routing of messages. An example of such a network processor is shown in IBM U.S. Pat. No. 6,460,120. Such a processor typically deals with the first three layers of the OSI model. A processor which accesses layers 4 and above for flow control, to make routing decisions based on quality of service, is shown in Top Layer Networks U.S. Pat. No. 6,430,184. This allows distinguishing between priority-based email and bandwidth-guarantee-based multimedia.
At the destination, and at the source, of network communications, the communication is handled by an ordinary computer or server with a general purpose processor. Communication is only one of the functions handled by the processor. With the increasing demands for file access over networks, handling the communication can take an unacceptable amount of the processor's time. An example structure at a host connected to a network is shown in FIG. 4.
Network Interface Cards (NICs) handle the layer 1 and layer 2 communication tasks for the end-point processor. NIC 24 is shown connected to the network for this function.
TCP/IP Offload Engines (TOEs) have recently been developed to handle the layer 3 and layer 4 communications for the processor, in particular handling the TCP/IP protocol stack. TOE 26 is shown in FIG. 4 between NIC 24 and host 28. An example is the TOE of Alacritech, Inc., such as described in U.S. Pat. No. 6,389,479.
In prior systems, the host processor would run a piece of software commonly referred to as the TCP/IP stack. TOE systems are able to offload this at an interface which requires minimal communication with the host. The host will configure the stack, by providing information such as the domain name, broadcast address, etc. The TOE will then handle establishment of network connections, data transmission and reception, error handling, and connection tear-down when a transmission is completed. Some TOEs, such as those by Alacritech, require the host to establish the network connection, then take over from there.
As shown in FIG. 4, TOE 26 deals with MAC header 29 and TCP-IP header 30, and strips them off from message 32. The message is then forwarded to host 28. In the opposite direction, a message from the host would have the TCP-IP and MAC headers added by TOE 26 for transmission through the network. FIG. 3 illustrates a number of examples of where a TOE could be placed in a network. The TOE processes up through layer 4 of the OSI protocol layers. The higher layers are not dealt with, although the categorization of data in fly-by sequencers, including session level and higher layers, is discussed in US Published Applications 2002/0091844 and 2001/0037406.
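The header stripping performed on a received message can be sketched in software as follows. This is an illustrative sketch only, not the implementation of any particular TOE, which would typically perform these steps in hardware or firmware. It assumes an Ethernet II frame carrying IPv4 and TCP, and it omits checksum verification, fragmentation and connection state. The function names strip_headers and build_test_frame are hypothetical.

```python
import struct

ETH_HEADER_LEN = 14      # destination MAC (6) + source MAC (6) + EtherType (2)
ETHERTYPE_IPV4 = 0x0800
IPPROTO_TCP = 6


def strip_headers(frame: bytes) -> bytes:
    """Remove the MAC, IPv4 and TCP headers and return the message."""
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETHERTYPE_IPV4:
        raise ValueError("not an IPv4 frame")

    ip_start = ETH_HEADER_LEN
    ip_header_len = (frame[ip_start] & 0x0F) * 4        # IHL in 32-bit words
    (total_length,) = struct.unpack_from("!H", frame, ip_start + 2)
    if frame[ip_start + 9] != IPPROTO_TCP:
        raise ValueError("not a TCP segment")

    tcp_start = ip_start + ip_header_len
    tcp_header_len = (frame[tcp_start + 12] >> 4) * 4   # data offset in words

    # Use the IP total length so that any link-layer padding is ignored.
    return frame[tcp_start + tcp_header_len : ip_start + total_length]


def build_test_frame(payload: bytes) -> bytes:
    """Build a minimal frame for demonstration (addresses and checksums zero)."""
    eth = bytes(6) + bytes(6) + struct.pack("!H", ETHERTYPE_IPV4)
    ip = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 40 + len(payload),
                     0, 0, 64, IPPROTO_TCP, 0, bytes(4), bytes(4))
    tcp = struct.pack("!HHIIBBHHH", 1234, 80, 0, 0, 5 << 4, 0, 0, 0, 0)
    return eth + ip + tcp + payload
```

In the transmit direction the same header layouts would be filled in and prepended to the host's message, which build_test_frame approximates for test purposes.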
Protocols for Accessing Files Over a Network
Accessing files over a network is accomplished using one of a number of protocols, such as File Transfer Protocol (FTP), NFS (Network File System), introduced by Sun Microsystems for sharing files between UNIX systems, and CIFS (Common Internet File System) introduced as a PC networking standard by Microsoft. CIFS was originally known as SMB (Server Message Block).
The commands for accessing data come from Remote Procedure Calls (RPC) from a client across the network, or the NetBIOS (Network Basic Input Output System), an application programming interface (API) on the host that augments its BIOS for network operations. The RPC commands let a remote client run a command on a host across the network.
The data is organized using meta data, which is like an index system for the data. Meta data indicates where the data came from, when it was created or modified, keywords describing the data contents, etc. Meta data can be organized in an External Data Representation (XDR), a presentation layer protocol, originally developed by Sun Microsystems, that allows the exchange of information between different systems and programming languages.
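To illustrate how XDR allows different systems to exchange such information, the following sketch encodes a small meta data record using the XDR conventions for unsigned integers (four bytes, big-endian) and strings (a four-byte length followed by the bytes, zero-padded to a multiple of four). The record layout of name, size and modification time is a hypothetical example and is not taken from any particular file system protocol.

```python
import struct


def xdr_uint(value: int) -> bytes:
    """Encode an unsigned 32-bit integer as 4 big-endian bytes."""
    return struct.pack(">I", value)


def xdr_string(text: str) -> bytes:
    """Encode a string: 4-byte length, the bytes, zero padding to a
    multiple of 4 bytes."""
    data = text.encode("ascii")
    padding = (4 - len(data) % 4) % 4
    return xdr_uint(len(data)) + data + bytes(padding)


def encode_metadata(name: str, size: int, mtime: int) -> bytes:
    """Encode a hypothetical (name, size, mtime) meta data record."""
    return xdr_string(name) + xdr_uint(size) + xdr_uint(mtime)
```

Because every field is aligned to four bytes and has a fixed byte order, a receiver on a different architecture or written in a different programming language can decode the record without knowing how the sender represents integers internally.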
One of the data structures that may be found in meta data is the inode (index node), which contains information about a file in UNIX systems. Inodes provide information such as user and group ownership, access mode (read, write, execute permissions), type (regular, directory, special, FIFO), file size, and a table of pointers to the disk blocks holding the data.
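Several of the inode fields listed above can be inspected from user space. The following sketch uses the standard Python os.stat call; the function name describe_inode and the dictionary keys are hypothetical. The pointers to the data blocks are internal to the file system and are not exposed through this interface.

```python
import os
import stat


def describe_inode(path):
    """Return selected inode information for the file at path."""
    info = os.stat(path)
    return {
        "inode": info.st_ino,                        # inode number
        "owner_uid": info.st_uid,                    # user ownership
        "group_gid": info.st_gid,                    # group ownership
        "mode": stat.filemode(info.st_mode),         # e.g. type and rwx bits
        "is_directory": stat.S_ISDIR(info.st_mode),  # file type test
        "size_bytes": info.st_size,                  # file size
    }
```

On systems other than UNIX, some of these fields, such as the ownership identifiers, may be reported as zero or otherwise carry less meaning.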
The present invention is intended to work with any of the above types of storage, and any communication standard, such as iSCSI or Fibre Channel.