FIG. 1 is a block diagram representation of an exemplary computer system that receives and transmits packets using a network connection. FIG. 1 shows the path taken by incoming network packets that arrive from the physical receive network link 1 and travel from the Network Interface Controller (NIC) 10 to their ultimate destination in the Computer Memory 20. FIG. 1 also shows the path taken by outgoing network packets that travel from the Computer Memory 20 to the physical transmit network link 2 through the Network Interface Controller (NIC) 10. The Network Interface Controller 10 is linked to the Computer Memory 20 through the Computer Interconnect Network 30. Access to the Computer Memory 20 is controlled by the Memory Controller 40. The computer also contains a Computer Processor Unit 45 on which the operating system and application software execute. The NIC 10, Memory Controller 40, Computer Interconnect Network 30, and Computer Processor Unit 45 functions may be partitioned and implemented in one or more integrated circuits.
FIG. 2(a) illustrates the components of an exemplary network packet which travels over the physical network and is processed by computer systems such as that shown in FIG. 1. The packet data is composed of an Application Payload 79 which is encapsulated in several layers of protocols in accordance with the well-known Open Systems Interconnection (OSI) networking model. The physical layer protocol is the outermost level of encapsulation; it introduces Physical Layer Overhead 71 that allows transmission of the packet data on various physical media. Examples of physical layer protocols are 8b/10b and 64b/66b encoding, the XAUI protocol and various scrambling methods. Data link layers, such as Ethernet, are the second level of encapsulation; the data link protocol layer is usually composed of the Data Link Protocol Header and Optional Digest 72 as well as an optional Data Link Layer Digest 73 that is used to detect transmission errors. Network protocol layers, such as the Internet Protocol (IP), are the third level of encapsulation; the network protocol layer is usually composed of the Network Layer Header and Optional Digest 74 as well as an optional Network Layer Digest 75. Transport protocol layers, such as the Transmission Control Protocol (TCP), are the fourth level of encapsulation; the transport protocol layer is usually composed of the Transport Layer Header and Optional Digest 76 as well as an optional Transport Layer Digest 77. FIG. 2(b) shows an exemplary Application Payload 79 which is composed of a Data segment that can be optionally encapsulated within one or more Upper Layer Protocol Headers and optional Digests 78.
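The layered encapsulation described above can be illustrated with a minimal sketch that locates the Application Payload 79 inside a received frame. The sketch assumes an untagged Ethernet frame carrying IPv4 and TCP, with the Physical Layer Overhead 71 and any trailing data link digest already removed; the function name `payload_offset` is hypothetical and is not taken from the figures.

```python
import struct

ETH_HEADER_LEN = 14  # destination MAC (6) + source MAC (6) + EtherType (2)
ETHERTYPE_IPV4 = 0x0800
IP_PROTO_TCP = 6


def payload_offset(frame: bytes) -> int:
    """Return the byte offset of the application payload.

    Assumes an untagged Ethernet II frame carrying IPv4 and TCP.
    """
    ethertype = struct.unpack_from("!H", frame, 12)[0]
    if ethertype != ETHERTYPE_IPV4:
        raise ValueError("frame does not carry IPv4")
    ip_start = ETH_HEADER_LEN
    # Low nibble of the first IPv4 byte is the header length in 32-bit words.
    ip_header_len = (frame[ip_start] & 0x0F) * 4
    if frame[ip_start + 9] != IP_PROTO_TCP:
        raise ValueError("IPv4 packet does not carry TCP")
    tcp_start = ip_start + ip_header_len
    # High nibble of TCP byte 12 is the data offset in 32-bit words.
    tcp_header_len = (frame[tcp_start + 12] >> 4) * 4
    return tcp_start + tcp_header_len
```

Each layer's header states its own length, so the payload position is found by walking the headers from the outermost layer inward.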
A typical NIC 10 contains a Packet Transmit Function 17 which allows encapsulation and transmission of outgoing packets onto the physical transmit network link 2.
The present invention pertains to the process of receiving packets from the physical receive network link 1. Examining the typical packet receive functional blocks of a Network Interface Controller (NIC) 10, packets arriving from the physical receive network link 1 are first processed by the Receive Physical Layer Block (PHY) 11 which inspects, checks and removes the physical layer protocol overhead 71. Once an arriving packet is extracted from its physical encapsulation protocol layer, it is forwarded to the Media Access Controller (MAC) block 12 which checks and validates the data link layer protocol through examination of the protocol header 72 and digest 73 in relation to the packet data. The MAC Block 12 can also validate the integrity of the network and transport protocol layers through examination of the protocol headers (74 and 76) and digests (75 and 77) in relation to the packet data. The MAC 12 propagates the packet, and the results of those validations, to the Packet Classifier (PC) 14. The MAC 12 may optionally discard packets without forwarding them to the Packet Classifier (PC) 14 when errors are found in one or more protocol layers. For example, in a network using Ethernet for the data link layer, the Internet Protocol (IP) for the network layer and the Transmission Control Protocol (TCP) for the transport layer, the MAC Block 12 may perform header and digest validation, and protocol checks, for the Ethernet, IP, and TCP protocols.
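As one concrete instance of the digest validation mentioned above, the following sketch checks the header checksum of an IPv4 header using the standard Internet checksum (one's-complement sum of 16-bit words). It is an illustration only and does not represent the logic of any particular MAC Block 12; the function names are hypothetical.

```python
def internet_checksum(data: bytes) -> int:
    """One's-complement sum of 16-bit big-endian words, then inverted."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold any carries above 16 bits back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def ipv4_header_is_valid(header: bytes) -> bool:
    """A header whose checksum field is correct sums to zero."""
    return internet_checksum(header) == 0
```

A validation result of this kind is what the MAC 12 would forward alongside the packet, or use as grounds for discarding it.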
The Packet Classifier (PC) 14 block inspects arriving packets and determines whether the packets are to be maintained, quarantined or discarded based on a set of user-defined rules pertaining to packet length, type, validity of the different protocol headers, and content of the protocol headers for data link, network, transport and upper level protocol layers. Application of these rules represents a firewall function. Quarantined packets are identified by attaching a special tag to them and/or by placing them into a special quarantine queue within the Packet Data Buffer 13.
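The rule-based decision described above might be sketched as follows. The specific rules (a maximum length of 1518 bytes and a quarantined destination port) and all names are assumptions chosen for illustration; the user-defined rules of an actual Packet Classifier 14 are not specified here.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    KEEP = "keep"
    QUARANTINE = "quarantine"
    DISCARD = "discard"


@dataclass
class PacketInfo:
    length: int          # packet length in bytes
    headers_valid: bool  # result of the protocol validations
    dst_port: int        # example of header content used by a rule


def classify(pkt: PacketInfo, max_length: int = 1518,
             quarantined_ports=frozenset({23})) -> Verdict:
    """Apply hypothetical user-defined rules in priority order."""
    if not pkt.headers_valid or pkt.length > max_length:
        return Verdict.DISCARD
    if pkt.dst_port in quarantined_ports:
        return Verdict.QUARANTINE
    return Verdict.KEEP
```

A quarantine verdict would then translate into the tag or the dedicated quarantine queue mentioned above.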
Each arriving packet that is not discarded by the MAC 12 or Packet Classifier 14 is stored, along with the result of the classification and protocol validation operations performed on the packet, in the Packet Data Buffer (PDB) 13. The Packet Data Buffer 13 may function as a single queue that stores and forwards all incoming packets in a first-in first-out manner. Alternatively, the Packet Data Buffer 13 may be subdivided into multiple queues in which arriving packets are inserted using one or more packet classification methods. In the case of the Packet Data Buffer 13 being subdivided into multiple queues, each of these queues forwards its specific packets in a first-in first-out manner without regard to other queues.
In the case of Packet Data Buffer 13 being subdivided into multiple queues, the Packet Classifier 14 block determines the target queue within the Packet Data Buffer 13 for each arriving packet based on items such as its packet length, type, validity of the different protocol headers, and the content of the protocol headers for data link, network, transport and upper level protocol layers. The result of this classification determines the specific queue in which the arriving packet is to be written within the Packet Data Buffer 13.
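One way to picture the multi-queue arrangement is the sketch below, in which the target queue is chosen by hashing a flow key built from header fields. Hashing is only one possible classification method and is an assumption here, as are the class and method names.

```python
import zlib
from collections import deque


class PacketDataBuffer:
    """Hypothetical model of a buffer subdivided into FIFO queues."""

    def __init__(self, num_queues: int):
        self.queues = [deque() for _ in range(num_queues)]

    def select_queue(self, flow_key: bytes) -> int:
        # Packets with the same flow key always map to the same queue.
        return zlib.crc32(flow_key) % len(self.queues)

    def enqueue(self, flow_key: bytes, packet: bytes) -> int:
        index = self.select_queue(flow_key)
        self.queues[index].append(packet)
        return index

    def dequeue(self, index: int) -> bytes:
        # Each queue is first-in first-out, independent of the others.
        return self.queues[index].popleft()
```

Keeping one flow in one queue preserves packet order within that flow while letting the queues drain independently.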
The Packet Buffer Manager 16 block manages the process of writing packets into and reading packets from the Packet Data Buffer 13, including monitoring the level of buffer fill and taking appropriate measures, such as selectively discarding or accepting packets destined for the Packet Data Buffer 13, to maintain an appropriate fill level. In the case of the Packet Data Buffer 13 being subdivided into multiple queues, the Packet Buffer Manager 16 manages the process of writing packets into and reading packets from each queue within the Packet Data Buffer 13, including monitoring the fill level of each queue and taking appropriate measures, such as selectively discarding or accepting packets destined for each of the queues, to maintain an appropriate fill level.
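Fill-level management of a single queue can be sketched as below. The policy shown, rejecting any packet that would exceed a fixed byte capacity (often called tail drop), is an assumed example of "selectively discarding"; the class name and counters are hypothetical.

```python
from collections import deque


class BoundedQueue:
    """Hypothetical queue that tracks its fill level in bytes."""

    def __init__(self, capacity_bytes: int):
        self.capacity_bytes = capacity_bytes
        self.fill_bytes = 0
        self.dropped = 0
        self._packets = deque()

    def offer(self, packet: bytes) -> bool:
        """Accept the packet if it fits, otherwise discard it."""
        if self.fill_bytes + len(packet) > self.capacity_bytes:
            self.dropped += 1
            return False
        self._packets.append(packet)
        self.fill_bytes += len(packet)
        return True

    def take(self) -> bytes:
        """Remove the oldest packet and release its space."""
        packet = self._packets.popleft()
        self.fill_bytes -= len(packet)
        return packet
```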
Packets residing in the Packet Data Buffer 13 are transferred to the Computer Memory 20 through a cooperative effort between the software entities depicted in FIG. 3 and the NIC 10. Software layers existing in the Kernel Space Software Layer (KSSL) 50 of the operating system are the NIC Driver 51, Network Stack 52, and Socket Layer 53. One or more User Application 61 software layers exist in the User Space Software Layer (USSL) 60 of the operating system. A User Application 61 is the software layer which is the final processing destination of any given Application Payload 79. Within the Computer Memory 20 there exists a plurality of Kernel Space Buffers 54 for use by the different software layers within the KSSL 50 and a plurality of User Space Buffers 62 for use by user space software layers such as User Application 61 software layers. To facilitate the transfer of arriving packets from the NIC 10 to the Computer Memory 20, the NIC Driver 51 reserves a plurality of Kernel Space Buffers 54 in which the NIC 10 can write the contents of the arriving packets. All Kernel Space Buffers 54 which are reserved by the NIC Driver 51 and in which the NIC 10 can write packets will also be referred to herein as Kernel Packet Buffers (KPB) 59. If an arriving packet length is larger than the size of the associated Kernel Packet Buffer (KPB) 59, the NIC 10 divides the packet up into smaller data segments that match the size of the available KPBs 59 and then writes the packet segments into several different KPBs 59. KPBs 59 may reside in contiguous memory addresses or may be located at disparate locations in the Computer Memory 20.
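The division of an oversized packet across several KPBs 59 can be sketched as follows. The function name and the choice to raise an error when the supplied buffers are too small are assumptions for illustration.

```python
from typing import List, Sequence


def split_into_kpbs(packet: bytes, kpb_sizes: Sequence[int]) -> List[bytes]:
    """Cut a packet into segments that fit the given buffer sizes, in order."""
    segments = []
    offset = 0
    for size in kpb_sizes:
        if offset >= len(packet):
            break  # the whole packet has been placed
        segments.append(packet[offset:offset + size])
        offset += size
    if offset < len(packet):
        raise ValueError("not enough KPB space for packet")
    return segments
```

Concatenating the segments in order reproduces the original packet, which is what later software layers rely on when a packet spans several buffers.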
The NIC Driver 51 informs the NIC 10 of the starting address and length of each available KPB 59 through a Buffer Descriptor Data Structure (BDDS) 55. FIG. 4(a) shows a typical organization for a Buffer Descriptor Data Structure 55. Within the Buffer Descriptor Data Structure 55, the NIC Driver 51 enters the starting address and length of one Kernel Packet Buffer 59 in each of the Kernel Buffer Fields 70. Each Buffer Descriptor Data Structure (BDDS) 55 may contain one or more Kernel Buffer Fields 70. Within the Buffer Descriptor Data Structure 55, the NIC Driver 51 also enters the Descriptor Ready Flag 57 to indicate that the Descriptor is ready for use by the NIC 10, as well as NIC Driver Private Data 70 that the NIC Driver 51 requires for future use, such as determining the virtual addresses of each KPB 59 within the BDDS 55. Each Buffer Descriptor Data Structure 55 contains a Packet Information Area 56 which is left empty by the NIC Driver 51. During the course of its operation, the NIC 10 fills the Packet Information Area 56 with information specific to the incoming packet associated with the BDDS 55, such as the packet length, packet header types, results of protocol validation and classification, and all other information required by software layers within the Kernel Space Software Layer 50 to process the packet.
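The fields of a Buffer Descriptor Data Structure 55 can be modeled roughly as below. This is a sketch of the logical content only: the field names, the boolean encoding of the Descriptor Ready Flag 57, and the dictionary form of the private data are assumptions, and no actual memory layout is implied.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class BufferDescriptor:
    # Kernel Buffer Fields: (starting address, length) of each KPB.
    kernel_buffer_fields: List[Tuple[int, int]]
    # Descriptor Ready Flag: True means the NIC may use this descriptor.
    ready_for_nic: bool = False
    # Private data the driver keeps for its own later use.
    driver_private: Dict[str, int] = field(default_factory=dict)
    # Packet Information Area: left empty by the driver, filled by the NIC.
    packet_info: Dict[str, int] = field(default_factory=dict)


def prepare_descriptor(kpbs: List[Tuple[int, int]],
                       virtual_base: int) -> BufferDescriptor:
    """Driver side: list the buffers, record private data, mark ready."""
    desc = BufferDescriptor(kernel_buffer_fields=list(kpbs))
    desc.driver_private["virtual_base"] = virtual_base
    desc.ready_for_nic = True
    return desc
```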
Buffer Descriptor Data Structures 55 are conveyed to the NIC 10 by writing one or more BDDSs 55 into a Kernel Space Buffer 54 as depicted in FIG. 4(b). Herein, a Kernel Space Buffer 54 that contains Buffer Descriptor Data Structures 55 is referred to as a Kernel Descriptor Buffer (KDB) 80. KDBs 80 may reside in contiguous memory addresses or may be located at disparate locations in the Computer Memory 20. If KDBs 80 reside in contiguous memory addresses, the NIC Driver 51 provides the address of the first KDB 80 and the length of the contiguous memory space occupied by KDBs 80 to the Direct Memory Access Engine (DMA Engine) 15 located within the NIC 10. If the KDBs 80 reside in disparate memory addresses, each KDB 80 must contain the otherwise optional Next KDB Pointer 81, which points to the address of the next KDB 80. In this case the NIC Driver 51 provides the address of the first KDB 80 to the Direct Memory Access Engine (DMA Engine) 15 located within the NIC 10. The Direct Memory Access Engine 15 then uses the Next KDB Pointer 81 to locate the next KDB 80. In the case where the NIC 10 stores the arriving packets in a plurality of queues, the NIC Driver 51 provides a unique set of Buffer Descriptor Data Structures 55 for each of these queues. Each of these unique sets of Buffer Descriptor Data Structures 55 is contained within a unique set of KDBs 80.
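For KDBs 80 located at disparate addresses, following the Next KDB Pointer 81 amounts to walking a linked list. The sketch below models the Computer Memory 20 as a dictionary from address to KDB contents; that model, the use of `None` as an end marker, and the guard against revisiting an address are all assumptions made for illustration.

```python
from typing import Dict, Iterator, List, Optional, Tuple

# A KDB is modeled as (list of descriptors, Next KDB Pointer or None).
Kdb = Tuple[List[str], Optional[int]]


def walk_kdbs(memory: Dict[int, Kdb], first_address: int) -> Iterator[str]:
    """Yield descriptors from each KDB, following next pointers in order."""
    address: Optional[int] = first_address
    visited = set()
    while address is not None and address not in visited:
        visited.add(address)  # stop if the chain loops back on itself
        descriptors, next_pointer = memory[address]
        yield from descriptors
        address = next_pointer
```

Only the first address needs to be supplied up front; every later KDB is discovered from the one before it.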
The availability of valid Buffer Descriptor Data Structures 55 can be conveyed to the NIC 10 in many ways that those skilled in the art will appreciate. Exemplary methods to achieve this include having the Kernel Space Software Layer 50 write a message, or a fragment of a KDB 80, or the address of the KDB 80 to a doorbell register or memory within the NIC 10. Alternative exemplary methods to achieve this include having the NIC 10 poll the KDB 80 address provided by the Kernel Space Software Layer 50 or a special memory location within the Computer Memory 20 that is used for communication between the NIC 10 and the Kernel Space Software Layer 50.
To transfer the arriving packets from the NIC 10 to the KPBs 59, the DMA Engine 15 uses the available KDB 80 address for each of the queues within the Packet Data Buffer 13 to read one or more Buffer Descriptor Data Structures 55 for each of the Packet Data Buffer 13 queues from the Computer Memory 20. The DMA Engine 15 stores a local copy of each Buffer Descriptor Data Structure 55 and modifies the content of this local copy as part of its normal operation. The DMA Engine 15 can only process Buffer Descriptor Data Structures 55 that possess a valid Descriptor Ready Flag 57. If a BDDS 55 with an invalid Descriptor Ready Flag 57 is encountered, the DMA Engine 15 will not process this BDDS 55 until a subsequent time when its Descriptor Ready Flag 57 is found to be in the valid state. For each BDDS 55 with a valid Descriptor Ready Flag 57, the DMA Engine 15 queries the Packet Buffer Manager 16 block to ascertain the availability of packets in the Packet Data Buffer 13. The Packet Buffer Manager 16 reads each available packet from the Packet Data Buffer 13 and passes it to the DMA Engine 15, which writes it into one or more of the KPBs 59 listed in the Kernel Buffer Fields 70 of the valid BDDS 55 within the Computer Memory 20. The DMA Engine 15 modifies the local copy of the Buffer Descriptor Data Structure 55 associated with each packet that is written to the Computer Memory 20, filling in the Packet Information Area 56 with packet information such as packet length, packet header types, results of protocol validation and classification, and all other information required by the software layers within the KSSL 50 to process the packet. The DMA Engine 15 then modifies the Descriptor Ready Flag 57 to indicate that the BDDS 55 is ready for processing by the NIC Driver 51 software layer. Finally, the DMA Engine 15 writes back the modified Buffer Descriptor Data Structure 55 into its original location within its KDB 80 in the Computer Memory 20.
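The sequence of steps performed by the DMA Engine 15 for one packet can be sketched as below. The descriptor is modeled as a dictionary, the Descriptor Ready Flag 57 as an `"owner"` field taking the values `"nic"` or `"driver"`, and the Computer Memory 20 as a `bytearray`; these representations and the function name are assumptions, not a description of actual hardware.

```python
import copy


def dma_receive(descriptor: dict, packet: bytes, memory: bytearray) -> bool:
    """Write one packet into the listed buffers and hand the descriptor back.

    Returns False, leaving everything untouched, if the descriptor is not
    marked as ready for the NIC.
    """
    if descriptor["owner"] != "nic":
        return False  # flag not valid; try again later
    local = copy.deepcopy(descriptor)  # the engine works on a local copy
    if sum(length for _, length in local["buffers"]) < len(packet):
        raise ValueError("packet larger than the buffers in this descriptor")
    offset = 0
    for address, length in local["buffers"]:
        chunk = packet[offset:offset + length]
        memory[address:address + len(chunk)] = chunk
        offset += len(chunk)
        if offset == len(packet):
            break
    local["packet_info"] = {"length": len(packet)}  # fill the info area
    local["owner"] = "driver"  # flip the flag for the driver
    descriptor.clear()
    descriptor.update(local)  # write the modified copy back
    return True
```

The flag is changed only after the packet data and the packet information have been written, so the driver never sees a descriptor that points at incomplete data.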
Once one or more modified Buffer Descriptor Data Structure(s) 55 are written back to the Computer Memory 20, the NIC 10 will interrupt the operating system to inform it of the availability of new packets in the KPBs 59.
In the case where multiple packet queues exist within the Packet Data Buffer 13, the DMA Engine 15 and the Packet Buffer Manager 16 arbitrate between all queues of the Packet Data Buffer 13 such that they are all serviced in a fair manner.
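Round-robin service is one simple way to arbitrate fairly between queues; the text does not state which arbitration scheme is used, so the following is only an assumed example with a hypothetical function name.

```python
from collections import deque
from typing import List, Tuple


def round_robin_drain(queues: List[deque]) -> List[Tuple[int, object]]:
    """Serve one packet per non-empty queue in turn until all are empty."""
    order = []
    pending = deque(i for i, q in enumerate(queues) if q)
    while pending:
        i = pending.popleft()
        order.append((i, queues[i].popleft()))
        if queues[i]:
            pending.append(i)  # queue still has packets; rejoin the rotation
    return order
```

With this scheme a heavily loaded queue cannot starve a lightly loaded one, since each non-empty queue is visited once per rotation.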
As more packets arrive, the DMA Engine 15 reads the next group of Buffer Descriptor Data Structures 55 for each of the queues in the Packet Data Buffer 13 from the Computer Memory 20. The DMA Engine 15 continues to access successive available packets from the Packet Data Buffer 13 and to write them into one or more available KPBs 59 in the Computer Memory 20. If the DMA Engine 15 encounters a Buffer Descriptor Data Structure 55 with an invalid Descriptor Ready Flag 57, it continues to poll that Buffer Descriptor Data Structure 55 by reading the same location of the Kernel Descriptor Buffers (KDB) 80 until it reads a Buffer Descriptor Data Structure 55 with a valid Descriptor Ready Flag 57.
The operating system routes the NIC's interrupt to the NIC Driver 51 software layer. This layer examines the KDBs 80 for each of the packet queues and identifies all newly arrived packets as represented by their Buffer Descriptor Data Structures 55 and corresponding KPBs 59. For each received packet, the NIC Driver 51 transfers the corresponding information, such as the Packet Information Area 56 and pointers to the Kernel Packet Buffer(s) 59 that hold the packet, to the Network Stack 52. The Network Stack 52 software layer is composed of several constituent protocol sub-stacks. Each sub-stack processes and manages a specific network layer and its corresponding protocol. For example, to process packets carrying Ethernet, IP and TCP protocols as the data link, network and transport protocols, respectively, the Network Stack 52 software layer will be made up of intercommunicating sub-stacks where one of the sub-stacks processes the Ethernet protocol, another sub-stack processes the IP protocol and yet another sub-stack processes the TCP protocol. Additional protocol sub-stacks may also be present within the Network Stack 52 layer to process additional upper level protocol header and digests 78. For example, an iSCSI sub-stack may be present within the Network Stack 52 to further process an incoming packet prior to transferring the payload to a User Application 61 software layer. The Network Stack 52 software layer operates on the packet data present in the Kernel Packet Buffer(s) 59 ensuring that the data is valid and removing the packet overhead (headers and optional digests). In doing so, each sub-stack chooses to accept part or all of the packet data. 
Upon completing its execution, the Network Stack 52 software layer transfers to the Socket Layer 53 one or more pointers to the valid portions of the Application Payload 79 data present within the Kernel Packet Buffer(s) 59 as well as any information required by the Socket Layer 53 to associate the valid portions of the Application Payload 79 to a specific User Application 61 and its User Space Buffer(s) 62.
User Application 61 software layers are not usually allowed to access Kernel Space Buffers 54, such as KPBs 59, and therefore cannot access Application Payload 79 data that reside in the Kernel Packet Buffer(s) 59. The Socket Layer 53 must therefore copy the Application Payload 79 data from the Kernel Packet Buffer(s) 59 to the User Space Buffer(s) 62 of a User Application 61 software layer. The Socket Layer 53, which executes on the Computer Processor Unit 45, reads the valid portions of the Application Payload 79 data from the Kernel Packet Buffer(s) 59 within the Computer Memory 20 to the Computer Processor Unit 45 and then writes the same data to the appropriate User Space Buffer 62, which also resides in the Computer Memory 20. These data moving actions constitute a “copy” operation in which data is copied from one location in the Computer Memory 20 to a different location in the Computer Memory 20. This copy operation is undesirable because it consumes bandwidth of the Computer Memory 20 and of the Computer Interconnect Network 30, which moves the data between the Computer Memory 20 and the Computer Processor Unit 45. Another undesirable trait of the copy operation is that it also consumes some of the available processing cycles of the Computer Processor Unit 45. In some prior art, a Copy Engine 31 is placed within the Computer Interconnect Network 30 to relieve the Computer Processor Unit 45 from having to perform the copy operation itself. In that prior art, the software running on the Computer Processor Unit 45 instructs the Copy Engine 31 to read the valid portion of the Application Payload 79 data from the Kernel Packet Buffer(s) 59 and then write the same data to the appropriate User Space Buffer 62 without the direct involvement of the Computer Processor Unit 45.
While this prior art method reduces the impact of the copy operation on the Computer Processor Unit 45, it does not address or solve the problem of bandwidth usage of the Computer Memory 20 during the copy operation.
In the prior art, the Application Payload 79 traverses the Memory Controller 40 three times: first when it is written into the Kernel Packet Buffers 59 by the DMA Engine 15, second when it is read by the different software layers within the Kernel Space Software Layer 50, and third when it is written into the User Space Buffer(s) 62. Therefore, a need remains in the art for a method and apparatus that reduces or eliminates this three-fold usage of Computer Memory 20 bandwidth associated with writing the arriving Application Payload 79 data into the corresponding User Space Buffer(s) 62.
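The three-fold bandwidth usage can be made concrete with a small accounting sketch. It counts only the three payload traversals named above and ignores headers, descriptors and cache effects; the function and key names are hypothetical.

```python
def receive_path_memory_traffic(payload_bytes: int) -> dict:
    """Bytes crossing the memory controller per received payload."""
    traffic = {
        "dma_write_to_kpb": payload_bytes,        # NIC writes kernel buffer
        "read_from_kpb": payload_bytes,           # kernel software reads it
        "write_to_user_buffer": payload_bytes,    # copy lands in user buffer
    }
    traffic["total"] = sum(traffic.values())
    return traffic
```

For example, a 1500-byte payload accounts for 4500 bytes of memory traffic under this model, so a sustained payload rate of R bytes per second consumes roughly 3R bytes per second of Computer Memory 20 bandwidth.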