An ATM network is primarily made up of switches and endpoints. Switches are used to move 53-byte ATM cells throughout the network, guiding them from one endpoint to another. Each endpoint is responsible for reassembling the ATM cells into packets of information that can be understood by its host machine. Only the 48-byte payload of the ATM cells is needed to form packets. The 5-byte header is generally processed by the network interface card (NIC) and discarded. See FIG. 1.
Historically, the actual reassembly took place either on the NIC or within the host machine itself. In systems where the packet is fully reassembled in host memory, the ATM NIC would write every cell to host memory as it arrived from the network. This has the advantage of requiring only small amounts of buffer memory on the NIC. However, it has the disadvantage of overburdening the host interface bus and memory subsystem with small transfers. In systems where the packet is fully reassembled on the NIC, the problem of inefficient use of host resources is solved but the buffer requirements on the NIC become overwhelming. This is especially evident given that the maximum size of an ATM packet is 64 kilobytes. If a NIC is to support 4096 or more active connections, the worst case, in which every connection holds a maximum-size packet at the same time, calls for 256 megabytes or more of buffer memory on the card, which is unreasonable.
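The worst-case buffer requirement for full on-card reassembly can be checked with a short calculation. The following is a minimal sketch using the figures given above; the function name is illustrative only and does not come from the original design.

```python
# Worst-case NIC buffer memory for full on-card reassembly.
# The two constants are the figures quoted in the text; the function name is illustrative.
MAX_PACKET_BYTES = 64 * 1024   # maximum ATM packet size cited above
ACTIVE_CONNECTIONS = 4096      # number of connections cited above


def worst_case_reassembly_bytes(connections, max_packet_bytes=MAX_PACKET_BYTES):
    """Memory needed if every connection holds one maximum-size packet at once."""
    return connections * max_packet_bytes


total = worst_case_reassembly_bytes(ACTIVE_CONNECTIONS)
print(f"{total} bytes = {total // (1024 * 1024)} MiB")
```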
In order to balance the need for host efficiency with the cost and space limitations put on ATM NICs, a method of partial reassembly was developed.
The present invention overcomes the problems associated with large host DMA read latencies when trying to segment ATM packets at high line rates. In addition, it addresses the problems of choosing when to request packet segments and how to manage the card buffers for each connection.
The present invention addresses two conflicting goals in designing an ATM network interface card: 1) the requirement to conform to a given traffic contract on the ATM network, and 2) the desirability of fetching large blocks of packet data from host memory, where "large" is greater than the fixed ATM cell payload size of 48 bytes.
Goal #1 comes from a requirement that data cells which are injected by the NIC into the ATM network must conform to one or more parameters. For example, the time interval between cells may be required to meet a specified minimum time. Cells which violate the parameters may be discarded by elements in the ATM network, resulting in loss of data at the receiving ATM NIC.
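As an illustration of the minimum-interval parameter mentioned above, the sketch below flags cells that are sent too soon after their predecessor. It is a deliberately simplified check: real ATM networks typically police traffic with the Generic Cell Rate Algorithm, which is not modeled here. The function name and the sample times are hypothetical.

```python
def contract_violations(send_times_us, min_interval_us):
    """Return indices of cells sent less than min_interval_us after the previous cell.

    Simplified stand-in for traffic policing; the only parameter checked is
    the minimum spacing between consecutive cells.
    """
    return [
        i
        for i in range(1, len(send_times_us))
        if send_times_us[i] - send_times_us[i - 1] < min_interval_us
    ]


# Hypothetical send times in microseconds; the third cell follows too closely.
print(contract_violations([0, 10, 15, 25], min_interval_us=10))
```

A cell reported by such a check is one that a network element would be entitled to discard.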
Goal #2 results from the design of host memory systems. Current host memory systems are most efficient when two goals are achieved. The first goal is to overcome the "first data latency" problem. The problem is that regardless of the size of a read transfer of contiguous bytes from host memory, there is a certain fixed time delay associated with the transfer. For transfers which are relatively short (less than 32 bytes), the first data latency may be comparable to or greater than the time to transfer the actual data. This results in a utilization of the host memory bandwidth which is less than fifty percent. By increasing the size of the data transfer, the first data latency is amortized over a longer data transfer interval. The burst size after which the first data latency is no longer significant is referred to as the optimal burst size (OBS).
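The amortization of first data latency can be illustrated with a simple model in which utilization is the data transfer time divided by the total time (fixed latency plus transfer time). This is a sketch only: the latency of 256 ns and the bus rate of 0.125 bytes per ns are assumed values chosen for illustration, not figures from any particular host.

```python
def bus_utilization(burst_bytes, first_data_latency_ns, bytes_per_ns):
    """Fraction of the total read time spent actually moving data."""
    transfer_ns = burst_bytes / bytes_per_ns
    return transfer_ns / (first_data_latency_ns + transfer_ns)


# Assumed, illustrative host parameters (not taken from the text).
LATENCY_NS = 256.0
BYTES_PER_NS = 0.125

for burst in (16, 32, 48, 128, 256, 512):
    u = bus_utilization(burst, LATENCY_NS, BYTES_PER_NS)
    print(f"{burst:4d}-byte burst: {u:.0%} utilization")
```

Under these assumptions a 32-byte burst spends as long waiting as transferring, and utilization rises steadily as the burst grows, which is the behavior the OBS is meant to capture.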
The second issue for host memory systems is memory system cache line size and alignment. A host memory system has an ideal minimum size and alignment (IMSA) for data transfer which is a power-of-two in bytes. A typical value of IMSA is 32 bytes, but can range up to 64 or 128 bytes or greater. Not only is there a performance penalty if at least IMSA bytes are not fetched, but the beginning of the transfer must be aligned to an address which is equal to "IMSA*n", where "n" is an integer.
The first conflict arises due to the traffic contract and optimal burst size. If the OBS is greater than a cell payload (48 bytes), then it is most efficient for the ATM NIC to transfer more than one cell from host memory at a time. However, the traffic contract for the connection may prevent the NIC from sending the cells sequentially on the ATM network. The second conflict arises when the ideal minimum size and alignment is 32 bytes or greater. Since an ATM cell payload is only 48 bytes, at least half of all cell payload transfers from host memory will not be properly aligned for maximum efficiency.
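The "at least half" figure can be verified by stepping through a packet 48 bytes at a time and counting the fetches whose start address is not a multiple of the IMSA. The sketch below assumes the packet is stored contiguously in host memory starting at address `base`; the function and parameter names are illustrative.

```python
CELL_PAYLOAD = 48  # bytes of payload per ATM cell


def misaligned_fraction(imsa, base=0, cells=1024, payload=CELL_PAYLOAD):
    """Fraction of per-cell fetches whose start address is not a multiple of imsa.

    Assumes a contiguous packet beginning at host address `base`.
    """
    starts = (base + k * payload for k in range(cells))
    return sum(1 for addr in starts if addr % imsa != 0) / cells


for imsa in (32, 64, 128):
    print(f"IMSA {imsa:3d}: {misaligned_fraction(imsa):.3f} of fetches misaligned")

# A packet that does not itself start on an IMSA boundary fares worse still.
print(misaligned_fraction(32, base=8))
```

With an aligned packet and an IMSA of 32 bytes, every second cell payload starts 16 bytes past a boundary; larger IMSA values or an unaligned packet start only increase the fraction.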
Previous implementations of ATM NICs have used one of two approaches. The first approach is to transfer a single cell payload at a time from host memory. This approach allows the ATM NIC to meet the traffic contract, but it results in lower host memory efficiency. A second approach is for the NIC to transfer entire data packets from host memory to a memory on the NIC and then transmit cells from that memory. While this approach results in good host memory bandwidth utilization, it also requires a very large memory on the ATM NIC and increases the latency of transmission.
The solution which the present invention proposes is to transfer data from host memory in units which are smaller than an entire data packet but greater than the OBS and IMSA for the host memory system. The unit of data transferred is referred to as a "Local Buffer". The NIC then transmits cells from these local buffers to the ATM network at a rate which does not violate the traffic contract. Compared to a solution which buffers entire data packets on the NIC before transmission, the invention requires less memory and results in lower latency to the ATM network.
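The approach described above can be sketched as two independent steps: planning host reads in local-buffer-sized units, and pacing cell transmission at the contract's minimum interval. This is a simplified illustration, not the card's actual logic. It assumes the packet begins on an IMSA boundary, ignores any adaptation-layer padding or trailer, and uses hypothetical names and values (`plan_fetches`, `schedule_cells`, a 256-byte local buffer, a 1500-byte packet).

```python
CELL_PAYLOAD = 48  # bytes of payload per ATM cell


def plan_fetches(packet_len, local_buffer_bytes, imsa):
    """Return (offset, length) host reads that cover the packet.

    Offsets are relative to the packet start, which is assumed to be
    IMSA-aligned; choosing a buffer size that is a multiple of IMSA keeps
    every read aligned as well.
    """
    if local_buffer_bytes % imsa != 0:
        raise ValueError("local buffer size must be a multiple of IMSA")
    return [
        (off, min(local_buffer_bytes, packet_len - off))
        for off in range(0, packet_len, local_buffer_bytes)
    ]


def schedule_cells(packet_len, min_cell_interval_us, start_us=0.0):
    """Return send times for the packet's cells, spaced at the minimum interval."""
    n_cells = -(-packet_len // CELL_PAYLOAD)  # ceiling division
    return [start_us + i * min_cell_interval_us for i in range(n_cells)]


# Hypothetical example values.
fetches = plan_fetches(packet_len=1500, local_buffer_bytes=256, imsa=32)
cells = schedule_cells(packet_len=1500, min_cell_interval_us=10.0)
print(len(fetches), "host reads feed", len(cells), "paced cells")
```

Decoupling the two steps is the point of the design: the host sees a small number of large, aligned reads, while the network sees cells spaced according to the traffic contract, and the card only needs to hold a local buffer's worth of data per connection rather than a whole packet.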
This is the first known architecture to define card segmentation buffers that are larger than an ATM cell. The architecture proved essential to the card's success, especially when interfacing with machines with large read latencies.