1. Field of the Invention
This invention relates to managing data transmission and to network equipment capable of managing data transmission.
2. Related Art
When data is to be transferred between two devices over a data channel, each of the devices must have a suitable network interface to allow it to communicate across the channel. The devices and their network interfaces use a protocol to form the data that is transmitted over the channel, so that it can be decoded at the receiver. The data channel may be considered to be or to form part of a network, and additional devices may be connected to the network.
The Ethernet system is used for many networking applications. Gigabit Ethernet is a high-speed version of the Ethernet protocol, which is especially suitable for links that require a large amount of bandwidth, such as links between servers or between data processors in the same or different enclosures. Devices that are to communicate over the Ethernet system are equipped with network interfaces that are capable of supporting the physical and logical requirements of the Ethernet system. The physical hardware components of network interfaces are referred to as network interface cards (NICs), although they need not be in the form of cards: for instance they could be in the form of integrated circuits (ICs) and connectors fitted directly on to a motherboard.
Ethernet and some other network protocols use an XON/XOFF system to manage flow control. When a network is congested and wishes to exert backpressure so as to prevent a data transmitter from transmitting, it transmits an XOFF message to the transmitter. When data transmission is to start again, it transmits an XON message to the transmitter. Other network protocols use mechanisms such as rate control or credit allocation to achieve a similar function.
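By way of illustration only, the XON/XOFF behaviour described above can be sketched as a transmitter that holds data back while it is in the XOFF state and releases it on XON. The class name Transmitter, its methods and the use of plain strings for the control messages are hypothetical and chosen purely for this sketch; they do not correspond to any particular protocol implementation.

```python
from collections import deque


class Transmitter:
    """Hypothetical transmitter that obeys XON/XOFF backpressure."""

    def __init__(self):
        self.paused = False   # True while the network has asserted XOFF
        self.queue = deque()  # data waiting to be transmitted
        self.sent = []        # data actually put on the channel

    def on_control(self, message):
        # XOFF stops transmission; XON allows it to start again.
        if message == "XOFF":
            self.paused = True
        elif message == "XON":
            self.paused = False
            self._flush()

    def send(self, data):
        # Data is always queued; it only leaves when not paused.
        self.queue.append(data)
        self._flush()

    def _flush(self):
        while self.queue and not self.paused:
            self.sent.append(self.queue.popleft())


tx = Transmitter()
tx.send("a")            # transmitted immediately
tx.on_control("XOFF")   # network is congested
tx.send("b")            # held back in the queue
tx.on_control("XON")    # transmission resumes, "b" is released
```

In this sketch the data held back during XOFF simply accumulates in a queue, which is acceptable only if the source of the data can itself be slowed or stopped.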
Data for transmission must be passed by a processor or other device to the NIC. Conventionally this is done over a bus using DMA (direct memory access) or load-store operations.
In general, it is highly desirable to use load-store operations to implement a user-accessible interface from a computer to a network, because of the low overhead and low latency with which an application is able to transfer data to the NIC using load-store. Even so, DMA is still used for large transfers because it allows the task of transferring the data to be offloaded from the main processor. When load-store access, also known as PIO (programmed input/output), is in use, it is imperative that as far as possible the NIC is always able to accept the PIO transfer; otherwise the cost of the feedback mechanism is likely to outweigh the benefits of the PIO access.
Generally data is transferred from the processor's cache, over the memory (front-side) bus, via an IO controller and the IO bus to the NIC. Typically the IO bus is a PCI (peripheral component interconnect) bus. The PCI bus protocol often requires that, once a target device has accepted a data transaction from a master, some data must always be able to pass through; otherwise the bus protocol is violated. This means that although the data rate across the bus may be slowed, it must not fall to zero over a certain time interval (e.g. 10 microseconds). Otherwise the bus may crash or, at the very least, system performance for other devices will become badly degraded. Similarly, a target device must respond to a new request within a certain time interval (e.g. 1 millisecond). As a result, if the network is in an XOFF state for a considerable amount of time, the NIC must stop the PIO stream of data from the processor by another means. That means is generally an interrupt. However, the use of interrupts raises problems. First, excessive use of interrupts would negate many of the benefits of the PIO protocol. Second, on a multi-processor machine it may take a considerable time for an interrupt to shut off the data stream from an arbitrary user-level application, because the application may be running on a different processor from the one that receives the interrupt. Third, since the bus protocol typically encourages bursty data, using interrupts to pass back flow control information can be an excessively harsh mechanism.
Note that DMA transfer does not suffer from these problems: if the network is congested, the NIC simply does not request more data.
Another problem arises due to the difference in data format between a typical IO bus and a typical network protocol. Data transmitted to a NIC over an IO bus tends to be bursty because load-store operations are generally performed at the granularity of the number of the processor's registers at once, and because the boundary of the IO bus tends to be at the write buffer of the processor. The bursts tend to be around four to 16 words long, depending on the processor. By contrast, data received by a NIC by DMA tends to be in much bigger bursts, for example of up to 256 words. When the data is received at the NIC, it is generally formed into network packets, for example Ethernet packets. Network packets generally have a maximum size and a relatively large minimum size, which are specified by the network protocol. For instance, Ethernet packets have a minimum size of 64 bytes and a maximum size of typically 1500 bytes, although some Ethernet networks (and some other networks) can be configured to permit larger packet sizes.
The specification for the Ethernet physical layer stipulates that once sending of a packet has begun, the sending of the packet must be completed. This is usually enforced by the MAC layer of the NIC. Therefore, when it is receiving data for transmission in Ethernet packets the NIC must make a decision on how many bursts it should wait to receive before forming the received data into a packet for transmission. Waiting for relatively many bursts to arrive before forming a packet makes for high average latency on the network link, since there can be a considerable delay before data received at the NIC is formed into a packet. On the other hand, forming Ethernet packets from relatively few bursts introduces bandwidth overheads.
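The bandwidth side of this trade-off can be illustrated with a rough calculation of how much of the link carries useful data when a packet is formed from a given number of bursts. This is a sketch under stated assumptions, not a description of any particular NIC: the word size of 4 bytes and the default burst length of 8 words are assumptions, the function name wire_efficiency is hypothetical, and the framing figures (14-byte header, 4-byte frame check sequence, 8-byte preamble and 12-byte inter-frame gap) are the usual Ethernet values rather than figures taken from the text above.

```python
WORD_BYTES = 4        # assumption: 32-bit words on the IO bus
HEADER_FCS = 18       # 14-byte Ethernet header + 4-byte frame check sequence
MIN_FRAME = 64        # minimum Ethernet frame size in bytes
MAX_PAYLOAD = 1500    # typical maximum payload in bytes
PREAMBLE_IFG = 20     # 8-byte preamble/SFD + 12-byte inter-frame gap


def wire_efficiency(bursts_per_packet, words_per_burst=8):
    """Fraction of link time spent carrying payload for one packet."""
    payload = bursts_per_packet * words_per_burst * WORD_BYTES
    if payload > MAX_PAYLOAD:
        raise ValueError("payload exceeds the typical maximum packet size")
    # Short frames are padded up to the minimum frame size.
    frame = max(payload + HEADER_FCS, MIN_FRAME)
    return payload / (frame + PREAMBLE_IFG)


for bursts in (1, 4, 16, 40):
    print(bursts, round(wire_efficiency(bursts), 3))
```

Under these assumptions a packet formed from a single 8-word burst uses well under half of the link time for payload, whereas a packet formed from 40 such bursts uses most of it, at the cost of the data in the first burst waiting for the remaining 39 to arrive.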
The NIC contains a packetisation engine, which forms packets from the data received for transmission. A number of strategies have previously been employed for determining how much data to wait to receive before forming a packet. Some systems (e.g. SCI) have employed heuristics, but even these can produce poor results in some situations. Also, using heuristics imposes a considerable processing load on the NIC.
There is therefore a need for an improved mechanism of managing data transmission.