Fibre Channel is a switched communications protocol that allows concurrent communication among servers, workstations, storage devices, peripherals, and other computing devices. Fibre Channel can be considered a channel-network hybrid, containing enough network features to provide the needed connectivity, distance and protocol multiplexing, and enough channel features to retain simplicity, repeatable performance and reliable delivery. Fibre Channel is capable of full-duplex transmission of frames at rates extending from 1 Gbps (gigabits per second) to 10 Gbps. It is also able to transport commands and data according to existing protocols such as Internet protocol (IP), Small Computer System Interface (SCSI), High Performance Parallel Interface (HIPPI) and Intelligent Peripheral Interface (IPI) over both optical fiber and copper cable.
In a typical usage, Fibre Channel is used to connect one or more computers or workstations together with one or more storage devices. In the language of Fibre Channel, each of these devices is considered a node. One node can be connected directly to another, or nodes can be interconnected, such as by means of a Fibre Channel fabric. The fabric can be a single Fibre Channel switch, or a group of switches acting together. Technically, the N_Ports (node ports) on each node are connected to F_Ports (fabric ports) on the switch. Multiple Fibre Channel switches can be combined into a single fabric. The switches connect to each other via E_Ports (expansion ports), forming an interswitch link, or ISL.
Fibre Channel data is formatted into variable length data frames. Each frame starts with a start-of-frame (SOF) indicator and ends with a cyclical redundancy check (CRC) code for error detection and an end-of-frame indicator. In between are a 24-byte header and a variable-length data payload field that can range from 0 to 2112 bytes.
The header includes a 24-bit source identifier (S_ID) that identifies the source for the frame, as well as a 24-bit destination identifier (D_ID) that identifies the desired destination for the frame. These port identifiers are uniquely assigned to every node in a Fibre Channel fabric. Under the standard Fibre Channel switch fabric addressing scheme, each port identifier is considered to contain three 8-bit words: a domain address or Domain_ID (bits 23-16 of the port ID), an area address or Area_ID (bits 15-8), and a port address or Port_ID (bits 7-0). Each switch in a Fibre Channel fabric is generally assigned a unique domain address. Groups of ports can be assigned to a single area within the switch. The addressing scheme allows 256 ports in each area, 256 areas within each switch, and 239 switches in a fabric (this is fewer than 256 switches because some switch addresses are reserved). The scheme allows certain routing decisions to be made by examining only a single 8-bit word. For example, a frame could be routed to the appropriate E_Port after examining only the domain address that identifies the switch on which the destination is located.
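As an illustration of this addressing scheme, the following minimal sketch splits a 24-bit port identifier into its three 8-bit words and joins them again. The names PortAddress, split_port_id and join_port_id are hypothetical and are not taken from any Fibre Channel standard or product.

```python
from typing import NamedTuple


class PortAddress(NamedTuple):
    """The three 8-bit words of a 24-bit port identifier (hypothetical type)."""
    domain: int  # bits 23-16: identifies the switch
    area: int    # bits 15-8: a group of ports within the switch
    port: int    # bits 7-0: a port within the area


def split_port_id(port_id: int) -> PortAddress:
    """Split a 24-bit S_ID or D_ID into Domain_ID, Area_ID and Port_ID."""
    if not 0 <= port_id <= 0xFFFFFF:
        raise ValueError("port identifier must fit in 24 bits")
    return PortAddress((port_id >> 16) & 0xFF, (port_id >> 8) & 0xFF, port_id & 0xFF)


def join_port_id(addr: PortAddress) -> int:
    """Rebuild the 24-bit identifier from its three 8-bit words."""
    return (addr.domain << 16) | (addr.area << 8) | addr.port
```

For example, split_port_id(0x0A1B2C) yields a domain of 0x0A, an area of 0x1B and a port of 0x2C, so a switch needs to inspect only the first word to decide whether the destination is local.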
Fibre Channel switches use the D_ID found in the header of a Fibre Channel frame to route the frame from a source port to a destination port. Typically, this is accomplished using a lookup table at each input port. The D_ID is used as an index to the table, and the table returns the appropriate output port in the switch. This output port will either be directly connected to the node identified by the D_ID, or to another switch along the path to the identified destination. Routing tables are shared between multiple switches in a fabric over an ISL so that the switches can learn about the nodes and switches that make up the fabric.
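The table lookup described above might be sketched as follows. This is only an illustration under stated assumptions: the two-level split into a domain table and a local table, the value of LOCAL_DOMAIN, the table contents and the port labels are all hypothetical, and real switches typically implement the lookup in hardware at each input port.

```python
LOCAL_DOMAIN = 0x01  # hypothetical Domain_ID assigned to this switch

# Hypothetical routing tables. Remote domains map to the E_Port leading toward
# that switch; local (Area_ID, Port_ID) pairs map to the directly attached port.
domain_table = {0x02: "E_Port-0", 0x03: "E_Port-1"}
local_table = {(0x00, 0x01): "F_Port-4", (0x00, 0x02): "F_Port-5"}


def route(d_id: int) -> str:
    """Return the output port for a frame, using its D_ID as the lookup key."""
    domain = (d_id >> 16) & 0xFF
    if domain != LOCAL_DOMAIN:
        # Destination is on another switch: only the domain word is examined.
        return domain_table[domain]
    # Destination is attached to this switch: use the area and port words.
    return local_table[((d_id >> 8) & 0xFF, d_id & 0xFF)]
```

With these assumed tables, route(0x020507) returns "E_Port-0" because only the domain word is consulted, while route(0x010002) returns "F_Port-5".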
Routing in modern Fibre Channel switches involves more issues than simply determining a destination port for each D_ID. This is because of the advent of virtual channels and ISL grouping. Virtual channels are used to divide up a single physical link between two ports into multiple logical or virtual channels. In most implementations, virtual channels are used to shape traffic across a port, or to provide more useful flow control across the port. ISL grouping is the ability to establish multiple ISL connections between the same two switches. Rather than treating each path as a separate ISL, ISL groups can be created that treat the separate physical paths as a single logical path. Although ISL groups simplify the administration of a fabric and allow a greater ability to load balance across multiple interswitch links, it is still necessary to provide a mechanism to select a particular ISL for each frame to be transmitted over the ISL group. The advent of virtual channels and ISL groups has made routing decisions in Fibre Channel switches more complicated. This complication means that traditional methods of routing frames have become too slow, and have become a source of undesired latency within a switch. What is needed is an improved technique for routing within a Fibre Channel switch that would avoid these problems.
When Fibre Channel frames are sent between ports, credit-based flow control is used to prevent the recipient port from being overwhelmed. Two types of credit-based flow control are supported in Fibre Channel, end-to-end (EE_Credit) and buffer-to-buffer (BB_Credit). In EE_Credit, flow is managed between two end nodes, and intervening switch nodes do not participate. In BB_Credit, flow control is maintained between each port. Before the sending port is allowed to send data to the receiving port, the receiving port must communicate to the sending port the size of its input buffer in frames. The sending port starts with this number of credits, and then decrements its credit count for each frame it transmits. Each time the receiving port has successfully removed a frame from its buffer, it sends a credit back to the sending port. This allows the sending port to increment its credit count. As long as the sending port stops sending data when its credit count hits zero, it will never overflow the buffer of the receiving port.
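The buffer-to-buffer credit exchange can be illustrated with a short simulation. This is a minimal sketch, assuming a single sender and receiver; the Receiver and Sender classes and their method names are hypothetical, and the credit return is modeled as a direct method call rather than as a primitive sent over the link.

```python
from collections import deque


class Receiver:
    """Receiving port with an input buffer sized in frames (hypothetical model)."""

    def __init__(self, buffer_frames: int):
        self.capacity = buffer_frames
        self.buffer = deque()

    def accept(self, frame) -> None:
        if len(self.buffer) >= self.capacity:
            raise OverflowError("input buffer overflow")
        self.buffer.append(frame)

    def drain_one(self) -> bool:
        """Remove one frame from the buffer; True means a credit can be returned."""
        if not self.buffer:
            return False
        self.buffer.popleft()
        return True


class Sender:
    """Sending port that starts with as many credits as the receiver advertised."""

    def __init__(self, receiver: Receiver):
        self.receiver = receiver
        self.credits = receiver.capacity

    def try_send(self, frame) -> bool:
        if self.credits == 0:
            return False  # out of credits: must wait for a credit to come back
        self.credits -= 1
        self.receiver.accept(frame)
        return True

    def credit_returned(self) -> None:
        self.credits += 1
```

Because the sender refuses to transmit at zero credits, accept never raises OverflowError in this model, which mirrors the guarantee described above.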
Although flow control should prevent the loss of Fibre Channel frames from buffer overflow, it does not prevent another condition known as blocking. Blocking occurs, in part, because Fibre Channel switches are required to deliver frames to any destination in the same order that they arrive from a source. One common approach to ensure in-order delivery in this context is to process frames in strict temporal order at the input or ingress side of a switch. This is accomplished by managing the switch's input buffer as a first in, first out (FIFO) buffer.
Sometimes, however, a switch encounters a frame that cannot be delivered due to congestion at the destination port. In such a switch, the frame at the top of the input FIFO buffer cannot be transmitted to its destination port because that port is congested and not accepting more traffic. Because the buffer is a first in, first out buffer, the top frame will remain at the top of the buffer until this port becomes uncongested. This is true even though the next frame in the FIFO is destined for a port that is not congested and could be transmitted immediately. This condition is referred to as head of line blocking.
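The blocking condition can be reproduced with a strict FIFO in a few lines. In this minimal sketch, frames are modeled as (name, destination) tuples, and the function name strict_fifo_step and the port labels are hypothetical.

```python
from collections import deque


def strict_fifo_step(fifo: deque, congested: set) -> list:
    """Transmit frames in strict arrival order, stopping at the first frame
    whose destination port is congested."""
    sent = []
    while fifo and fifo[0][1] not in congested:
        sent.append(fifo.popleft())
    return sent


# The head frame is destined for congested port "A"; the frames behind it are
# destined for idle port "B" but cannot be sent until the head frame leaves.
fifo = deque([("f1", "A"), ("f2", "B"), ("f3", "B")])
blocked = strict_fifo_step(fifo, congested={"A"})
```

Here blocked is an empty list and all three frames remain queued, even though port "B" could have accepted two of them immediately.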
Various techniques have been proposed to deal with the problem of head of line blocking. Scheduling algorithms, for instance, do not use true FIFOs. Rather, they search the input FIFO buffer looking for matches between waiting data and available output ports. If the top frame is destined for a busy port, the scheduling algorithm merely scans the FIFO buffer for the first frame that is destined for an available port. Such algorithms must take care to avoid sending Fibre Channel frames out of order. Another approach is to divide the input buffer into separate buffers for each possible destination. However, this requires large amounts of memory and a good deal of complexity in large switches having many possible destination ports.
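A scanning scheduler of the kind described above might look like the following minimal sketch. It assumes a single input buffer held as a Python list of (name, destination) tuples; the function name pick_next is hypothetical. Because the scan always selects the earliest waiting frame for a given destination, frames bound for the same destination still leave in arrival order.

```python
from typing import List, Optional, Set, Tuple

Frame = Tuple[str, str]  # (frame name, destination port), a hypothetical model


def pick_next(buffer: List[Frame], busy: Set[str]) -> Optional[Frame]:
    """Remove and return the first waiting frame whose destination is available.

    Frames bound for busy ports are skipped but keep their relative positions,
    so per-destination ordering is preserved.
    """
    for i, (_, dest) in enumerate(buffer):
        if dest not in busy:
            return buffer.pop(i)
    return None  # every waiting frame is destined for a busy port
```

The cost of this approach is visible in the loop: each scheduling decision may need to examine every buffered frame, which is part of why such schemes add complexity compared with a true FIFO.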
Congestion and blocking are especially troublesome when the destination port is an E_Port providing an interswitch link to another switch. One reason that the E_Port can become congested is that the input port on the second switch has filled up its input buffer. The flow control between the switches prevents the first switch from sending any more data to the second switch. Often the input buffer on the second switch becomes filled with frames that are all destined for a single congested port on that second switch. This filled buffer has congested the ISL, so that the first switch cannot send any data to the second switch, including data that is destined for an uncongested port on the second switch. Several manufacturers have proposed the use of virtual channels to prevent the situation where congestion on an interswitch link is caused by traffic to a single destination. In these proposals, traffic on the link is divided into several virtual channels, and no virtual channel is allowed to interfere with traffic on the other virtual channels. However, these techniques do not efficiently track the status of the virtual channels and communicate status changes between the switches.
Switch fabrics that support protocols such as Fibre Channel are generally frame-based and allow variable length frames to be switched from one port to another. However, there are also techniques that use fixed length cells to switch variable length frames, such as that described for example in U.S. Pat. No. 5,781,549. When using fixed length cells for data transmission, the cell size is kept relatively small. In the Ethernet switch described in the '549 patent, for example, variable length Ethernet frames are segmented into 60 bit cells for transmission through the switch. This segmentation is performed by a packet processing unit that is responsible for a group of eight Ethernet ports. Each cell contains a cell header, which contains a packet data byte count and a cell type. The packet data byte count indicates the number of valid data bytes found within the cell. The cell type indicates the type of data found within the cells. There are two cell types that indicate the cell contains actual Ethernet payload data. The first type indicates that the cell does not contain the end of the Ethernet frame. The second type indicates that the cell is the last cell in the Ethernet frame.
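The segmentation scheme can be sketched as follows. This is an illustration only and does not reproduce the cell layout of the '549 patent: the payload size CELL_PAYLOAD_BYTES, the numeric cell type codes, and the names Cell, segment and reassemble are all hypothetical.

```python
from dataclasses import dataclass
from typing import List

CELL_PAYLOAD_BYTES = 6  # hypothetical fixed payload size per cell

NOT_LAST = 0  # cell carries payload but not the end of the frame
LAST = 1      # cell carries the final bytes of the frame


@dataclass
class Cell:
    cell_type: int   # NOT_LAST or LAST
    byte_count: int  # number of valid data bytes in this cell
    data: bytes      # always CELL_PAYLOAD_BYTES long, zero-padded if needed


def segment(frame: bytes, size: int = CELL_PAYLOAD_BYTES) -> List[Cell]:
    """Split a variable-length frame into fixed-length cells."""
    cells = []
    for offset in range(0, len(frame), size):
        chunk = frame[offset:offset + size]
        is_last = offset + size >= len(frame)
        cells.append(Cell(LAST if is_last else NOT_LAST,
                          len(chunk), chunk.ljust(size, b"\x00")))
    return cells


def reassemble(cells: List[Cell]) -> bytes:
    """Rebuild the frame, keeping only the valid bytes of each cell."""
    return b"".join(cell.data[:cell.byte_count] for cell in cells)
```

The byte count lets the receiving side discard padding in the final cell, and the cell type tells it when the frame is complete without knowing the frame length in advance.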
The cells are transmitted to Ethernet ports managed by other packet processing units over a shared cell bus. A request to transmit a cell over the cell bus is made by the packet processing unit to a central routing controller. This controller arbitrates competing requests for the shared bus, and grants access to the bus through an acknowledgement signal sent to the selected packet processing unit. Once granted access to the bus, the packet processing unit transmits its data cells over the cell bus. Other packet processing units monitor traffic on the cell bus for cells destined for one of their ports. When cells are discovered, they are reassembled back into Ethernet packets and transmitted out the appropriate Ethernet port.
The Ethernet switch in the '549 patent did not describe the use of a true cell-based switch, since the shared bus configuration meant it was not possible to simultaneously route a plurality of cells between different pairs of source and destination ports. However, true cell-based switches, such as ATM switches, use crossbars that are well known in the prior art. These switches simultaneously route multiple cells through the switch between different pairs of source and destination ports.
Because of the efficiency of these cell-based switches, several vendors have proposed the use of cell-based switches to switch data packets or frames of variable lengths. Like the '549 patent, these proposals segment the frames into fixed-size cells and then transmit the cells through the cell-based switch. Such methods typically require that the number of cells in the packet be known before the packet is sent. That number is placed in the header of every cell in the packet. The cell-based switch uses this information to break the connection through the fabric once the packet transmission has been completed.
Some framing formats indicate the frame length in their header, as is the case with IEEE 802.3 frames. When the beginning of one of these frames enters the switch, the switch can read the header, find the length of the frame in bytes, and calculate the number of cells that will transport the frame. In this case, the process of segmenting the frame into cells can begin almost immediately, with the cell header containing the proper count of cells in the packet length field. This allows the frame to be transmitted through the cell-based switch with a minimum of latency.
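The cell count in this case is a ceiling division of the frame length by the cell payload size. A minimal sketch, in which the function name cells_needed and the 64-byte payload used in the example are assumptions for illustration:

```python
def cells_needed(frame_bytes: int, cell_payload_bytes: int) -> int:
    """Number of fixed-size cells required to carry a frame of known length."""
    if frame_bytes <= 0 or cell_payload_bytes <= 0:
        raise ValueError("lengths must be positive")
    # Ceiling division using integers only.
    return -(-frame_bytes // cell_payload_bytes)


# A 1518-byte frame carried in hypothetical 64-byte cell payloads.
count = cells_needed(1518, 64)
```

Since 1518 / 64 is about 23.7, count is 24, and that value can be written into the header of the first cell as soon as the frame header has been read.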
The use of cell-based switches to switch Fibre Channel frames is more difficult, since Fibre Channel headers do not contain any information identifying the length of the frame. This means that the length of a Fibre Channel frame is not known until the EOF marker is received. It is possible to buffer an entire Fibre Channel frame and count the total number of bytes in the frame. It would then be a simple matter to calculate how many cells will be necessary to accommodate all of the information in the Fibre Channel frame, and then place this value in the cell headers. However, waiting for the entire frame to be buffered before sending the beginning of the frame over the cell-based switch fabric introduces unacceptable latency into the transmission time of the frame (about 20 microseconds at 1 Gbps data rate versus a preferred maximum latency of two microseconds). What is needed is a method to transmit variable length frames that do not contain length information in their frame header over a cell-based switch fabric without introducing an unacceptable level of latency.
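The figure of about 20 microseconds can be checked with a rough calculation. The sketch below makes several assumptions that are not stated in the text above: that the SOF, CRC and EOF fields each occupy 4 bytes, that each byte is carried as 10 bits on the wire (8b/10b encoding), and that a 1 Gbps link signals at 1.0625 Gbaud. The function name buffering_delay_us is hypothetical.

```python
# Assumed field sizes in bytes; the 24-byte header and 2112-byte maximum
# payload come from the text, the 4-byte SOF, CRC and EOF are assumptions.
SOF, HEADER, MAX_PAYLOAD, CRC, EOF = 4, 24, 2112, 4, 4
MAX_FRAME_BYTES = SOF + HEADER + MAX_PAYLOAD + CRC + EOF  # 2148 bytes


def buffering_delay_us(frame_bytes: int,
                       line_rate_baud: float = 1.0625e9,
                       wire_bits_per_byte: int = 10) -> float:
    """Time in microseconds to receive a whole frame before forwarding it."""
    return frame_bytes * wire_bits_per_byte / line_rate_baud * 1e6


delay = buffering_delay_us(MAX_FRAME_BYTES)
```

Under these assumptions a maximum-size frame takes a little over 20 microseconds to arrive in full, roughly ten times the preferred two-microsecond latency budget.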
In most cases, a Fibre Channel switch having more than a few ports utilizes a plurality of microprocessors to control the various elements of the switch. These microprocessors ensure that all of the components of the switch function appropriately. To operate cooperatively, it is necessary for the microprocessors to communicate with each other. It is also often necessary to communicate with the microprocessors from outside the switch.
In prior art switches, microprocessor messages are kept separate from the data traffic. This is because it is usually necessary to ensure that urgent internal messages are not delayed by data traffic congestion, and also to ensure that routine status messages do not unduly slow data traffic. Unfortunately, creating separate data and message paths within a large Fibre Channel switch can add a great deal of complexity and cost to the switch. What is needed is a technique that allows internal messages and real data to share the same data pathways within a switch without either type of communication unduly interfering with the other.