Since traffic on a physical loop flows through many ports on the loop, including the connecting fiber or copper links, the two primary sources of latency on an arbitrated loop are:
a) link propagation latency due to link length; and
b) node latency (a maximum of 6 Fibre Channel (hereafter FC) words, or 240 bits, per node).
1) a uniform distribution of FC ports connected to Accelerator Hub ports. For example, for 80 ports on the X-axis, a 4-port Accelerator would have 20 FC ports per hub port, and an 8-port Accelerator would have 10 FC ports per hub port.
2) 6 word delays per FC port, for a total of 240 ns per port (40 bits per word at 1 Gb/s).
3) a 70%/30% mix of write operations to read operations.
4) 3 FCP round trip delays per read and 4 FCP round trip delays per write (Command/Data/Status or Command/Transfer Ready/Data/Status).
5) 3 round trip delays per FCP Sequence (ARB=1, OPN-RRDY=1, Data/CLS=1).
1) Nonzero BB_Credit
2) TRANSFER mode
3) Dual loops
4) FL ports connected to Fibre Channel switches
These latencies have been demonstrated to be the primary cause of degraded performance for applications such as SCSI over Fibre Channel.
FIG. 1 shows an FC-AL node with a 6-word FIFO. It has a transmit port 1 and a receive port 2. Each FC unencoded 8-bit character is translated to 10 bits when encoded on the serial link via the 8-bit to 10-bit encoding as defined by the Fibre Channel Physical and Signalling Interface (FC-PH) standard, ANSI X3.230-199x. There are four characters per FC word. The 6 FC word FIFO is the cause of the 6 word delay between the time an FC word arrives on receive port 2 and is retransmitted on transmit port 1.
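The per-node delay implied by the 6-word FIFO can be worked out directly from the encoding figures above. The following is a minimal sketch, not part of the disclosed apparatus; the function names are illustrative, and the default line rate of 1.062 Gbit/s is taken from the figure used later in this description.

```python
# Sketch of the per-node delay arithmetic for an FC-AL node with a 6-word FIFO.
# All names here are illustrative assumptions, not taken from any FC standard.

BITS_PER_CHARACTER = 10   # each 8-bit character becomes 10 bits after 8b/10b encoding
CHARACTERS_PER_WORD = 4   # four characters per FC word
FIFO_DEPTH_WORDS = 6      # depth of the node FIFO shown in FIG. 1


def word_bits() -> int:
    """Number of encoded bits in one FC word on the serial link."""
    return BITS_PER_CHARACTER * CHARACTERS_PER_WORD


def node_delay_bits(fifo_words: int = FIFO_DEPTH_WORDS) -> int:
    """Worst-case node delay expressed in bit times."""
    return fifo_words * word_bits()


def node_delay_ns(line_rate_gbps: float = 1.062,
                  fifo_words: int = FIFO_DEPTH_WORDS) -> float:
    """Worst-case node delay in nanoseconds (bits divided by Gbit/s gives ns)."""
    return node_delay_bits(fifo_words) / line_rate_gbps


if __name__ == "__main__":
    print(word_bits(), "bits per word")
    print(node_delay_bits(), "bit times per node")
    print(round(node_delay_ns(), 1), "ns per node at 1.062 Gbit/s")
```

At a nominal 1 Gbit/s the same calculation gives 240 ns per node, which is the round figure used elsewhere in this description.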
FIG. 2 shows the relationship between data frames and primitives. A primitive is an FC word which occupies the inter-frame spaces and has special meaning for flow control and loop management.
The primitives relevant to the invention are shown in FIG. 3. The ARB and OPN primitives contain addressing information. The ARB contains the address of the arbitrating node designated in FIG. 4 as AL_PA. The AL_PA is duplicated in the fourth character. The OPN primitive contains the destination node address (AL_PD) in the third character and the source node address (AL_PS) in the fourth character. Fill words are ARB primitives or IDLE primitives which are used by FC-AL nodes to perform loop signalling. Therefore, all fill words are primitives, but not all primitives are fill words.
FIG. 4 shows a four-node arbitrated loop having the prior art architecture. The output of one node is connected to the input of the subsequent node, and so on. The sum of the latencies of the nodes (each node's 6 word delay) plus the inter-node link propagation delays is referred to as the "system latency."
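The definition of system latency given above can be sketched as a simple sum. This is an illustrative calculation only: the function name, the assumption of one inter-node link per node, and the link delay values in the usage example are hypothetical and are not taken from the figures.

```python
# Sketch: system latency = sum of node delays + sum of inter-node link delays.
# The link delay values used below are placeholders chosen for illustration.

from typing import Sequence


def system_latency_ns(link_delays_ns: Sequence[float],
                      node_delay_words: int = 6,
                      bits_per_word: int = 40,
                      line_rate_gbps: float = 1.062) -> float:
    """Total loop latency in ns, assuming one link per node (a closed loop)."""
    per_node_ns = node_delay_words * bits_per_word / line_rate_gbps
    node_total = len(link_delays_ns) * per_node_ns
    link_total = sum(link_delays_ns)
    return node_total + link_total


if __name__ == "__main__":
    # Four-node loop as in FIG. 4, with a hypothetical 50 ns per link.
    print(round(system_latency_ns([50.0, 50.0, 50.0, 50.0]), 1), "ns")
```

With short links the node term dominates, which is why the 6-word delay per node is treated as a primary source of latency.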
FIG. 5 shows a "loop tenancy," the handshaking protocol traffic which occurs between nodes before the loop is relinquished and other nodes are allowed to communicate. A loop tenancy protocol is carried out so that a source node and a destination node can acquire the loop for their exclusive use in a data transfer operation. Each node has a priority ranking which is used during a process called arbitration. Arbitration is the process that decides which of two or more nodes simultaneously requesting control of the loop will get control of it. In the loop tenancy protocol shown in FIG. 5, an arbitration occurs, followed by an open (transmitted by the winning node), followed by transmission of one or more data frames, followed by a close (which can be transmitted by either node).
There follows a more detailed discussion of each phase of the loop tenancy protocol.
Arbitration:
A node knows it has won arbitration when it sees an inbound ARB primitive containing its own AL_PA priority ranking. Algebraically small AL_PAs are higher priority than algebraically large AL_PAs. If a port wishes to arbitrate and it receives a lower priority ARB, it substitutes its own ARB, i.e., it transmits an ARB with its own AL_PA. If it receives a higher priority ARB, it passes that higher priority ARB.
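The per-port arbitration rule described above can be sketched as a small decision function. This is a simplified illustration under stated assumptions: the function name, the returned action strings, and the AL_PA values in the usage example are hypothetical, and fairness rules and other details of the FC-AL standard are omitted.

```python
# Sketch of how a port treats an inbound ARB primitive.
# Smaller AL_PA values are higher priority, as stated in the text.

from typing import Optional, Tuple


def handle_inbound_arb(own_al_pa: int,
                       inbound_al_pa: int,
                       arbitrating: bool) -> Tuple[str, Optional[int]]:
    """Return (action, AL_PA to transmit in the outgoing ARB, if any)."""
    if not arbitrating:
        # A port that is not arbitrating simply forwards what it receives.
        return ("pass", inbound_al_pa)
    if inbound_al_pa == own_al_pa:
        # The port's own ARB has travelled all the way around the loop.
        return ("won", None)
    if inbound_al_pa > own_al_pa:
        # Inbound ARB is lower priority: substitute this port's own ARB.
        return ("substitute", own_al_pa)
    # Inbound ARB is higher priority: pass it along unchanged.
    return ("pass", inbound_al_pa)


if __name__ == "__main__":
    # Illustrative AL_PA values only.
    print(handle_inbound_arb(0x01, 0xE8, arbitrating=True))
    print(handle_inbound_arb(0xE8, 0x01, arbitrating=True))
    print(handle_inbound_arb(0x01, 0x01, arbitrating=True))
```

Because every ARB must survive this comparison at each port, the winning ARB necessarily traverses the full loop before its originator sees it again.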
Open:
OPNs are passed by a receiving port if the destination address does not match the AL_PA of the receiving port.
Permission to Send:
RRDYs (permission to send) are returned by the OPN recipient, i.e., the node having the AL_PA which matches the AL_PD in an OPN primitive. Each RRDY received by the OPN initiator gives the OPN initiator permission to transmit one data frame. In the zero BB_Credit model supported by the second embodiment of the invention, the OPN initiator may not transmit data until one or more RRDYs are received.
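The zero BB_Credit behavior described above amounts to a credit counter that starts at zero and is incremented by each RRDY. The following is a minimal sketch; the class and method names are hypothetical and do not correspond to any particular implementation.

```python
# Sketch of RRDY-based flow control at the OPN initiator (zero BB_Credit model).


class OpnInitiator:
    """Tracks how many data frames the OPN initiator may still transmit."""

    def __init__(self) -> None:
        # Zero BB_Credit: no frame may be sent before the first RRDY arrives.
        self.credits = 0

    def on_rrdy(self) -> None:
        """Each received RRDY grants permission to send one data frame."""
        self.credits += 1

    def try_send_frame(self) -> bool:
        """Consume one credit and return True, or return False if none remain."""
        if self.credits == 0:
            return False
        self.credits -= 1
        return True


if __name__ == "__main__":
    initiator = OpnInitiator()
    print(initiator.try_send_frame())  # no RRDY received yet
    initiator.on_rrdy()
    print(initiator.try_send_frame())  # one credit available
    print(initiator.try_send_frame())  # credit already consumed
```

The initial wait for an RRDY is one of the round trips that contributes to the per-tenancy latency discussed in this description.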
Close:
A CLS primitive may be initiated by either node engaged in a data transfer operation. If a node receives a CLS and did not originate a CLS, it must forward it. If both nodes originate a CLS simultaneously, since there is no addressing in a CLS, both nodes will believe the incoming CLS from the other node was their CLS, and both will close the loop concurrently.
Performance:
FIG. 6 shows a series of loop tenancies which make up a SCSI Write operation. There are four loop tenancies in a write operation: one for sending the write command, one for acknowledging receipt of the write command by a disk drive, one for sending the actual write data, and one for acknowledging receipt of the write data by the disk drive. Each ARB and CLS must be passed through every node, while each Data frame and OPN primitive must be passed through, on the average, half the ports on the loop (since Data and OPN are not propagated by the destination node).
With 4 loop tenancies and an average of 3 round trips per tenancy, there are 12 round trips per SCSI write command.
Multiply this by the number of nodes on the loop and again by 6 words per node to get 72 × N word delays per SCSI write command.
For a fully configured loop (126 nodes), this is 72 × 126 = 9072 word delays, or 9072 × 40 = 363 kbit delays. At 1.062 Gbits/sec, this equals 341 microseconds of link overhead per SCSI write command.
For disk drives which have around 500 microseconds of controller software and hardware overhead, having a fixed 341 microsecond delay due just to FC-AL overhead significantly reduces the number of SCSI operations per second per arbitrated loop.
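The overhead arithmetic in the preceding paragraphs can be reproduced with a short calculation. This is a sketch for checking the figures only; the function and parameter names are illustrative, and link propagation delay is ignored, as it is in the text.

```python
# Sketch: FC-AL node-delay overhead per SCSI write command.

from typing import Tuple


def write_overhead(nodes: int = 126,
                   tenancies: int = 4,
                   round_trips_per_tenancy: int = 3,
                   words_per_node: int = 6,
                   bits_per_word: int = 40,
                   line_rate_gbps: float = 1.062) -> Tuple[int, int, float]:
    """Return (word delays, bit delays, overhead in microseconds)."""
    round_trips = tenancies * round_trips_per_tenancy        # 12 per write
    word_delays = round_trips * words_per_node * nodes       # 72 x N
    bit_delays = word_delays * bits_per_word
    # Bits divided by Gbit/s gives ns; divide by 1000 for microseconds.
    overhead_us = bit_delays / (line_rate_gbps * 1000.0)
    return word_delays, bit_delays, overhead_us


if __name__ == "__main__":
    words, bits, microseconds = write_overhead()
    print(words, "word delays")
    print(bits, "bit delays")
    print(int(microseconds), "microseconds (truncated)")
```

Varying the `nodes` argument shows that the overhead scales linearly with the number of nodes each primitive must traverse, which is the quantity a hub that shortens the path would reduce.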
FIG. 7 demonstrates the expected decrease in system latency when the Accelerator Hub is used. This figure assumes: