A SAS expander can generally be described as a switch that allows initiators and targets to communicate with each other in a network, and allows additional initiators and targets to be added to the network. The SAS-2 protocol supports networks including cascades and trees (as well as trees of cascades) of SAS expanders. The SAS-2 specification is currently available in draft form (Revision 12, Sep. 28, 2007) at www.t10.org and is identified as T10/1760-D or Reference Number ISO/IEC 14776-152:200x, the contents of which are incorporated by reference herein. Typically, a cascade utilizes a single wide-port (containing multiple Phys) for connecting the expanders together with multiple physical connections.
As speeds for SAS increase, the size of SAS networks also increases. However, this increase in size can result in new problems. For cascades of SAS expanders, cascade depths must be kept to a minimum due to the increased congestion in the cascade links, in particular for devices creating connections up and down the cascade.
Another problem with cascades is the possibility of single point failure. For example, if the connection between two cascaded SAS expanders fails, then all connectivity is lost for initiators and targets on opposite sides of the failure point. The single point failure arises because there is no fail-over path for such connections.
Setting aside SAS-2 compatibility for the moment, one way to increase connectivity and provide a fail-over path is to provide a secondary connection. This secondary connection, which essentially forms a loop in the SAS network, can provide a number of advantages. First, the loop can provide increased availability in the face of expander failures or expander enclosure hot plugs. In a failure scenario, the SAS network can fail-over to the loop, bypassing the failure point, to restore connectivity between devices.
FIG. 1 illustrates an exemplary SAS expander network 100 that is conventional except for the addition of a loop connection 102. FIG. 1 shows an initiator (I0) connected via multiple lines to a cascade of SAS expanders, E0, E1 and E2, through wide ports 104 on each expander. Attached to each SAS expander may be one or more SAS or SATA drives D0, D1 and D2. In FIG. 1, loop connection 102 is formed by a connection between port 106 on E0 and port 108 on E2. By introducing loops into the network topology, SAS looks more like FC arbitrated loop, but without the loss of I/O during loop initialization primitive (LIP) periods.
In the example of FIG. 1, if the loop connection 102 were not present and the connection between E0 and E1 were to fail, the failure would represent a single point failure, and any devices connected to E0 (e.g. initiator I0 and drive D0) would not be able to communicate with any devices connected to E1 or E2 (e.g. drives D1 and D2). However, with loop connection 102 present, the loop connection provides a secondary or fail-over path to enable I0 or D0 to bypass the failed connection and resume communication with D1 and D2.
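The fail-over behavior described for FIG. 1 can be illustrated with a minimal sketch that treats the expanders as nodes of an undirected graph. The function name `reachable` and the link lists are illustrative assumptions, not part of SAS-2; the labels E0, E1 and E2 are taken from FIG. 1.

```python
from collections import deque

def reachable(links, start):
    """Return the set of expanders reachable from `start` over undirected links."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in adj.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

# Topology of FIG. 1 (hypothetical encoding): a three-expander cascade
# plus loop connection 102 between E0 and E2.
cascade = [("E0", "E1"), ("E1", "E2")]
loop = ("E0", "E2")
failed = ("E0", "E1")  # the single point of failure discussed above

without_loop = [link for link in cascade if link != failed]
with_loop = [link for link in cascade + [loop] if link != failed]

print(sorted(reachable(without_loop, "E0")))  # E0 is isolated
print(sorted(reachable(with_loop, "E0")))     # E1 and E2 are reached via the loop
```

In this model, removing the E0-E1 link isolates E0 when no loop is present, while the loop connection keeps all three expanders mutually reachable.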
In addition to providing a fail-over path, the increased opportunities for connectivity realized by loops provide a number of other advantages to SAS networks, including increased performance due to multi-path access to devices (reduced hop-count). For example, in FIG. 1, if initiator I0 desired to communicate with device D2 without the loop connection, it would have to traverse all three expanders E0, E1 and E2 to get to the device. This introduces latency into the communication and locks down pathway resources in the SAS network. With a loop connection 102, the initiator I0 or expander E0 could decide that to communicate with device D2, a better path is directly from E0 to E2 through the loop connection. This eliminates an expander hop (in the present example of a three expander network), reduces congestion in the core, and reduces latency of the overall connection (which at 6 G can be significant). Note that with deeper cascades, for example of 16 expanders, the savings become more apparent.
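The hop-count reduction can be made concrete with a short sketch that counts the expanders on the shortest path between two expanders, with and without a loop connection. The helper names `expanders_traversed` and `cascade` are hypothetical. The model assumes the loop joins the first and last expander of the cascade, as in FIG. 1.

```python
from collections import deque

def expanders_traversed(links, src, dst):
    """Count expanders on the shortest path from src to dst (inclusive)."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    dist = {src: 1}  # the source expander itself counts as one
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return dist[node]
        for nxt in adj.get(node, ()):
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return None  # dst is not reachable

def cascade(n):
    """Links of a linear cascade E0 .. E(n-1)."""
    return [(f"E{i}", f"E{i + 1}") for i in range(n - 1)]

# Three-expander example of FIG. 1: I0 sits on E0, D2 sits on E2.
print(expanders_traversed(cascade(3), "E0", "E2"))                    # 3
print(expanders_traversed(cascade(3) + [("E0", "E2")], "E0", "E2"))   # 2

# A 16-expander cascade with a loop between its two ends.
print(expanders_traversed(cascade(16), "E0", "E15"))                  # 16
print(expanders_traversed(cascade(16) + [("E0", "E15")], "E0", "E15"))  # 2
```

Under these assumptions the three-expander case saves one expander, while the 16-expander case drops from 16 expanders traversed to 2 for traffic between the two ends.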
Yet another advantage of loops is increased fairness as multiple paths can exist to devices (maximum (N/2)+1).
Notwithstanding these advantages, loops are currently illegal in SAS-2, and create several problems with regard to (1) broadcast change notification (BCN) management (broadcast primitives), (2) loop configuration and multi-pathing, and (3) initiator awareness of loops.
The first problem relates to BCN management. BCNs are used to notify expanders and other devices such as initiators when there are changes to the network (e.g. the presence of a new device, a removed device, or a malfunctioning device), so that they can re-discover the network. When a change in the network, such as the insertion of a disk, SAS expander or other attached device, is detected by a port in a SAS expander, a BCN (a SAS primitive containing no address information) is generated and sent to all ports in that SAS expander except for the port that originated the BCN. In a traditional SAS network, the BCN is propagated out through the ports, eventually reaching one or more expanders where it can propagate no further (a leaf or edge expander, for example). However, if a loop is present, a BCN can theoretically propagate forever as it cycles endlessly through the loop.
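The BCN behavior can be sketched with a simplified model in which each expander forwards a received BCN to every neighboring expander except the one it arrived from. The function `propagate_bcn` and the `max_forwards` cap are assumptions made for illustration. The model ignores end devices and any vendor-specific handling of broadcasts.

```python
from collections import deque

def propagate_bcn(links, origin, max_forwards=1000):
    """Simulate naive BCN forwarding between expanders.

    Returns (forwards, terminated): the number of expander-to-expander
    forwards performed, and whether propagation stopped on its own
    before reaching the max_forwards cap.
    """
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    # The origin detects the change locally and sends a BCN out of every
    # expander-facing port.
    queue = deque((origin, peer) for peer in sorted(adj.get(origin, ())))
    forwards = 0
    while queue:
        if forwards >= max_forwards:
            return forwards, False  # still circulating: give up
        sender, receiver = queue.popleft()
        forwards += 1
        # The receiver forwards to all ports except the one that delivered it.
        for nxt in sorted(adj[receiver]):
            if nxt != sender:
                queue.append((receiver, nxt))
    return forwards, True

cascade = [("E0", "E1"), ("E1", "E2")]
looped = cascade + [("E0", "E2")]

print(propagate_bcn(cascade, "E0"))  # stops at the edge expander E2
print(propagate_bcn(looped, "E0"))   # only stops because of the cap
```

In the cascade the BCN dies out at E2 after two forwards. With the loop connection, every expander always has one other port to forward to, so the simulation ends only when the artificial cap is reached.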
The second problem relates to loop configuration and multi-pathing. In SAS-2, expanders are self-configuring. Self-configuring expanders take care of their own configuration—route tables, programming, and the like. A self-configuring expander discovers each attached expander and attached devices to program a Phy-based routing table for each port in that expander. In other words, self-configuring expanders enumerate the entire domain by locating every device in the network and creating a routing table for each port, the routing table containing addresses to those devices accessible through that port.
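As a rough illustration of per-port routing tables, the following sketch builds, for each expander in a loop-free cascade, the list of addresses reachable through each expander-facing port. The function `build_route_tables`, the dictionary layout, and the use of labels such as "D1" in place of real SAS addresses are all assumptions made for this example. The traversal only excludes the expander it came from, so it is valid for tree topologies and would not terminate if a loop were present.

```python
def build_route_tables(expander_links, attached):
    """Map each expander to {peer_expander: sorted addresses behind that port}.

    expander_links: list of (expander, expander) pairs forming a tree.
    attached: dict mapping an expander to its directly attached end devices.
    """
    adj = {}
    for a, b in expander_links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    def behind(node, parent):
        # Everything reachable by entering `node` from `parent`.
        found = [node] + list(attached.get(node, ()))
        for nxt in adj.get(node, ()):
            if nxt != parent:
                found.extend(behind(nxt, node))
        return found

    return {
        exp: {peer: sorted(behind(peer, exp)) for peer in sorted(peers)}
        for exp, peers in adj.items()
    }

# Hypothetical encoding of FIG. 1 without loop connection 102.
links = [("E0", "E1"), ("E1", "E2")]
attached = {"E0": ["I0", "D0"], "E1": ["D1"], "E2": ["D2"]}

tables = build_route_tables(links, attached)
for expander, ports in sorted(tables.items()):
    print(expander, ports)
```

Each table entry lists the addresses on the far side of one port, which is the information a self-configuring expander needs in order to route a connection request out of the correct port.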
However, if an expander is not aware of a loop connection, it will try to locate devices found through the loop connection and create a route table for the port connected to the loop connection. If a loop is present, the expander can theoretically perform discovery forever as it cycles endlessly through the loop, continuously locating devices and storing addresses of those devices in the routing table. Because a loop exists, each expander can discover the entire SAS network from two directions, which means that each device will be viewed at least twice. In practice, however, the domain may be declared invalid and a vendor-specific action can be taken to prevent discovery from being continuously performed. In any case, both approaches (discovery forever and declaring a bad domain) indicate the problem.
The third problem relates to initiator awareness of loops. Initiators attempt to enumerate the entire domain to which they are attached. For example, in FIG. 1, the initiator will attempt to discover all information about devices attached to E0, then E1, then E2. However, because of the loop on E2, the initiator will then attempt to discover all information about devices attached to E0 again, and this enumeration or discovery process can go on forever if the initiator is not designed to detect the problem. Initiators capable of detecting the problem can declare a bad domain and report it to the host to prevent discovery from going on forever.
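One way an initiator could detect the problem is to remember which expanders it has already enumerated and to flag the domain when an expander is reached a second time over a different path. The sketch below illustrates that idea only. The function `enumerate_domain` and its return convention are hypothetical and do not describe any particular initiator implementation.

```python
def enumerate_domain(links, start):
    """Enumerate expanders depth-first from `start`.

    Returns (order, loop_detected), where `order` is the sequence in which
    expanders were first discovered and `loop_detected` is True if an
    already-discovered expander was reached again over another link.
    """
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    order = []
    seen = set()
    loop_detected = False
    stack = [(start, None)]  # (expander to visit, expander we came from)
    while stack:
        node, parent = stack.pop()
        if node in seen:
            # Reached a known expander by a second path: the domain has a loop.
            loop_detected = True
            continue
        seen.add(node)
        order.append(node)
        for nxt in sorted(adj.get(node, ()), reverse=True):
            if nxt != parent:
                stack.append((nxt, node))
    return order, loop_detected

cascade = [("E0", "E1"), ("E1", "E2")]
looped = cascade + [("E0", "E2")]

print(enumerate_domain(cascade, "E0"))  # (['E0', 'E1', 'E2'], False)
print(enumerate_domain(looped, "E0"))   # (['E0', 'E1', 'E2'], True)
```

An initiator following this approach would stop enumerating when the flag is set and could then report a bad domain to the host, rather than rediscovering E0 indefinitely.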
These problems, if left uncorrected, can result in a non-operating network. Therefore, there is a need to be able to utilize loop configurations in SAS networks without creating problems related to BCN management, loop configuration and multi-pathing, and initiator awareness of loops.