1. Field of the Invention
The present invention relates, generally, to the initialization of devices in an arbitrated loop, and in particular embodiments, to the monitoring, detection, removal and recovery of malfunctioning devices from an arbitrated loop during initialization.
2. Description of Related Art
Non-blocking crossbar switches for storage systems (storage switches) are generally implemented in several configurations. In one configuration, the storage switch is connected within an enclosure to an arbitrated loop of drives referred to as Just a Bunch Of Disks (JBODs), and the enclosure is referred to as a Switched Bunch Of Disks (SBOD). In another configuration, the storage switch is contained in an enclosure referred to as a root switch, which connects externally to a number of JBODs. In yet another configuration, a root switch is connected to SBODs in a fully switched architecture.
In any of these configurations, when a device is first connected to a port on the storage switch, all of the devices previously connected to that storage switch must be initialized. Note that a device, as referred to herein, includes, but is not limited to, disk drives, host bus adapters (HBAs) and other Fibre Channel (FC) devices. For example, FIG. 1a illustrates an exemplary storage switch 100 including ports 118, 112, 114 and 116, and a processor 138. It should be understood that a four-port storage switch 100 is illustrated herein for purposes of explanation only, and that other commercially available storage switches may have a different number of ports. In the example of FIG. 1a, the storage switch 100 is initially connected to devices 102, 104 and 106 via the ports. When a new device 108 is first connected to the storage switch 100 at port 118, its operation may be verified before it is inserted into the loop using a well-known procedure referred to as Port Test Before Insert (PTBI). If PTBI is enabled in the firmware being executed by the processor 138, the processor 138 performs a PTBI event upon the connection of the new device 108 to the storage switch 100.
The PTBI event first instructs the storage switch 100 to configure port 118 in a loop back mode, so that new device 108 is essentially configured in an individual loop, isolated from all other devices connected to the storage switch 100. Processor 138 then sends a number of Loop Initialization Primitive (LIP) ordered sets 122 (a four-byte sequence) to new device 108 to start an individual LIP cycle. This individual LIP cycle also results in the starting of a device monitor timer in processor 138. After the device monitor timer times out, processor 138 evaluates new device 108 to determine if it is behaving properly or malfunctioning. At the physical layer, new device 108 can be identified as malfunctioning if cyclic redundancy check (CRC) errors were generated, if there were ordered set (OS) errors, or if there were bad transmission words. At the FC protocol level, new device 108 can be identified as malfunctioning if the new device 108 does not return a start of frame (SOF), IDLE, arbitrate (ARB) or end of frame (EOF) ordered set, or if the new device 108 does not return a close (CLS) ordered set 126 (which signifies the end of the individual LIP cycle). As long as new device 108 is found to be malfunctioning, PTBI events are repeatedly performed.
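By way of illustration only, the evaluation performed when the device monitor timer times out may be sketched as follows. The record `PtbiObservation`, its field names, and the function `device_is_malfunctioning` are hypothetical and do not correspond to any actual switch firmware interface. The sketch further assumes that the protocol-level check fails when none of the SOF, IDLE, ARB or EOF ordered sets was returned.

```python
from dataclasses import dataclass


@dataclass
class PtbiObservation:
    """Hypothetical summary of what was seen on the isolated port
    while the device monitor timer was running."""
    crc_errors: int = 0
    ordered_set_errors: int = 0
    bad_transmission_words: int = 0
    # True if at least one SOF, IDLE, ARB or EOF ordered set came back (assumed reading).
    returned_expected_ordered_set: bool = False
    # True if the CLS ordered set ending the individual LIP cycle came back.
    returned_cls: bool = False


def device_is_malfunctioning(obs: PtbiObservation) -> bool:
    """Apply the physical-layer and FC-protocol-level checks described above."""
    physical_fault = (
        obs.crc_errors > 0
        or obs.ordered_set_errors > 0
        or obs.bad_transmission_words > 0
    )
    protocol_fault = not obs.returned_expected_ordered_set or not obs.returned_cls
    return physical_fault or protocol_fault
```

In such a sketch, the PTBI event would be repeated for as long as `device_is_malfunctioning` returns `True` for the most recent observation.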
As illustrated in FIG. 1b, if the PTBI event indicates that new device 108 is operating properly, processor 138 initiates another LIP cycle by sending out additional LIP ordered sets 110 to new device 108. The purpose of this LIP cycle is to initialize both ends of the communication link by identifying each device connected to the storage switch 100 with a unique address, and to establish a common format for communications.
When the LIP cycle is started, the storage switch 100 sends a LOOP_DOWN event associated with port 118 to the processor 138, and configures all ports in the storage switch 100 connected to a device into a loop configuration 132. Note that because all of the devices are connected in a loop 132, and all ports are involved in the initialization, the transmission of data through the ports must stop during the LIP cycle.
After at least three LIP ordered sets 110 are received by the new device 108, the LIP ordered sets 110 are propagated to device 102, and the storage switch 100 sends a LOOP_DOWN event associated with port 112 to the processor 138. The LIP ordered sets 110 are then propagated to the next device 104, and the process continues. In general, the storage switch 100 sends a LOOP_DOWN event associated with each port in the loop to the processor 138 as the LIP ordered sets 110 are passed along to the next device in the loop.
The new device 108 then sends out a Loop Initialization Select Master (LISM) frame 140 to all devices in the loop 132, one by one, with the address (worldwide name) of the new device 108 initially stored in the LISM frame 140. As other devices in the loop receive the LISM frame 140, the address stored in the LISM frame 140 is checked. If a device has a higher-priority worldwide name (that is, a numerically lower value), it replaces the address in the LISM frame 140 with its own worldwide name. Eventually, the LISM frame 140 arrives back at the new device 108, containing the highest-priority worldwide name of any of the devices in the loop 132. The device with the highest-priority (numerically lowest) worldwide name is designated as the loop initialization master (LIM) of the initialization phase. The new LIM device sends ARB(f0) ordered sets around the loop to indicate that a master has been selected. From that point forward, the device designated as the LIM sends out future initialization frames.
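The LISM phase described above may be sketched, purely for purposes of explanation, as a single pass of a frame around the loop in which the numerically lowest worldwide name prevails. The function name `select_loop_master` and the representation of worldwide names as plain integers are assumptions made for this sketch only.

```python
from typing import List


def select_loop_master(loop_wwns: List[int], start_index: int = 0) -> int:
    """Simulate one trip of a LISM frame around the loop.

    loop_wwns   -- worldwide names of the devices, in loop order (hypothetical integers)
    start_index -- position of the device that originates the LISM frame
    Returns the worldwide name carried by the frame when it arrives back at the originator.
    """
    if not loop_wwns:
        raise ValueError("the loop must contain at least one device")
    count = len(loop_wwns)
    frame_wwn = loop_wwns[start_index]  # originator stores its own name first
    for hop in range(1, count):
        candidate = loop_wwns[(start_index + hop) % count]
        if candidate < frame_wwn:  # numerically lower name replaces the stored one
            frame_wwn = candidate
    return frame_wwn
```

For example, with hypothetical names `[0x30, 0x10, 0x20]` and the frame originating at index 0, the frame returns carrying `0x10`, so the second device in loop order would become the LIM.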
In the example of FIG. 1b, assume that device 104 is designated as the LIM. The LIM then sends out a loop initialization fabric address (LIFA) frame 142 to all devices in the loop 132, one by one, which ensures that if any device was fabric attached and had a fabric address, it will retain that fabric address. In particular, devices with fabric addresses receive the LIFA frame 142 and can request that same fabric address by partially filling in the LIFA frame 142 with their fabric address. As the LIFA frame 142 makes its way around the loop 132, the fabric addresses being reserved are stored in the LIFA frame 142, so the same address cannot be assigned to two different devices. Note that every device on the loop 132 must have a unique one-byte Arbitrated Loop Physical Address (AL_PA), which can have 127 possible values. When the LIFA frame 142 has made it all the way around the loop 132, the LIM receives the LIFA frame 142, signifying the end of the LIFA phase.
The LIM then sends out a loop initialization previous address (LIPA) frame 144 to all devices in the loop 132, one by one, which ensures that if any device had a previous address, it would retain that previous address. In particular, devices with previous addresses receive the LIPA frame 144 and can request that same previous address by partially filling in the LIPA frame 144 with their previous address.
The LIM then sends out a loop initialization hardware address (LIHA) frame 146 to all devices in the loop 132, one by one, which ensures that if any device had a hardware address (e.g. hardcoded by dual in-line package (DIP) switches), it would retain that hardware address. In particular, devices with hardware addresses receive the LIHA frame 146 and can request that same hardware address by partially filling in the LIHA frame 146 with their hardware address.
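The LIFA, LIPA and LIHA phases just described share a common pattern: a frame circulates, each device that does not yet hold an address may request one, and an address already reserved cannot be given to a second device. A simplified sketch of that pattern is given below. The class `LoopDevice`, the function `run_retention_phases`, and the use of the integers 0 through 126 as a stand-in for the 127 permitted one-byte loop addresses are all assumptions of this sketch and do not reflect the actual frame encoding.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set

RETENTION_PHASES = ("LIFA", "LIPA", "LIHA")  # order in which the LIM sends the frames


@dataclass
class LoopDevice:
    name: str
    # Hypothetical map: phase name -> address the device would like to keep.
    requests: Dict[str, int] = field(default_factory=dict)
    assigned: Optional[int] = None


def run_retention_phases(loop: List[LoopDevice], pool_size: int = 127) -> Set[int]:
    """Circulate one frame per phase and return the set of reserved addresses."""
    reserved: Set[int] = set()
    for phase in RETENTION_PHASES:
        for device in loop:  # frame visits the devices one by one
            if device.assigned is not None:
                continue  # device already holds an address from an earlier phase
            wanted = device.requests.get(phase)
            if wanted is None or not 0 <= wanted < pool_size:
                continue  # nothing valid to request in this phase
            if wanted not in reserved:  # an address is never reserved twice
                reserved.add(wanted)
                device.assigned = wanted
    return reserved
```

In this sketch a device whose requested address was already reserved in an earlier phase simply remains unassigned, leaving it to obtain an address in a later phase.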
The LIM then sends out a loop initialization software address (LISA) frame 148 to all devices in the loop 132, one by one, which ensures that if any device had a software address, it would retain that software address. In particular, devices with software addresses receive the LISA frame 148 and can request that same software address by partially filling in the LISA frame 148 with their software address. Optionally, the LIM then sends out a loop initialization report position (LIRP) frame 150 to all devices in the loop 132, one by one, which is a frame that provides the address and a mapping of every device in the loop, and a loop initialization loop position (LILP) frame 152, which shows all devices in the loop 132. Each device on the loop adds its address to the LIRP frame in turn, and the LILP frame repeats the completed LIRP frame so that every device receives the same information about loop position.
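The optional LIRP and LILP phases may be sketched as follows, for purposes of explanation only. The function name `build_position_map` is hypothetical, and the sketch assumes that the LIM contributes its own address first before the frame visits the remaining devices in loop order.

```python
from typing import List, Tuple


def build_position_map(loop_addresses: List[int], lim_index: int) -> Tuple[int, ...]:
    """Collect addresses in loop order (LIRP) and return the copy that is
    then repeated to every device (LILP).

    loop_addresses -- address held by each device, in physical loop order
    lim_index      -- position of the loop initialization master
    """
    count = len(loop_addresses)
    lirp_entries: List[int] = []
    for hop in range(count):
        # Each device appends its own address as the LIRP frame passes through it.
        lirp_entries.append(loop_addresses[(lim_index + hop) % count])
    # The LILP frame carries the completed list unchanged, so every device
    # ends up with the same view of loop position.
    return tuple(lirp_entries)
```

With hypothetical addresses `[1, 2, 4, 8]` and the LIM at index 2, the resulting map is `(4, 8, 1, 2)`.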
If the operations described above are completed without errors, indicating a successful initialization, the LIM sends a CLS ordered set 120 to all devices in the loop 132, one by one. As the CLS ordered set 120 propagates through each device and port in the loop 132, the storage switch 100 sends a LOOP_UP event associated with each port to the processor 138. When the CLS ordered set 120 has propagated all the way back to the LIM, signifying the end of the LIP cycle, the storage switch 100 re-configures those ports in the storage switch 100 connected to a device for normal operation. After all ports are once again configured for normal operation, data can flow again.
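The LOOP_DOWN and LOOP_UP events received by the processor during a LIP cycle may be tracked as sketched below. The class `LoopStateTracker` and its methods are hypothetical names chosen for this illustration; the sketch assumes only that data may flow when no port has an outstanding LOOP_DOWN event.

```python
from typing import Iterable, Set


class LoopStateTracker:
    """Hypothetical per-port bookkeeping of LOOP_DOWN / LOOP_UP events."""

    def __init__(self, ports: Iterable[int]) -> None:
        self._ports: Set[int] = set(ports)
        self._down: Set[int] = set()  # ports currently taking part in a LIP cycle

    def on_event(self, event: str, port: int) -> None:
        if port not in self._ports:
            raise ValueError(f"unknown port {port}")
        if event == "LOOP_DOWN":
            self._down.add(port)
        elif event == "LOOP_UP":
            self._down.discard(port)
        else:
            raise ValueError(f"unknown event {event!r}")

    def data_can_flow(self) -> bool:
        # Data flows again only after every port has reported LOOP_UP.
        return not self._down
```

Using the port numbers of FIG. 1b, a tracker created with ports 118, 112, 114 and 116 would report that data cannot flow from the first LOOP_DOWN event until a LOOP_UP event has been received for every port that went down.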
Because no data can flow during a LIP cycle, it is desirable to minimize the amount of time that the devices are in a LIP cycle in order to maximize throughput. However, problems may occur that prevent the LIP cycle from completing and sending out a CLS ordered set, either indefinitely or for a predetermined amount of time. For example, at a hardware level, one of the devices may malfunction and not pass frames around the loop, or may pass invalid frames. Furthermore, because the storage switch may be partially implemented as a state machine, errors in the state machine may occur. In addition, at a system level, there may be unresolvable contention between devices for the role of LIM in the LISM phase of the LIP cycle, resulting in multiple LISM frames being repeatedly sent (a “LISM storm”) without any resolution of which device is the LIM. There may also be firmware incompatibilities between a device and a storage switch.
If the LIP cycle cannot run to completion as signified by a CLS ordered set, the new device 108 may repeatedly retransmit the LIP ordered set. This retransmission of the LIP ordered set may continue indefinitely. As long as no CLS ordered set is received, it will appear that a LIP cycle is still in progress, and the loop will not be able to resolve itself. The flow of data will remain blocked, reducing throughput to zero.
The initialization problems described above may be difficult to isolate. It is possible that communications within the loop during the execution of a LIP cycle may be completely adhering to the FC specification, and yet there still may be problems with initialization. Because the only manifestation of a problem LIP cycle is the lack of a CLS ordered set, it is also impossible to determine where the failure occurred. In conventional storage switches, one way to terminate a LIP cycle that has not run to completion with a CLS ordered set is to manually take the storage switch down and test each device individually (using a PTBI operation) until the malfunctioning device is located. Another alternative is to cycle power to the storage switch. The storage switch will then treat each device connected to it as a new device being inserted, and will run a PTBI operation on each device.
Therefore, there is a need to monitor, detect and remove malfunctioning devices from an arbitrated loop on a per-port basis without having to manually take the switch down and test each device individually until the malfunctioning device is located.