Fibre Channel Arbitrated Loop (FC-AL) architecture is a member of the Fibre Channel family of ANSI standard protocols. FC-AL is typically used for connecting computer peripherals, in particular disk drives. The FC-AL architecture is described in the NCITS working draft proposal, American National Standard for Information Technology “Fibre Channel Arbitrated Loop (FC-AL-2) Revision 7.0”, 1 Apr. 1999.
Electronic data systems can be interconnected using network communication systems. Area-wide networks and channels are two technologies that have been developed for computer network architectures. Area-wide networks (e.g. LANs and WANs) offer flexibility and relatively large distance capabilities. Channels, such as the Small Computer System Interface (SCSI), have been developed for high performance and reliability. Channels typically use dedicated short-distance connections between computers or between computers and peripherals.
Fibre Channel technology has been developed from optical point-to-point communication between two systems or between a system and a subsystem. It has evolved to include electronic (non-optical) implementations and has the ability to connect many devices, including disk drives, in a relatively low-cost manner. This addition to the Fibre Channel specifications is called Fibre Channel Arbitrated Loop (FC-AL).
Fibre Channel technology consists of an integrated set of standards that defines new protocols for flexible information transfer using several interconnection topologies. Fibre Channel technology can be used to connect large amounts of disk storage to a server or cluster of servers. Compared to SCSI, Fibre Channel technology supports greater performance, scalability, availability, and distance for attaching storage systems to network servers.
Fibre Channel Arbitrated Loop (FC-AL) is a loop architecture as opposed to a bus architecture like SCSI. FC-AL is a serial interface, where data and control signals pass along a single path rather than moving in parallel across multiple conductors as is the case with SCSI. Serial interfaces have many advantages including: increased reliability due to point-to-point use in communications; dual-porting capability, so data can be transferred over two independent data paths, enhancing speed and reliability; and simplified cabling and increased connectivity which are important in multi-drive environments. As a direct disk attachment interface, FC-AL has greatly enhanced I/O performance.
Devices are connected to a FC-AL using hardware which is termed a “port”. A device which has connections for two loops has two ports or is “dual-ported”.
The operation of FC-AL involves a number of ports connected such that each port's transmitter is connected to the next port's receiver, and so on, forming a loop. Each port's receiver has an elasticity buffer that captures the incoming FC-AL frame or words and is then used to regenerate the FC-AL word as it is re-transmitted. This buffer exists to deal with slight clocking variations that occur. Each port receives a word, and then transmits that word to the next port, unless the port itself is the destination of that word, in which case it is consumed. The nature of FC-AL is therefore such that each intermediate port between the originating port and the destination port gets to ‘see’ each word as it passes around the FC-AL loop.
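The forwarding behaviour described above can be illustrated with a minimal sketch. It is a simplification under stated assumptions: ports are plain names in a list, elasticity buffers and arbitration are not modelled, and the function name `deliver` is hypothetical rather than part of any FC-AL specification.

```python
def deliver(loop, source, destination):
    """Trace one word sent from `source` to `destination`.

    `loop` lists port names in transmit order; the last port's
    transmitter feeds the first port's receiver. Returns a tuple
    (intermediate ports that saw the word, consuming port or None).
    """
    seen = []
    i = (loop.index(source) + 1) % len(loop)
    while True:
        port = loop[i]
        if port == destination:
            # The destination consumes the word instead of re-transmitting it.
            return seen, port
        if port == source:
            # The word travelled the whole loop without being consumed.
            return seen, None
        # An intermediate port receives the word and passes it on.
        seen.append(port)
        i = (i + 1) % len(loop)


loop = ["HBA", "D0", "D1", "D2"]
intermediates, consumer = deliver(loop, "HBA", "D2")
print(intermediates, consumer)
```

In this sketch a word from the HBA to `D2` is seen by `D0` and `D1` on the way, which mirrors the point that every intermediate port observes traffic passing around the loop.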
FC-AL architecture may be in the form of a single loop. Often two independent loops are used to connect the same devices in the form of dual loops. The aim of these loops is to provide an alternative path to devices on a loop should one loop fail. A single fault should not cause both loops to fail simultaneously. More than two loops can also be used.
FC-AL devices typically have two sets of connections allowing them to be attached to two FC-ALs. Thus, in a typical configuration, two independent loops exist and each device is physically connected to both loops. When the system is working optimally, there are two possible loops that can be used to access any dual-ported device.
A FC-AL can incorporate bypass circuits with the aim of making the FC-AL interface sufficiently robust to permit devices to be removed from the loop without interrupting throughput and sacrificing data integrity. If a disk drive fails, port bypass circuits attempt to route around the problem so all disk drives on the loop remain accessible. Without port bypass circuits a fault in any device will break the loop.
In dual loops, port bypass circuits are provided for each loop and these provide additional protection against faults. A port can be bypassed on one loop while remaining active on the other loop.
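The effect of per-loop bypass circuits can be sketched with a small model. This is an illustration only: the class names, the loop labels "A" and "B", and the rule that an unbypassed faulty port breaks its whole loop are assumptions chosen to match the description above, not a model of real bypass hardware.

```python
from dataclasses import dataclass, field


@dataclass
class Port:
    healthy: bool = True
    bypassed: bool = False


@dataclass
class Device:
    name: str
    # One port per loop; "A" and "B" are arbitrary labels for the dual loops.
    ports: dict = field(default_factory=lambda: {"A": Port(), "B": Port()})


def loop_intact(devices, loop_id):
    """A loop carries traffic only if every faulty port on it is bypassed."""
    return all(d.ports[loop_id].healthy or d.ports[loop_id].bypassed
               for d in devices)


def accessible(devices, loop_id):
    """Names of devices that can be reached over the given loop."""
    if not loop_intact(devices, loop_id):
        return []
    return [d.name for d in devices if not d.ports[loop_id].bypassed]


devices = [Device("D0"), Device("D1"), Device("D2")]
devices[1].ports["A"].healthy = False      # D1's port on loop A fails
broken = accessible(devices, "A")          # loop A is broken for every device
devices[1].ports["A"].bypassed = True      # bypass D1 on loop A only
recovered = accessible(devices, "A")       # loop A works again, without D1
other_loop = accessible(devices, "B")      # D1 is still reachable on loop B
print(broken, recovered, other_loop)
```

The sketch shows both claims at once: without a bypass a single fault takes the loop down, and a port bypassed on one loop remains usable on the other.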
A typical FC-AL may have one or two host bus adapters (HBA) and a set of approximately six disk drive enclosures or drawers, each of which may contain a set of ten to sixteen disk drives. There is a physical cable connection between each enclosure and the HBA in the FC-AL. Also, there is a connection internal to the enclosure or drawer, between the cable connector and each disk drive in the enclosure or drawer, as well as other components within the enclosure or drawer, e.g. SES device (SCSI Enclosure Services node) or other enclosure services devices.
Components in a loop can be categorized as “initiators” or “targets”, or both depending on their function in the loop. For example, a host bus adapter is an initiator and a disk drive is a target. Initiators can arbitrate for a communication path in the loop and can choose a target. A target can request the transfer of a command, data, status, or other information to or from the initiator.
If there is a single initiator in a loop, the initiator will log in with all the targets in the loop. Targets may accept or reject this login attempt. At any later stage a target can log out with any logged-in initiator. In a multi-initiator environment, an initiator operates as both a sender and a recipient of login attempts.
FC-AL products have a 7-bit hard address setting for the FC-AL devices. Other loop topologies may use a different number of bits. Some of the bits are used to identify the enclosure and the remaining bits of the address identify the devices within that enclosure. There must be sufficient bits for all the devices in an enclosure to be identified individually. In one example of a typical FC-AL system, an enclosure address switch sets the most significant 3 bits of the address and the least significant 4 bits of the address are used to differentiate between the 16 devices within an enclosure. The resultant address is of the form [enc-number, slot-number].
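The 3-bit/4-bit split in the example above amounts to simple bit packing. The following sketch assumes exactly that layout (enclosure number in the upper 3 bits, slot number in the lower 4); the function and constant names are hypothetical.

```python
ENCLOSURE_BITS = 3  # set by the enclosure address switch
SLOT_BITS = 4       # distinguishes up to 16 devices in an enclosure


def hard_address(enclosure, slot):
    """Pack [enc-number, slot-number] into a 7-bit hard address."""
    if not 0 <= enclosure < (1 << ENCLOSURE_BITS):
        raise ValueError("enclosure number does not fit in 3 bits")
    if not 0 <= slot < (1 << SLOT_BITS):
        raise ValueError("slot number does not fit in 4 bits")
    return (enclosure << SLOT_BITS) | slot


def split_hard_address(address):
    """Recover (enc-number, slot-number) from a 7-bit hard address."""
    if not 0 <= address < (1 << (ENCLOSURE_BITS + SLOT_BITS)):
        raise ValueError("hard address does not fit in 7 bits")
    return address >> SLOT_BITS, address & ((1 << SLOT_BITS) - 1)


address = hard_address(5, 9)
print(format(address, "07b"), split_hard_address(address))
```

For enclosure 5 and slot 9 the packed value is binary 1011001, that is 101 for the enclosure followed by 1001 for the slot.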
If two enclosures within the same FC-AL loop have the same address switch setting, there will be a bus conflict. The FC-AL addressing scheme is quite sophisticated, so in this case the Loop Initialisation Primitive (LIP) process will result in some of the devices using a method called “soft addressing”.
The nature of FC-AL is that almost all error detection and recovery is on a loop or connection basis. There is almost no link level error recovery. This means that an individual faulty device or link can inject noise into the loop or even break it altogether, rendering it useless for data transfer. In order to overcome this shortcoming, most FC-AL systems are configured using the dual loop arrangement previously described. In such a system, if one loop is rendered inoperative the other loop can be used to recover the system. The failing loop is recovered by arranging for the faulty nodes to be bypassed or electrically removed from the loop.
The algorithm that determines which nodes should be bypassed on the loop is typically implemented by one or more “controlling agents” which might reside in an “outboard controller”, a SCSI enclosure services (SES) device, a Host Bus Adapter (HBA) or a host device driver. For the purposes of this disclosure it is assumed that there is only one controlling agent, residing in an HBA. In order to actually bypass a device, the controlling agent must send a SCSI command to the SES node in the enclosure containing the node, since it is the SES node which has the electrical connection which triggers the bypass circuit. The issue which complicates this task is one of addressing.
Each port in a loop network has a port identifier called a “World Wide Port Name” (WWPN). Each node on a loop in the form of devices or host bus adapters also has a World Wide Node Name (WWNN). These World Wide Names are referred to as Node Names and Port Names. To ensure that the WWPN and WWNN are unique they may contain, for example, a unique identifier of the manufacturer of the device including the port and the manufacturer's serial number of the device. The WWPN is too long (usually 64 bits) to be used for source and destination addresses transmitted over the network and therefore the Arbitrated Loop Physical Address (AL_PA) is used as a temporary address that is unique to the configuration of the network at any given time.
Every device has a World Wide Node Name (WWNN) which never changes and which is known to the HBA. The loop initialisation procedure results in the HBA also knowing the arbitrated loop ID of a faulty device, which is its FC-AL address, known as the Arbitrated Loop Physical Address (AL_PA).
The SES node however does not know this address; it knows the devices only by the slot number they occupy. Thus the command sent to the SES node to bypass the faulty device is addressed to [ses-node, slot-number] and so the controlling agent must have a reliable means at its disposal to translate from the AL_PA or WWNN to [ses-node, slot-number].
The problem of mapping between AL_PA or WWNN and [ses-node, slot-number] is made more difficult by the possibility of “non-participating devices”. A non-participating device is one which looks to the SES controller as if it exists on the arbitrated loop but which has decided not to participate in arbitration and therefore has not acquired an AL_PA. The presence of non-participating nodes renders unsafe any topology based scheme using the physical topology of the loop reported by the Loop Initialisation Loop Position (LILP) phase of the loop initialisation.
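A short sketch shows why a position-based mapping becomes unsafe once a non-participating device is present. It is deliberately simplified: only the drives of one enclosure are modelled, the HBA and SES node are left out of the position list, and the AL_PA values and function names are illustrative assumptions.

```python
# Slot number -> AL_PA acquired during loop initialisation, or None for a
# non-participating device that the SES controller still sees in its slot.
enclosure_slots = {0: 0xE8, 1: None, 2: 0xE4, 3: 0xE2}


def loop_order(slots):
    """AL_PAs in physical order, as a loop position map would list them."""
    return [al_pa for _, al_pa in sorted(slots.items()) if al_pa is not None]


def naive_slot(order, al_pa):
    """Unsafe guess: treat the position on the loop as the slot number."""
    return order.index(al_pa)


def true_slot(slots, al_pa):
    """Ground truth, which the controlling agent does not have directly."""
    for slot, value in slots.items():
        if value == al_pa:
            return slot
    raise KeyError(al_pa)


order = loop_order(enclosure_slots)
faulty_al_pa = 0xE4
guessed = naive_slot(order, faulty_al_pa)
actual = true_slot(enclosure_slots, faulty_al_pa)
print(guessed, actual)
```

Because the device in slot 1 holds no AL_PA, it is missing from the position list, so the faulty device in slot 2 is guessed to be in slot 1 and the bypass command would be sent for the wrong slot.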
If the controlling agent uses an unreliable scheme to map AL_PA to [ses-node, slot-number] when it sends the command to the SES controller, the result is that the wrong device is fenced out. In that situation there would be two devices which cannot be addressed: the faulty device, which remains on the loop, and the healthy device that was bypassed in error. This is disastrous in a RAID environment because any data stored in the devices cannot be accessed.
If the controlling agent had an accurate table which mapped the identity of each device to an identifiable enclosure number and slot number then the fencing out process would be much more reliable.
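The kind of table described above might be sketched as follows. This shows only the lookup side under stated assumptions: how the entries are obtained is not covered here, and the class name, method names, and example identifiers are hypothetical.

```python
from typing import NamedTuple, Optional


class Location(NamedTuple):
    ses_node: int  # enclosure (SES node) that can trigger the bypass
    slot: int      # slot number as known to that SES node


class LocationTable:
    """Maps device identity (WWNN or current AL_PA) to a physical location."""

    def __init__(self):
        self._by_wwnn = {}
        self._wwnn_by_al_pa = {}

    def record(self, wwnn: int, location: Location,
               al_pa: Optional[int] = None) -> None:
        self._by_wwnn[wwnn] = location
        if al_pa is not None:
            # AL_PAs are temporary, so they are indexed via the stable WWNN.
            self._wwnn_by_al_pa[al_pa] = wwnn

    def locate_by_wwnn(self, wwnn: int) -> Location:
        # Raises KeyError for an unknown device rather than guessing a slot.
        return self._by_wwnn[wwnn]

    def locate_by_al_pa(self, al_pa: int) -> Location:
        return self._by_wwnn[self._wwnn_by_al_pa[al_pa]]


table = LocationTable()
table.record(0x2000000000000001, Location(ses_node=2, slot=5), al_pa=0xE4)
table.record(0x2000000000000002, Location(ses_node=2, slot=6))  # no AL_PA
target = table.locate_by_al_pa(0xE4)
print(target)
```

A lookup that fails raises an error instead of returning a best guess, which reflects the concern that bypassing the wrong slot is worse than not bypassing at all. A device without an AL_PA can still be located by its WWNN.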
The aim of the present invention is to provide a method which allows a controlling agent to map the device name in the form of the World Wide Name or AL_PA of all devices on a loop to their physical location.