1. Field of the Invention
The present invention relates, in general, to network data communications, and, more particularly, to an extended link service that provides information about link incidents on a fibre channel fabric.
2. Relevant Background
Fibre Channel is a high performance serial interconnect standard designed for bi-directional, point-to-point communications between servers, storage systems, workstations, switches, and hubs. It offers a variety of benefits over other link-level protocols, including efficiency and high performance, scalability, simplicity, ease of use and installation, and support for popular high level protocols.
Fibre channel employs a topology known as a "fabric" to establish connections (paths) between ports. A fabric is a network of switches for interconnecting a plurality of devices without restriction as to the manner in which the switches can be arranged. A fabric can include a mixture of point-to-point, circuit switched, and arbitrated loop topologies.
In Fibre channel, a path is established between two nodes where the path's primary task is to transport data from one point to another at high speed with low latency, performing only simple error correction in hardware. The fibre channel switch provides flexible circuit/packet switched topology by establishing multiple simultaneous point-to-point connections. Because these connections are managed by the switches or "fabric elements" rather than the connected end devices or "nodes", fabric traffic management is greatly simplified from the perspective of the device.
To connect to a fibre channel fabric, devices include a node port or "N_Port" that manages the fabric connection. The N_Port establishes a connection to a fabric element (e.g., a switch) having a fabric port or "F_Port". Devices attached to the fabric require only enough intelligence to manage the connection between the N_Port and the F_Port. Fabric elements include the intelligence to handle routing, error detection and recovery, and similar management functions. Fabric elements provide these functions by implementing a variety of services. Although some basic services are required to be provided by all fibre channel devices, a wide variety of optional or extended services can be implemented to provide additional functionality.
The fibre channel structure is defined as a five layer stack of functional levels, not unlike those used to represent network protocols. The five layers define the physical media and transmission rates, encoding scheme, framing protocol and flow control, common services, and the upper level application interfaces. FC-0, the lowest layer, specifies physical characteristics of the media, transmitters, receivers and connectors. FC-1 defines the 8B/10B encoding/decoding scheme used to integrate the data with the clock information required by serial transmission techniques. FC-2 defines the framing protocol for data transferred between ports, as well as the mechanisms for using Fibre Channel's circuit and packet switched service classes and the means of managing the sequence of a data transfer. FC-2 is often referred to as the "link level". FC-3 is undefined and currently is not used. FC-4 provides integration of FC-2 level frames with existing standards and protocols such as FDDI, HIPPI, IPI-3, SCSI, Internet Protocol (IP), Single Byte Command Code Set (SBCCS), and the like.
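For reference, the five functional levels described above can be summarized as a simple lookup table. This is only an illustrative sketch; the name FC_LAYERS is hypothetical, and the descriptions are condensed from the preceding paragraph.

```python
# Illustrative summary of the five fibre channel functional levels.
# The name FC_LAYERS is hypothetical; descriptions paraphrase the text above.
FC_LAYERS = {
    "FC-0": "physical characteristics of media, transmitters, receivers, connectors",
    "FC-1": "8B/10B encoding/decoding scheme",
    "FC-2": "framing protocol, service classes, sequence management (the 'link level')",
    "FC-3": "common services (undefined and currently not used)",
    "FC-4": "mapping of upper level protocols such as SCSI and IP onto FC-2 frames",
}

for level, role in FC_LAYERS.items():
    print(f"{level}: {role}")
```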
Services can be readily provided at the FC-2 and FC-4 levels (or the FC-3 level when implemented). Each data packet sent between fabric elements includes a "header" that includes fields holding addressing and other packet-specific information. A certain number of codes are reserved in the header to distinguish packets that are providing services from packets that are transferring user-level data. Upon receipt of a packet having a recognized service code in the header, a receiving device knows that the payload data is not regular information traffic. If the receiving device offers the service specified by the code, it will execute routines to implement the service. If the receiving device does not offer the service specified by the code, it returns a packet to the sender with a header code indicating that the particular service is not supported. This system enables the variety of services offered by a particular device to be expanded so long as code space remains to uniquely identify each service.
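The service-code handling described above may be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the fibre channel wire format: the Packet and Device classes, the reserved code range, and every numeric code value are hypothetical and were chosen only to show the three outcomes (user-level data, a supported service, and an unsupported service).

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# All numeric values below are hypothetical, chosen only for illustration.
RESERVED_SERVICE_CODES = range(0x20, 0x30)  # header codes set aside for services
NOT_SUPPORTED = 0x2F          # reply code meaning "service not supported"
ECHO_SERVICE = 0x21           # a service this example device offers
UNIMPLEMENTED_SERVICE = 0x22  # a reserved code this device does not offer
USER_DATA = 0x01              # any code outside the reserved range


@dataclass
class Packet:
    code: int             # header field identifying the packet type
    payload: bytes = b""


class Device:
    def __init__(self) -> None:
        self.services: Dict[int, Callable[[Packet], Packet]] = {}
        self.user_data: List[bytes] = []

    def offer(self, code: int, handler: Callable[[Packet], Packet]) -> None:
        """Register a routine that implements the service for this code."""
        self.services[code] = handler

    def receive(self, packet: Packet) -> Optional[Packet]:
        # Codes outside the reserved range carry regular information traffic.
        if packet.code not in RESERVED_SERVICE_CODES:
            self.user_data.append(packet.payload)
            return None
        handler = self.services.get(packet.code)
        if handler is None:
            # Service not offered: tell the sender which code was rejected.
            return Packet(NOT_SUPPORTED, bytes([packet.code]))
        return handler(packet)


device = Device()
device.offer(ECHO_SERVICE, lambda p: Packet(ECHO_SERVICE, p.payload))
```

Adding a new service in this sketch amounts to registering one more handler under an unused reserved code, which mirrors the observation that the set of services can grow for as long as code space remains.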
A "link incident" refers to an event that has affected or may affect the ability to transfer data over a link. A link incident may be caused, for example, by a variety of hardware or software failures, limitations in the original fabric design, or unexpected traffic volume. For example, a broken or disconnected fibre optic link, a port electronics failure, misconfigured cabling, a degrading transmit laser, or the like may result in a link incident. Not all link incidents render the link non-operational, and the port in which the link incident occurs may have some ability to recover from any particular incident without affecting other devices in the fabric.
After a link disruption or malfunction such as loss-of-light on the link interface, it is useful for a centralized intelligence to analyze the link incident and take appropriate remedial action. This is particularly true in communication systems where data is transported asynchronously, because a link failure or degradation cannot be immediately and unambiguously detected simply from a failure to receive data. As a result, prior systems without link incident reporting may not detect a link failure or degradation and instead continue to route data over the failed link.
Moreover, to effectively address a link incident, it is useful to have information about the incident in addition to merely detecting that an incident has occurred. However, prior systems lack a mechanism for exchanging link incident information in a manner that lends itself to centralized collection and analysis of the link incident data. As a result, link incidents were often difficult to diagnose and correct. Accordingly, a need exists for a system and method for providing link incident report management in an asynchronously connected data communication environment.