The present invention is disclosed by means of implementation examples for Storage Area Networks using the Fibre Channel protocol.
Nevertheless, the invention can be applied in general to similar networking contexts.
A SAN or Storage Area Network is a network whose primary purpose is to transfer data between computer systems and storage elements and among storage elements.
A widely used protocol in SANs is the FC (Fibre Channel) protocol as standardised by ANSI (American National Standards Institute): ANSI X3.303-1998, Fibre Channel Physical and Signalling Interface-3 (FC-PH-3).
Such a protocol allows for an optimised transfer of large data blocks with high performance.
In the following, basic information on the FC protocol is described in order to better clarify the implementation examples.
The FC protocol provides both the features of an I/O bus and the flexibility of a network protocol, and is suited both to short distance networks (copper line) and to long distance networks (optical fibre) up to a few hundred km. FC network topologies comprise a point-to-point topology, a ring type topology named Arbitrated Loop and arbitrary meshed topologies named switched fabrics.
The FC protocol is organised, according to the standard, on 5 layers, approximately corresponding to the first 4 layers of the OSI stack.
The FC-0 layer defines the physical layer of the transmission and describes the electric and optical parameters of the transmitters and receivers of an FC port. Therefore the FC-0 layer defines an analogue interface (generally an optical interface) towards the physical connection, and a digital interface towards the FC-1 layer, with the information being exchanged with the FC-1 layer in the form of bit streams.
The FC-1 layer defines the transmission protocol, and deals with the line coding, the generation and termination of the control frames, the delimitation of the data frames, and the management of the flow control on the link. Moreover, the FC-1 layer defines the operational states of a transmitter and a receiver, and in general of an FC port which, according to the FC standard, comprises a receiver and a transmitter. In the FC protocol, and more precisely at the FC-1 layer, a channel encoding of the 8B/10B type is used to improve the error detection mechanisms and to help the recovery of bit, byte and word (made up of 4 bytes) synchronisation.
The basic information unit transmitted on the link is a sequence of 10 bits, defined as a “transmission character” (TC), belonging to an alphabet of 1024 (2^10) elements. 512 of these elements are used to code data bytes (8 bits); for each of the 256 (2^8) possible bytes of data there exist two coding transmission characters. Moreover, according to the 8B/10B coding, further twelve characters of the alphabet are reserved for the transmission of service information.
According to the FC protocol, only one of the twelve characters is used: this character is named “comma”, and it is used at the receiving side for identifying the beginning of an “Ordered Set”.
In a system using the FC protocol the stream of exchanged information comprises a sequence of Transmission Words (TWs), each of which is a sequence of 4 TCs.
TWs are divided into data words and control words. The control TWs are called “Ordered Set” and are distinguished from the data TWs by the first of the four 10B TCs which, in control TWs, is a “comma”.
Such ordered sets are used to delimit a data frame or an end-to-end control frame (SOF, Start Of Frame, and EOF, End Of Frame), to exchange information concerning buffer-to-buffer flow control at link layer (R_RDY), and to implement a number of procedures related to the management of a link or port state (“primitive signals”). In particular, the ordered sets of the IDLE type are used for keeping the link active and initialised when there are no frames to be transmitted, thus allowing the receiver to keep the synchronisation at bit, character and word levels.
These ordered sets are distinguished one from another by the informative content of the 10B characters in the second, third and fourth positions of the transmission word TW.
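By way of illustration only, the distinction between control and data transmission words described above can be sketched in Python as follows. The sketch assumes that the “comma” is the K28.5 special character of the 8B/10B code, whose two 10-bit encodings are written here with the first transmitted bit as the most significant bit; the constant and function names are hypothetical and are not taken from the FC standard.

```python
# Assumption: the "comma" of the text is the K28.5 special character.
# Its two 10-bit encodings (one per running-disparity column) are written
# with the first transmitted bit as the most significant bit.
COMMA_RD_NEG = 0b0011111010
COMMA_RD_POS = 0b1100000101
COMMAS = {COMMA_RD_NEG, COMMA_RD_POS}


def is_ordered_set(word):
    """Return True if a transmission word (four 10-bit TCs) is a control word.

    Following the text, a control word ("Ordered Set") is recognised by a
    comma in the first of its four transmission characters.
    """
    if len(word) != 4:
        raise ValueError("a transmission word is made up of exactly 4 TCs")
    return word[0] in COMMAS


def ordered_set_payload(word):
    """Return the 2nd, 3rd and 4th TCs, which tell ordered sets apart."""
    if not is_ordered_set(word):
        raise ValueError("not an ordered set")
    return tuple(word[1:])
```

The actual values of the second, third and fourth characters that identify SOF, EOF, IDLE or R_RDY are deliberately left out of the sketch, since they are defined by the standard and not reproduced here.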
A transmitting FC port has to interpose a minimum of 6 ordered sets of the IDLE or R_RDY types between two subsequent frames, while in a received signal there must be at least two ordered sets belonging to such categories (IDLE or R_RDY types) between two frames.
Namely, along the FC link between the two client devices IDLE ordered sets can be either extracted or inserted to compensate for possible differences between the synchronism frequencies of the receiver and the transmitter. The lower limit of 2 ordered sets is used for allowing the receiver enough time to process a received data frame before the arrival of the next one.
When preparing the data to be transmitted according to the 8B/10B coding, the choice of which of the two characters to use for coding a data byte is made by trying to balance the average number of bits 1 and bits 0 in the stream of the transmitted data, so as to reduce as much as possible the low frequency component of the signal, as known to a person skilled in the field.
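Purely as an example, the inter-frame spacing rule described above can be checked with the following Python sketch. The stream is modelled, as an assumption made for simplicity, as a list of labels in which "FRAME" stands for a whole frame (SOF to EOF) and "IDLE" or "R_RDY" for one ordered set; all names are hypothetical.

```python
FILL_TYPES = {"IDLE", "R_RDY"}
TX_MIN_GAP = 6  # minimum number of ordered sets a transmitter interposes
RX_MIN_GAP = 2  # minimum number of ordered sets a receiver must still see


def gaps_between_frames(stream):
    """Count the IDLE/R_RDY ordered sets between each pair of frames."""
    gaps = []
    count = None  # None until the first frame has been seen
    for item in stream:
        if item == "FRAME":
            if count is not None:
                gaps.append(count)
            count = 0
        elif item in FILL_TYPES and count is not None:
            count += 1
    return gaps


def gap_rule_holds(stream, minimum):
    """Return True if every inter-frame gap has at least `minimum` sets."""
    return all(gap >= minimum for gap in gaps_between_frames(stream))
```

A stream that left the transmitter with 6 ordered sets between two frames still satisfies the receiving-side rule after up to 4 of them have been extracted along the link.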
The difference between the number of transmitted 0s and 1s is defined as “running disparity” (RD). At the transmission side such difference is used for choosing the characters through which the data bytes are coded. At the receiving side it is recalculated and used as a further tool for checking the presence of errors in the received data stream.
A transmission character with a number of bits 1 larger than that of bits 0 is defined as having a positive RD, whereas RD is defined negative when the number of bits 0 is larger than the number of bits 1. If the numbers of bits 0 and of bits 1 are equal, then the RD of the character is defined “neutral”. At the receiving side, the RD of a receiver is defined as the RD value expected for the next character.
Like a transmission character, each transmission word has an RD of its own, which can be positive, negative or neutral.
After receiving an ordered set of the EOF, IDLE, or R_RDY type (and generally for all the ordered sets except SOF), the expected RD for the next character to be received is always negative. In this way such ordered sets can be extracted from or inserted into the data stream at the intermediate nodes of an FC connection without altering the RD of the stream (and without requiring a recalculation of it).
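The above definitions can be illustrated by the following minimal Python sketch, which classifies the disparity of a 10-bit transmission character and of a 4-character transmission word simply by counting bits. The function names are hypothetical, and the return convention (+1 positive, -1 negative, 0 neutral) is an assumption of the sketch.

```python
def _sign(ones, total_bits):
    """+1 if ones outnumber zeros, -1 if zeros outnumber ones, else 0."""
    zeros = total_bits - ones
    return (ones > zeros) - (ones < zeros)


def character_disparity(tc):
    """Disparity of one 10-bit transmission character."""
    if not 0 <= tc < (1 << 10):
        raise ValueError("a transmission character has exactly 10 bits")
    return _sign(bin(tc).count("1"), 10)


def word_disparity(word):
    """Disparity of a transmission word made up of four 10-bit characters."""
    if len(word) != 4:
        raise ValueError("a transmission word is made up of exactly 4 TCs")
    ones = sum(bin(tc).count("1") for tc in word)
    return _sign(ones, 40)
```

For instance, the character 0011111010 contains six bits 1 and four bits 0 and is therefore positive, whereas 1001110100 contains five of each and is neutral.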
When a received character does not match the expected RD value, or the received character is merely invalid (i.e. not used for exchanging meaningful information and therefore not belonging to the set of control characters or to the set of characters for coding data bytes), such character generates a “code violation error”. Namely, at the receiver each character is decoded by looking for the corresponding 8B byte in the set of the 10B TC having the expected RD value. When the RD of the received word is different from the expected one, it cannot be successfully decoded thus causing a code violation error.
Also in case of error, the RD of the 10B received word is anyhow used for updating the actual count of RD and therefore the expected RD for the next 10B word.
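The receiving-side check described above can be sketched, under simplifying assumptions, as follows. The decoding table is a tiny illustrative excerpt (the K28.5 and D0.0 code groups of the 8B/10B code) rather than the full code; the RD update is done per whole character by counting bits, which is a simplification of what a real decoder does; and all names are hypothetical.

```python
NEG, POS = "-", "+"

# Illustrative excerpt only: (expected RD, 10-bit code) -> decoded symbol.
# A real decoder holds the complete 8B/10B table.
DECODE_TABLE = {
    (NEG, 0b1001110100): "D0.0",
    (POS, 0b0110001011): "D0.0",
    (NEG, 0b0011111010): "K28.5",
    (POS, 0b1100000101): "K28.5",
}


def next_expected_rd(expected_rd, code):
    """Update the expected RD from the received character, even on error."""
    ones = bin(code).count("1")
    if ones > 5:
        return POS
    if ones < 5:
        return NEG
    return expected_rd  # neutral character: expected RD is unchanged


def receive(codes, expected_rd=NEG):
    """Decode characters, flagging code violations; return (symbols, RD)."""
    symbols = []
    for code in codes:
        # Look the character up only among those valid for the expected RD.
        symbol = DECODE_TABLE.get((expected_rd, code))
        symbols.append(symbol if symbol is not None else "CODE_VIOLATION")
        expected_rd = next_expected_rd(expected_rd, code)
    return symbols, expected_rd
```

In this sketch a character that is valid in itself but arrives with the wrong expected RD is reported as a code violation, and the expected RD for the following character is nevertheless updated from the character actually received.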
In recent years there has been an increasing interest in extending SANs to metropolitan and geographical areas, although this type of network was originally devised for limited areas such as intra-building networks. The main reason for such interest is the possibility of keeping backup copies of strategic information at different locations, thereby safeguarding data integrity in case of catastrophic events (fires, earthquakes, floods, etc.) that are likely to strike only one of such locations.
The above extension of the SAN implies the transport of FC signals through the networks used in geographical-metropolitan connections, such as the SDH (Synchronous Digital Hierarchy) transmission system, also known as SONET (Synchronous Optical NETwork), and WDM (Wavelength Division Multiplex), in order to use transport means already developed for other types of services. From an architectural point of view the interconnection between two FC devices is illustrated in the general scheme of FIG. 1, where two devices labelled FC Client are connected to each other through terminal apparatuses of a WAN/MAN (Wide Area Network/Metropolitan Area Network) network and the FC stream is properly adapted to the transmission along the network. In this scheme, the FC clients represent any devices equipped with FC ports, such as switches of an FC SAN, computer systems with an FC HBA (Host Bus Adapter) or storage systems with FC controllers.
In case an SDH network is used, the FC signal is inserted into the SDH frame at the transmission side and extracted at the receiving side in accordance with both standardised and proprietary mapping procedures.
In case a WDM network is used, the FC signal is typically transmitted without any mapping (transparent transport), that is the adaptation operation simply consists of a regeneration of the signal coming from (or going to) the switch.
In order to increase the connection reliability of the network, particularly against failure and malfunctions, both SDH and WDM networks use protection techniques in which the signal is simultaneously transmitted along two separate paths between source and destination, i.e. an active (“working”) path and a backup (“protection”) path, respectively.
With reference again to FIG. 1, a WDM or an SDH/SONET network is shown connecting FC client apparatuses for exchanging information among them. Each client is connected to the network via a terminal device, such as an OADM (Optical Add and Drop Multiplexer) or an SDH/SONET ADM (Add and Drop Multiplexer). The network provides two links or paths connecting the terminal devices, which are capable of switching from one path to the other, i.e. of selecting either of the two signals. For simplicity's sake, the information flows have been indicated in one direction only, although the connections (links) are bi-directional.
Under normal operating conditions, the terminal device switch selects the signal coming from the working path whereas in case of a failure in the working path, the signal from the “protection” path is selected.
The time required to detect the failure and to switch the path is typically of the order of a few tens of milliseconds, and for such time interval the connection between the two client apparatuses (e.g. FC clients) is to all practical purposes unavailable. Moreover, because of such interruption and since the lengths of the two paths are always different, after a protection switching the client apparatus receives a signal that is not properly timed with respect to the one received just before the failure.
The above drawback afflicts both an FC link transported on a WDM link that is protected at optical level, and an FC link transported on a SONET/SDH connection and protected according to known schemes.
Therefore both in optical WDM networks and in SDH/SONET networks a path switching event causes a time interval in which the signal reaching the receiver of an FC port is absent or does not carry meaningful information.
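As a purely illustrative model of the behaviour just described, the following Python sketch returns which path, if any, delivers a meaningful signal to the FC receiver at a given instant. The function name, the parameter names and the millisecond time base are assumptions of the sketch; the different propagation delays of the two paths are not modelled.

```python
def selected_path(t_ms, fail_at_ms, switch_time_ms):
    """Return the path feeding the receiver at time t_ms, or None.

    Before the failure the terminal device selects the working path.
    During detection and switching no meaningful signal reaches the
    receiver. Afterwards the protection path is selected.
    """
    if t_ms < fail_at_ms:
        return "working"
    if t_ms < fail_at_ms + switch_time_ms:
        return None  # signal absent or not carrying meaningful information
    return "protection"


def outage_ms(fail_at_ms, switch_time_ms, horizon_ms):
    """Count the milliseconds in [0, horizon_ms) without a usable signal."""
    return sum(
        1
        for t in range(horizon_ms)
        if selected_path(t, fail_at_ms, switch_time_ms) is None
    )
```

With a failure at 100 ms and a switching time of 50 ms, the receiver sees no meaningful signal from 100 ms up to, but not including, 150 ms.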
The intervention of a protection scheme along the optical path of a FC link can cause a number of problems to the FC receiver, relating, in particular, to the FC-0 and FC-1 layers.
Two of such problems are considered with more details in the following.
The optical signal received by the FC port undergoes a continuity interruption. As a consequence the receiver detects a LOS (Loss Of Signal) that drives it, according to the FC standard, from a “Synchronisation Acquired” state into a “Loss Of Synchronisation” state.
The same state transition may occur when receiving a sequence of error-containing transmission words.
In particular such a situation occurs when the FC receiver receives an optical signal that does not carry meaningful information.
More particularly, a receiver reaches a Loss Of Synchronisation state either after detecting a LOS (Loss Of Signal), or after receiving four consecutive invalid transmission words, i.e. words containing: (a) a code violation in at least one of the 4 TCs forming the TW (i.e. a 10B character that is neither among the characters coding data bytes nor among the 12 characters reserved for transmitting control information); or (b) a 10B character belonging to the 12 control characters in the second, third or fourth position; or (c) a running disparity error.
The FC port transition to a Loss Of Synchronisation state has two negative consequences: (a) when the link is available again, a delay occurs before the data stream can restart, as the FC port must regain the Synchronisation Acquired state; and (b) when the FC link is not a point-to-point connection but rather belongs to an FC “fabric”, the temporary inactivity of the FC port due to the state change is perceived by the FSPF (Fabric Shortest Path First) routing protocol (which is part of the FC protocol) as a topological modification of the fabric, requiring a reconfiguration procedure of the fabric and a search for a new path in the fabric in order to connect the FC ports concerned again.
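The conditions listed above can be summarised by the following minimal Python sketch of a receiver state monitor. It follows the description given in this text (a LOS, or four consecutive invalid transmission words) and is not a reproduction of the complete state machine of the FC standard; re-acquisition of synchronisation is not modelled, and all class, method and state names are hypothetical.

```python
SYNC_ACQUIRED = "SYNC_ACQUIRED"
LOSS_OF_SYNC = "LOSS_OF_SYNC"


def word_is_invalid(code_violation, control_char_in_pos_2_to_4, rd_error):
    """A word is invalid if any of the three listed conditions holds."""
    return code_violation or control_char_in_pos_2_to_4 or rd_error


class SyncMonitor:
    """Tracks the synchronisation state of a receiver (simplified)."""

    def __init__(self, invalid_limit=4):
        self.state = SYNC_ACQUIRED
        self.invalid_limit = invalid_limit
        self.invalid_run = 0  # consecutive invalid words seen so far

    def on_los(self):
        """A Loss Of Signal drives the receiver out of synchronisation."""
        self.state = LOSS_OF_SYNC
        return self.state

    def on_word(self, invalid):
        """Process one received transmission word."""
        if self.state == SYNC_ACQUIRED:
            self.invalid_run = self.invalid_run + 1 if invalid else 0
            if self.invalid_run >= self.invalid_limit:
                self.state = LOSS_OF_SYNC
        return self.state
```

In this simplified model a single valid word resets the count, so that three invalid words followed by a valid one leave the receiver in the Synchronisation Acquired state.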
Experimental tests carried out by the Applicant have shown that in a WDM optical connection, a protection intervention with a switching time of less than 50 ms could cause a traffic interruption of up to 20 seconds and longer, because the FC ports are driven into a Loss Of Synchronisation state.
Such an unnecessary traffic interruption is considered by the Applicant a very significant problem, in particular for applications requiring real time consistency of data in a plurality of locations.
The same problem, verified for FC protocol, may occur for other protocols, wherein a short link interruption may cause a long traffic interruption due to network reconfiguration procedures.
The same problem, described with reference to interruptions of the signal due to a path switching event, can occur in the same way in every situation in which there is a brief and temporary interruption of the meaningfulness of the signal reaching the receiver.
FIG. 2 shows a block diagram of a generic receiver of an FC port, according to Prior Art. Such scheme illustrates the processing of the received information at layer FC-0 and at part of layer FC-1 and will be used as a reference for describing the invention.
A photodiode 12 usually implements the conversion from the optical signal S1 coming from the fibre to an electric signal. The serial bit stream at the output of the photodiode 12 is led as an input to a PLL device 13 to recover the bit synchronism signal from the received signal by frequency and phase locking with a local oscillator.
Both the received data stream and the bit synchronism signal are input to a serial-parallel converter 14 that converts the serial data stream into a 10-bit parallel stream by using the bit synchronism signal. The output of converter 14 comprises a 10-bit transmission character and a character synchronism signal having a frequency (rate) equal to 1/10 of the bit synchronism signal frequency. The 10-bit parallel signal at the output of the converter is sent to a 10B/8B decoder 15 which converts the data characters into the corresponding bytes, and detects the “comma” characters and the code violation events.
The 8-bit parallel stream at the output of the decoder 15, together with the byte synchronism signal and the signal acknowledging the receipt of the comma character, is sent to a byte-word demultiplexer 16 which provides a 32-bit parallel signal and a word synchronism signal at its outputs.
An FC transmitter, not shown in the drawings, provides for components performing the reverse functions, namely a multiplexer, an encoder, a parallel-serial converter and an electro-optical converter.
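The grouping performed by converter 14 and demultiplexer 16 can be illustrated, in software form and by way of example only, by the following Python sketch. It assumes that the recovered serial stream is available as a string of '0' and '1' characters and that character and word boundaries are found by searching for the full 10-bit K28.5 code, which is a simplification of the comma detection performed in actual hardware; the names are hypothetical.

```python
# Assumption: the two 10-bit encodings of K28.5, first transmitted bit first.
COMMA_PATTERNS = ("0011111010", "1100000101")


def align_and_split(bits):
    """Align a serial bit string on the first comma and split it into words.

    Returns a list of transmission words, each a list of four 10-bit
    strings. Bits before the first comma and any incomplete trailing
    character or word are discarded.
    """
    hits = [bits.find(pattern) for pattern in COMMA_PATTERNS]
    hits = [h for h in hits if h != -1]
    if not hits:
        return []  # no comma found: no alignment is possible
    start = min(hits)
    # Serial-parallel conversion: one 10-bit character every 10 bit periods.
    chars = [bits[i:i + 10] for i in range(start, len(bits) - 9, 10)]
    # Word demultiplexing: one transmission word every 4 characters.
    return [chars[i:i + 4] for i in range(0, len(chars) - 3, 4)]
```

The 10B/8B decoding step of decoder 15 is left out of the sketch, so that the words are returned as undecoded 10-bit characters.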
Such a receiver, according to the known network architectures, undergoes all previously described problems.
US 2003/0076779 A1 discloses an apparatus for detecting and suppressing corrupted data frames transported from a protected SONET network using an encapsulation/de-encapsulation technique allowing the passage through the SONET link of FC data frames only and blocking those carrying control information.
Applicants remark that the solution according to US 2003/0076779 A1 is only able to remove some types of corrupted data frames in order to maintain the integrity of the flow control mechanism based on the buffer-to-buffer credit count as in the FC standard, and is not directed to solving the above mentioned problem.
U.S. Pat. No. 6,600,753 discloses a method for sanitizing a data frame in a Fibre Channel Arbitrated Loop by replacing the first transmission word found to be erroneous in a data stream with an ordered set of the EOF type and replacing the remainder of the frame with fill (buffering) frames. The remaining FC traffic (control and IDLE frames) is left unchanged even if it contains errors.
Applicants remark that said document does not teach how to prevent the interpretation of a short link interruption, e.g. an interruption that can be generated by a protection intervention in the transport network, as an error condition at layers FC-1 and FC-2.