1. Field of the Invention
The present invention relates to Error Correcting Codes (ECC) in general, and more particularly applies to a class of Cyclic Redundancy Check (CRC) codes such as FIRE and similar codes capable of detecting and correcting errors occurring in bursts.
2. Background of the Invention
The rate at which data are transmitted through communications networks has increased dramatically in recent years. Fueled by advances in fiber and optoelectronic devices and by techniques such as Dense Wavelength Division Multiplexing (DWDM), which multiplies the bandwidth of a single fiber by merging many wavelengths onto it, the telecommunications and networking industry has developed devices capable of routing and switching the resulting large amount of data that converges, and thus must be dispatched, at each network node. Typically, routers and switches situated at those network nodes now have to move data at aggregate rates expressed in hundreds of giga (10^9) bits per second, while multi-tera (10^12) bits per second rates must be considered for new devices under development.
Despite the considerable progress made in optoelectronics, which has allowed data to be transported from node to node at such high rates, switching and routing of the data is still done in the electrical domain at each network node. This is because no optical memory is yet available that would permit the temporary storage of the frames of transmitted data while they are examined to determine their final destination. The temporary storage of data must still be done in the electrical domain using traditional semiconductor technologies and memories. However, the electrical technologies based on semiconductors have not enjoyed the same level of improvement as the optoelectronic ones. In particular, the transmission of signals on printed circuit (PC) boards and backplanes suffers from intrinsic limitations due to the transmission medium (PC boards), the cables and the connectors that must be used to realize the interconnections. The state of the art for an electrical link is currently a 2.5 Gbps link, while 5 and 10 Gbps links are considered for future development. However, in order to reach such a transmission rate in the electrical domain while maintaining the bit error rate (BER) at a low level, transmitted data must be encoded. To this end, a so-called 8B/10B code, developed under the auspices of the American National Standards Institute (ANSI) by a Task Group X3T9.3 of the Technical Committee in 1992, has been largely adopted. However, because the 8B/10B code transmits 10 bits for every 8 data bits, its use reduces the actual link bandwidth to 2 Gbps. Hundreds and even thousands of those links need to be used to concentrate and dispatch the flows of bits entering and leaving an electrical Terabit per second switching node. Specifically, at least five hundred 2 Gbps links IN and five hundred 2 Gbps links OUT would be required, per Terabit, to implement a switching node.
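The link budget stated in the preceding paragraph can be checked with a short calculation. The following Python sketch is given for illustration only; the function names `effective_rate_gbps` and `links_per_direction` are hypothetical and are not taken from any standard or from the referenced art.

```python
import math

def effective_rate_gbps(line_rate_gbps, data_bits=8, coded_bits=10):
    """Payload rate of a link whose code sends coded_bits per data_bits."""
    return line_rate_gbps * data_bits / coded_bits

def links_per_direction(aggregate_gbps, line_rate_gbps):
    """Minimum number of encoded links needed to carry aggregate_gbps one way."""
    return math.ceil(aggregate_gbps / effective_rate_gbps(line_rate_gbps))

# 2.5 Gbps line rate with 8B/10B encoding, 1 Tbps (1000 Gbps) of traffic
payload_rate = effective_rate_gbps(2.5)
links_in = links_per_direction(1000, 2.5)
```

With the figures of the text, `payload_rate` is 2 Gbps and `links_in` is 500, which matches the five hundred links required per Terabit in each direction.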
Even though the BER is low, the multiplication of those links and the huge throughput handled by the switching nodes make them susceptible to frequent errors. As an example, assuming that the BER on one link can be specified at 10^-15, an already exceptionally good value, one transmission error may happen about every 8 minutes (i.e. about 500 seconds) in a Terabit switching node (2×10^9 bits per second × 1000 links = 2×10^12 bits per second). Moreover, because the links are encoded, more than a single transmitted bit is likely to be affected after decoding. In the 8B/10B code mentioned hereinabove, a single transmission error can thus span up to 5 decoded bits.
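The error interval quoted above follows directly from the BER and the aggregate throughput. A minimal Python sketch of that arithmetic is given below; the function name `seconds_between_errors` and its parameter names are assumptions introduced for illustration. The BER is passed as its reciprocal (bits per error) so that the computation stays in exact integers until the final division.

```python
def seconds_between_errors(bits_per_error, link_rate_bps, link_count):
    """Mean time between transmission errors for link_count identical links.

    bits_per_error is the reciprocal of the BER (10**15 for a BER of 10^-15).
    """
    aggregate_bps = link_rate_bps * link_count
    return bits_per_error / aggregate_bps

# BER of 10^-15 on each of 1000 links carrying 2 Gbps of payload
interval_s = seconds_between_errors(10**15, 2 * 10**9, 1000)
interval_min = interval_s / 60
```

Here `interval_s` evaluates to 500 seconds, i.e. a little over 8 minutes between errors for the whole switching node.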
On the other hand, the progress of very large scale integration (VLSI) in semiconductor technologies, i.e. continued miniaturization, reduced voltages and increasing memory bit counts, has revealed a type of error known as a “soft error”. Soft errors are changes in the stored data of VLSI devices, that is, flip-flops, register arrays and Random Access Memories (RAMs). A soft error is caused when a high-energy particle traverses the semiconductor substrate (i.e. silicon), leaving a trail of free charges in its wake. These charges are collected in a very short time interval (about 30 ps) by logic circuitry elements. If the product of capacitance and voltage (i.e. the energy) of the circuit element is low enough, the collected charge may change the stored data. There is no permanent damage to the VLSI device. The circuit will function properly after the event; hence the name “soft error”. Radiation-induced soft errors, such as the ones induced by cosmic particles, have been known in the industry for more than 20 years, for example, occurring in dynamic RAMs. It is only recently that the problem has been recognized in VLSI devices, as the progress of integration has led to data bits being stored in low-energy circuits that can be more easily disturbed.
The implication of soft errors occurring in the logic of a Terabit switching node, and of the errors that may occur on the numerous electrical links necessary to implement it, is that measures must be taken to protect against them so as to keep the switching function running error-free. Error correcting codes (ECC) must thus be implemented so that the data packets handled at switching nodes are protected while they traverse them.
In the realm of correcting codes, FIRE codes are burst-error-correcting codes and, thus, are well adapted to cope with the kind of errors occurring on the electrical links as described hereinabove (i.e. in bursts spanning several contiguous bits after decoding). They can also take care of the soft errors of the VLSI devices used to implement the switching function since soft errors generally affect a single bit (i.e. a binary latch or a single bit of a RAM). A description of FIRE codes can easily be found in the abundant literature on ECC. Among many examples, one can refer to ‘Error Control Coding’, a book by Shu LIN and Daniel J. Costello, Prentice-Hall, 1983, ISBN 0-13-283796-X, herein incorporated entirely by reference, and more specifically to Chapter 9 on ‘Burst-Error-Correcting Codes’.
Although FIRE codes can handle the types of errors discussed above, correcting those errors requires an ECC decoder that operates in a time compatible with the handling of data packets by a switching node. A Terabit per second class switching node of the kind considered here concentrates and dispatches traffic through a few tens of ports. Port configurations are typically in the range of 16 to 64 ports. FIG. 1 illustrates a 16-port switch including 16 input ports 100 and 16 output ports 110. The core switching function 120 of a switching node, most often, manipulates small fixed-size data packets 130 including header 131 and data 132. A common size for a data packet is 64 bytes or 512 bits. Port speeds to consider range from 10 to 40 Gbps, corresponding to an OC-192 line of the SONET hierarchy (North American Synchronous Optical NETwork; the European counterpart is called SDH, which stands for Synchronous Digital Hierarchy) for the lower value (10 Gbps) and to an OC-768 line for the higher value (40 Gbps). Keeping in mind that switch ports are designed to actually sustain higher values than those quoted above (there is a speedup factor, e.g. to provide for the segmentation of protocol frames into fixed-size packets), it can be seen that data packets must enter and exit switch 120 through each port at a rate of one packet every 8 nanoseconds to accommodate OC-768 communication lines with a speedup factor of 1.6 (i.e. with actual switch port speed = 40 × 1.6 = 64 Gbps). This is the rate (i.e. 64 Gbps) at which ECC must be able to perform corrections in every output port adapter 140.
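The per-packet time budget derived above can be reproduced as follows. This Python sketch is illustrative only; the function name `packet_time_ns` and its parameter names are assumptions and do not designate any element of FIG. 1.

```python
def packet_time_ns(packet_bits, line_rate_gbps, speedup):
    """Time available per packet, in nanoseconds, at a sped-up port rate.

    A rate of 1 Gbps equals 1 bit per nanosecond, so bits / Gbps gives ns.
    """
    port_rate_gbps = line_rate_gbps * speedup
    return packet_bits / port_rate_gbps

# 64-byte (512-bit) packets on an OC-768 port (40 Gbps) with 1.6 speedup
oc768_budget_ns = packet_time_ns(512, 40, 1.6)
# Same packets on an OC-192 port (10 Gbps) with the same speedup
oc192_budget_ns = packet_time_ns(512, 10, 1.6)
```

For the OC-768 case the budget is 8 ns per packet, as stated in the text; with the same speedup factor an OC-192 port would leave 32 ns per packet.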
Very simple circuitry has long been proposed to decode FIRE and similar codes. The well-known standard technique is an error-trapping decoder, an example of which is shown in section 9.2 of the above-referenced book ‘Error Control Coding’. Improvements have also been disclosed. For example, U.S. Pat. No. 5,936,978 dated Aug. 10, 1999 and titled ‘Shortened FIRE Code Error-Trapping Decoding Method and Apparatus’ describes an improved (faster) error-trapping decoder. Although simple, the error-trapping technique, including all known improvements such as the one of the above-mentioned patent, assumes that one can afford to shift the pattern of received bits so as to determine where corrections (if any) must be performed. Because the semiconductor technology that can be used in practice to implement the necessary logic (i.e. CMOS) is now pushed to its limits of operation, the internal clock period is becoming of the same order of magnitude as the time left to handle a packet. Typically, the internal clock period of CMOS ASIC (Application Specific Integrated Circuit) devices can be tuned down to a 2–4 ns range for the fastest of the devices, with a logic gate propagation time of around 100 picoseconds, while, as stated above, the requirement is to process one 64-byte packet every 8 ns. This leaves only two to four clock periods per packet and makes the state of the art error-trapping technique impractical to use.
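To make the cost of shifting concrete, the following Python sketch models a textbook-style error-trapping decoder in software. It is not the decoder of the cited patent nor of the present invention. The code parameters are assumptions chosen for illustration: a small FIRE code with generator g(x) = (x^9 + 1)(x^5 + x^2 + 1), length n = 279 and correctable burst length l = 5. The names `poly_mod`, `encode` and `trap_decode` are hypothetical.

```python
# Illustrative software model of error trapping for a small FIRE code.
# Assumed example code: g(x) = (x^9 + 1)(x^5 + x^2 + 1), n = 279, l = 5.
# Polynomials are Python ints: bit i holds the coefficient of x^i.

N = 279    # code length, lcm(9, 31)
L = 5      # maximum correctable burst length
DEG = 14   # degree of g(x), i.e. number of check bits
G = (1 << 14) | (1 << 11) | (1 << 9) | (1 << 5) | (1 << 2) | 1

def poly_mod(a, g=G):
    """Remainder of a(x) divided by g(x) over GF(2)."""
    dg = g.bit_length() - 1
    while a.bit_length() - 1 >= dg:
        a ^= g << (a.bit_length() - 1 - dg)
    return a

def encode(message):
    """Systematic encoding: message in the high-order positions,
    DEG check bits in the low-order positions."""
    if message >> (N - DEG):
        raise ValueError("message does not fit in N - DEG bits")
    shifted = message << DEG
    return shifted ^ poly_mod(shifted)

def trap_decode(received):
    """Return (corrected word, number of single-bit shifts performed)."""
    s = poly_mod(received)          # syndrome of the received word
    if s == 0:
        return received, 0          # no detectable error
    for shifts in range(N):
        if s >> L == 0:
            # Burst trapped in the L low-order syndrome bits: it is the
            # error pattern cyclically shifted by `shifts` positions.
            error = 0
            for j in range(L):
                if (s >> j) & 1:
                    error |= 1 << ((j - shifts) % N)
            return received ^ error, shifts
        s <<= 1                     # one shift of the syndrome register
        if s >> DEG:
            s ^= G                  # feedback: reduce modulo g(x)
    raise ValueError("uncorrectable error pattern")

# Example: a 5-bit burst starting at bit position 100 of a codeword
codeword = encode(0x123456789ABCDEF0)
corrupted = codeword ^ (0b10011 << 100)
fixed, shifts = trap_decode(corrupted)
```

In this example the burst is corrected only after 179 single-bit shifts of the syndrome register, whereas the budget discussed above allows two to four clock periods per packet. Even with several shifts performed per clock period, a shift-based search of this kind does not fit within that budget.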
What is desired, therefore, is a way of decoding FIRE and similar ECC codes that matches the data packet processing speed requirement of Terabit per second switching nodes while still using relatively slow standard technologies such as CMOS and without requiring any bit pattern shift.