Failure and error protection techniques are used in a number of communication and storage systems. Failure protection is used to mask the failure of an individual component by providing other means of regenerating the data stream that was handled by the failed component. Error protection, on the other hand, is typically used to mask bursts of errors caused by noise in the transmission system. For example, error correction codes add one or more redundant bits to a digital stream prior to transmission or storage, so that a decoder can detect and possibly correct errors caused by noise or other interference. In a communication network, failure protection typically involves sending a duplicate copy of the data being protected to the receiver. The receiver then selects the “best” copy of the signal. Unfortunately, this level of redundancy results in 50% of the network bandwidth being wasted. Moreover, the system does not typically take advantage of the duplicate signal to correct individual errors in the active, or working, signal.
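By way of illustration only, the error correction idea described above can be sketched with a Hamming(7,4) code, which adds three redundant parity bits to every four data bits so that a decoder can locate and correct any single-bit error. The Hamming code is chosen here purely as a well-known example and is not a code named in the text; the function names hamming74_encode and hamming74_decode are hypothetical.

```python
def hamming74_encode(data_bits):
    """Encode 4 data bits into a 7-bit codeword (illustrative sketch)."""
    d1, d2, d3, d4 = data_bits
    p1 = d1 ^ d2 ^ d4  # parity over codeword positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over codeword positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # parity over codeword positions 4, 5, 6, 7
    # Codeword layout, positions 1..7: p1 p2 d1 p3 d2 d3 d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one flipped bit; return (data_bits, error_position).

    error_position is 0 when no error was detected, otherwise the
    1-based position of the bit that was corrected.
    """
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    error_position = s1 + 2 * s2 + 4 * s3  # syndrome read as a binary number
    if error_position:
        c[error_position - 1] ^= 1  # flip the erroneous bit back
    return [c[2], c[4], c[5], c[6]], error_position


if __name__ == "__main__":
    sent = hamming74_encode([1, 0, 1, 1])
    received = list(sent)
    received[4] ^= 1  # simulate a single-bit error caused by noise
    print(hamming74_decode(received))
```

In this sketch, three of every seven transmitted bits are redundant, and the decoder recovers the original four data bits whenever at most one bit of the codeword is corrupted.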
Thus, a number of techniques have been proposed or suggested for reducing the bandwidth required for failure protection. One proposed technique employs sharing schemes, where a reserve channel is kept open for a number of working channels. When one of these working channels fails, the reserve channel is invoked and protection occurs. Bidirectional Line Switched Ring (BLSR), One for N (1:N) and Resilient Packet Ring (RPR) fall into this category.
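The bandwidth savings of shared protection can be illustrated with simple arithmetic. The following sketch assumes that all channels have equal capacity and that overhead is measured as the fraction of total provisioned capacity held in reserve; the function name protection_overhead is hypothetical.

```python
from fractions import Fraction


def protection_overhead(working_channels, reserve_channels):
    """Fraction of total provisioned capacity reserved for protection.

    Assumes every channel, working or reserve, has the same capacity.
    """
    total = working_channels + reserve_channels
    return Fraction(reserve_channels, total)


if __name__ == "__main__":
    # Duplicate-copy (1+1) protection: one reserve channel per working channel.
    print("1+1:", protection_overhead(1, 1))
    # Shared 1:N protection: one reserve channel for N working channels.
    for n in (2, 4, 8):
        print(f"1:{n}:", protection_overhead(n, 1))
```

Under these assumptions, duplicate-copy protection reserves one half of the capacity, whereas a single reserve channel shared among N working channels reserves only 1/(N+1) of it.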
Unfortunately, the signaling and operational logistics required to implement shared protection across a real network are prohibitive, and none of these techniques is able to offer real-time error correction on its own. Shared protection schemes are therefore relegated either to degenerate cases, such as N+1 connections all going between two points with no intermediate nodes, or individual rings, or to background optical restoration schemes in which the network is reconfigured in non-real time to deal with network outages. In the latter case, the network instead relies on a simple high-level scheme, such as 1+1 protection, for real-time protection against failures.
A need therefore exists for methods and apparatus for improved failure correction schemes that are bandwidth efficient, simple to implement, and operational across any network, and that utilize failure protection information to also correct transmission errors.