In most practical signal transmission applications, several sources of noise and distortion can exist between the source of a signal and its receiver. As a result, there is a strong need to correct errors in the received signal. One solution is to use a coding technique that adds redundant information (i.e., additional bits) to the source signal so that errors in the distorted received signal can be corrected when it is decoded. One such coding technique utilizes low-density parity-check (LDPC) codes. LDPC codes are used because they can be decoded quickly, in time that grows linearly with codeword length.
For large block sizes, LDPC codes are commonly constructed by first studying the behavior of decoders. LDPC codes are capacity-approaching codes, i.e., they can approach channel capacity for standard additive white Gaussian noise (AWGN) channels.
The construction of a specific LDPC code utilizes two main techniques: pseudo-random approaches and combinatorial approaches. Construction by a pseudo-random approach builds on theoretical results that, for large block sizes, give good decoding performance. In general, pseudo-random codes have complex encoders; however, pseudo-random codes with the best decoders can have simple encoders. Various constraints are often applied to help ensure that the desired properties expected at the theoretical limit of infinite block size occur at a finite block size. Combinatorial approaches can be used to optimize properties of small block-size LDPC codes or to create codes with simple encoders.
LDPC codes are linear codes with a sparse parity-check matrix. Sparse here means that the number of non-zero elements is a linear function of the size of the codewords.
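By way of illustration only, the parity-check relationship can be sketched as follows. The matrix H below is a hypothetical toy example chosen for readability (it is far too small to be sparse in any practical sense), and the function names are assumptions introduced for this sketch. A word is a codeword exactly when every parity check is satisfied, i.e., when its syndrome is all zeros.

```python
# Hypothetical toy parity-check matrix: 3 checks (rows) over 6 bits (columns).
H = [
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0, 1],
]

def syndrome(H, word):
    """Return H * word^T over GF(2), one bit per parity check."""
    return [sum(h * w for h, w in zip(row, word)) % 2 for row in H]

def is_codeword(H, word):
    """A word is a codeword if and only if its syndrome is all zeros."""
    return not any(syndrome(H, word))

def density(H):
    """Fraction of non-zero entries in H (small for a real LDPC code)."""
    ones = sum(sum(row) for row in H)
    return ones / (len(H) * len(H[0]))

sent = [1, 0, 1, 1, 1, 0]          # satisfies all three checks
corrupted = [0, 0, 1, 1, 1, 0]     # same word with the first bit flipped
print(is_codeword(H, sent))        # True
print(syndrome(H, corrupted))      # [1, 0, 1]
```

In a practical LDPC code each row and column of H contains only a handful of ones regardless of block size, which is what keeps the cost of evaluating the checks linear in the codeword length.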
It is known that maximum-likelihood decoding of an LDPC code on the binary symmetric channel is an NP-complete problem. Therefore, to ensure fast (linear) decoding, techniques based on iterative belief propagation are used, and these give good approximations. However, because of the nature of belief propagation, the level of noise and other factors, the output of such iterative methods can be a word that is not a codeword.
An output of such an iterative method that does not coincide with the original codeword may nevertheless be a valid codeword. This situation is particularly harmful because the decoder has no means of identifying a valid but erroneous word. Hereafter such a situation will be called a miscorrection.
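The possible outcomes of iterative decoding can be illustrated with a minimal sketch. The decoder below is a hard-decision bit-flipping decoder, a much simpler relative of belief propagation that is used here only because it fits in a few lines; it is not the decoder contemplated above. The matrix H, the flipping rule and all names are assumptions made for this example.

```python
# Hypothetical toy parity-check matrix: 3 checks (rows) over 6 bits (columns).
H = [
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0, 1],
]

def syndrome(H, word):
    """Return H * word^T over GF(2), one bit per parity check."""
    return [sum(h * w for h, w in zip(row, word)) % 2 for row in H]

def bit_flip_decode(H, received, max_iters=10):
    """Repeatedly flip the bits that take part in the most failed checks.

    Returns (word, converged); converged is True when the returned word
    satisfies every parity check.
    """
    word = list(received)
    for _ in range(max_iters):
        s = syndrome(H, word)
        if not any(s):
            return word, True
        # For each bit, count the unsatisfied checks that involve it.
        counts = [sum(s[i] for i, row in enumerate(H) if row[j])
                  for j in range(len(word))]
        worst = max(counts)
        word = [b ^ 1 if c == worst else b for b, c in zip(word, counts)]
    return word, not any(syndrome(H, word))

sent = [1, 0, 1, 1, 1, 0]
# Error in bit 0: the decoder recovers the transmitted codeword.
print(bit_flip_decode(H, [0, 0, 1, 1, 1, 0]))
# Error in bit 3: the decoder converges to a different valid codeword.
print(bit_flip_decode(H, [1, 0, 1, 0, 1, 0]))
# Same input with a single iteration allowed: the output is not a codeword.
print(bit_flip_decode(H, [1, 0, 1, 0, 1, 0], max_iters=1))
```

The second call corresponds to a miscorrection: the returned word passes every parity check, so the decoder reports success even though the word differs from the one transmitted. The third call corresponds to an output that is not a codeword, which the decoder can at least detect.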
There exists a well-known technique called Importance Sampling, a modification of the Monte-Carlo method that concentrates sampling in the region that contributes most to the error probability. One application of Importance Sampling to finding low error rates (at low levels of noise) is the Cole method, presented in a paper by Cole et al. (A General Method for Finding Low Error Rates of LDPC Codes), hereby incorporated by reference. The Cole method deals with so-called trapping sets or near codewords, i.e., words that are not codewords but can be converted to codewords with small effort, and that lead to errors at low levels of noise. A trapping set is a set of variable nodes that is not well connected to the rest of the Tanner graph, forming a relatively isolated subgraph, in a way that causes error conditions in the decoder. Trapping sets depend on the decoder's parity-check matrix and on the decoding algorithm.
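The principle of Importance Sampling can be illustrated on a scalar toy problem that is unrelated to any particular code: estimating the probability that a standard Gaussian noise sample exceeds a threshold t. This is a sketch of the general technique only and is not the Cole method; the threshold, sample count, seed and function names are assumptions made for the example. Samples are drawn from a Gaussian shifted to the threshold, and each sample that exceeds the threshold is weighted by the ratio of the true density to the shifted density.

```python
import math
import random

def exact_tail(t):
    """Exact P(Z > t) for a standard normal Z."""
    return 0.5 * math.erfc(t / math.sqrt(2.0))

def naive_estimate(t, n, rng):
    """Plain Monte-Carlo: fraction of N(0, 1) samples that exceed t."""
    return sum(rng.gauss(0.0, 1.0) > t for _ in range(n)) / n

def importance_estimate(t, n, rng):
    """Sample from N(t, 1) and reweight each hit by the density ratio."""
    total = 0.0
    for _ in range(n):
        x = rng.gauss(t, 1.0)
        if x > t:
            # Ratio of the N(0, 1) density to the N(t, 1) density at x.
            total += math.exp(-t * x + 0.5 * t * t)
    return total / n

rng = random.Random(1)  # fixed seed so that runs are repeatable
t, n = 5.0, 20000
print("exact     :", exact_tail(t))
print("naive     :", naive_estimate(t, n, rng))
print("importance:", importance_estimate(t, n, rng))
```

With these settings only about 3 in 10 million unshifted samples exceed the threshold, so the plain estimate is usually exactly zero, whereas roughly half of the shifted samples contribute to the weighted estimate.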
The second step of the Cole method is used to select dominant codewords and trapping sets (i.e., those having more impact on the probability of error) from a list of codewords.
In contrast to the additive white Gaussian noise (AWGN) channel, a variety of other channel types exhibit inter-symbol interference (ISI), such as partial response (PR) or jitter channels. For these channels, the second step of the Cole method gives significantly different estimates of the error boundary distance for different random codewords. Because of this non-stationary behavior, a set of randomly chosen original codewords must be considered. The straightforward approach is to calculate the arithmetic average of the error boundary distances over a large number of random codewords. It can be shown experimentally that, because of the distribution of the error boundary distance over the random codewords, this average in most cases does not give a good estimate of the impact of a trapping set on the overall error probability, and thus does not allow dominant trapping sets to be reliably identified. Moreover, the estimate of the average distance tends to diverge as the number of random codewords increases.
The error floor phenomenon is associated with all iterative decoding of LDPC codes. It was discovered that error floors under message-passing iterative decoding are usually due to low-weight trapping sets rather than low-weight codewords. Another, rarer type of error is related to the miscorrection events mentioned above.
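One common way of characterizing a candidate trapping set, offered here as an illustrative convention rather than as part of the foregoing description, is by a pair (a, b): a is the number of variable nodes in the set and b is the number of check nodes connected to the set an odd number of times, i.e., the checks that fail when exactly those bits are in error. The following minimal sketch computes this pair; the matrix H and the function name are hypothetical.

```python
# Hypothetical toy parity-check matrix: 3 checks (rows) over 6 bits (columns).
H = [
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0, 1],
]

def trapping_set_profile(H, variable_nodes):
    """Return (a, b) for a set of variable-node (column) indices.

    a: number of variable nodes in the set.
    b: number of checks touched an odd number of times by the set.
    """
    nodes = set(variable_nodes)
    odd_checks = [i for i, row in enumerate(H)
                  if sum(row[j] for j in nodes) % 2 == 1]
    return len(nodes), len(odd_checks)

print(trapping_set_profile(H, [0, 1]))     # (2, 2): two checks left unsatisfied
print(trapping_set_profile(H, [0, 1, 2]))  # (3, 0): support of a codeword
```

A set with b equal to zero is the support of a codeword, while a set with small a and small non-zero b is a near codeword in the sense used above, since few checks indicate that anything is wrong.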
The probability of error could be estimated by running a direct simulation. However, given the actual error levels at the high signal-to-noise ratios found in modern hard disk drives, it is not possible to obtain an actual error probability estimate in a reasonable time.
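The cost of direct simulation can be made concrete with a back-of-the-envelope sketch. For an error probability p estimated from N independent trials, the relative standard error of the estimate is sqrt((1 - p) / (N p)), so reaching a relative error e requires roughly N = (1 - p) / (p e^2) trials. The error probability, target accuracy and simulation throughput used below are illustrative assumptions only and are not taken from any particular device.

```python
import math

def trials_needed(p, rel_err):
    """Trials required for a relative standard error of rel_err at error rate p."""
    return math.ceil((1.0 - p) / (p * rel_err ** 2))

def simulation_days(p, rel_err, codewords_per_second):
    """Wall-clock days needed at a given simulation throughput."""
    return trials_needed(p, rel_err) / codewords_per_second / 86400.0

# Assumed values: error rate 1e-12, 10% relative error, 1e6 codewords per second.
p, rel_err, rate = 1e-12, 0.1, 1e6
print(trials_needed(p, rel_err))
print(simulation_days(p, rel_err, rate))
```

Under these assumed values the simulation would need on the order of 10^14 codewords and more than a thousand days, and the requirement grows in inverse proportion to the error probability.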
Consequently, it would be advantageous if an apparatus existed that is suitable for efficiently estimating error probability of LDPC codes.