In the information transmission and processing area, multiple communication channels may be used to transmit a piece of information. The communication channels are often noisy and have a probability of incorrectly transmitting a data bit, such a probability being referred to as a “probability of error.” That is, with an input of binary data 1, a communication channel may output an erroneous binary data 0, and vice versa. Similarly, in the data storage area, multiple storage cells are used to store data. Due to noise or external disturbance, a data bit stored in a storage cell may be changed, so that the data bit read from the storage cell is not the same as the data bit written into the storage cell. The probability that the stored data bit is changed is also referred to as a “probability of error.”
To reduce error in the transmission or storage of information/data, and thereby reduce the probability of error, the information/data to be transmitted or stored is usually encoded by an error-correcting method before being transmitted or stored. Hereinafter, both information/data transmission and storage are collectively referred to as information transmission to simplify description. Thus, unless otherwise specified, “information transmission,” “transmitting information,” or similar phrases should be understood to mean “information/data transmission and/or storage,” “transmitting and/or storing information/data,” etc. Further, information to be transmitted is also referred to as “information” to simplify description, unless otherwise specified. As an example of coding information, bits of the information and several frozen bits are encoded to form encoded bits, which are then transmitted through communication channels or stored in storage cells. Such coding can be considered as a transformation of an input vector, which consists of the bits of the information and the frozen bits, by a generator matrix to an output vector, which consists of the encoded bits to be transmitted through the communication channels or stored in storage cells. Each input bit corresponds to a bit-channel of such transformation, and each bit-channel has a corresponding probability of error.
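The transformation of an input vector by a generator matrix can be illustrated with a minimal sketch. The text above does not specify the generator matrix; the sketch assumes the commonly used 2×2 polar kernel [[1, 0], [1, 1]] raised to a Kronecker power, without any bit-reversal permutation. The names `kron_power` and `encode` are hypothetical and chosen only for this illustration.

```python
def kron_power(kernel, n):
    """Return the n-fold Kronecker power of a binary matrix (list of lists)."""
    result = [[1]]
    for _ in range(n):
        # Kronecker product result (x) kernel: each entry of result is
        # replaced by that entry times the whole kernel.
        result = [
            [a * b for a in row_r for b in row_k]
            for row_r in result
            for row_k in kernel
        ]
    return result


def encode(u, generator):
    """Multiply the input vector u by the generator matrix over GF(2)."""
    length = len(generator[0])
    return [
        sum(u[i] * generator[i][j] for i in range(len(u))) % 2
        for j in range(length)
    ]


# Assumed kernel; the source does not name a specific generator matrix.
KERNEL = [[1, 0], [1, 1]]
G4 = kron_power(KERNEL, 2)

# Input vector: positions 0 and 2 carry frozen bits (0), positions 1 and 3
# carry information bits (an arbitrary split used only for this example).
u = [0, 1, 0, 1]
x = encode(u, G4)
print(x)  # [0, 0, 1, 1]
```

Each position of `u` plays the role of one bit-channel input, and `x` is the output vector of encoded bits that would be transmitted or stored.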
Polar coding is a type of linear block error correcting coding method that can “redistribute” the probability of error among the bit-channels. After polar coding, some bit-channels have a lower probability of error than other bit-channels. The bit-channels having a lower probability of error are then used to transmit the information, while other bit-channels are “frozen,” i.e., used to transmit the frozen bits. Since both the sender side and the receiver side know which bit-channels are frozen, arbitrary data can be allocated to the frozen bit-channels. For example, a binary data 0 is allocated to each of the frozen bit-channels.
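The "redistribution" of the probability of error can be illustrated numerically. The sketch below assumes a binary erasure channel, which the text above does not mention, because its bit-channel erasure probabilities follow a simple closed-form recursion: a channel with erasure probability e splits into a worse channel with 2e − e² and a better one with e². The function names and the index ordering are assumptions for illustration; other conventions differ by a bit-reversal of the indices.

```python
def bit_channel_erasure_probs(eps, n):
    """Erasure probabilities of the 2**n bit-channels for a BEC(eps).

    Assumption: a binary erasure channel stands in for the generic noisy
    channel of the text, with one particular index ordering.
    """
    probs = [eps]
    for _ in range(n):
        # Each channel splits into a degraded and an upgraded channel.
        probs = [v for e in probs for v in (2 * e - e * e, e * e)]
    return probs


def split_bit_channels(probs, num_info):
    """Pick the num_info most reliable bit-channels; freeze the rest."""
    order = sorted(range(len(probs)), key=lambda i: probs[i])
    info = sorted(order[:num_info])
    frozen = sorted(order[num_info:])
    return info, frozen


probs = bit_channel_erasure_probs(0.5, 3)
info, frozen = split_bit_channels(probs, 4)
print(probs)   # values spread out from 0.5 toward 0 and 1
print(info)    # [3, 5, 6, 7]
print(frozen)  # [0, 1, 2, 4]
```

Starting from eight identical channels with erasure probability 0.5, the resulting bit-channels range from about 0.004 to about 0.996, so the four most reliable ones carry the information and the remaining four are frozen.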
However, the construction of polar codes (the codes for polar coding) imposes certain restrictions on the code length of a conventional polar code. In the present disclosure, the conventional polar code is also referred to as a “standard polar code.” Correspondingly, a polar coding scheme using a conventional polar code is also referred to as a “conventional polar coding scheme” or a “standard polar coding scheme.” More particularly, the conventional polar coding scheme limits the code length to a power of 2, i.e., 2^n, where n is a positive integer. This introduces an additional complexity into a system employing polar coding. One solution to this problem is dividing information being encoded into segments having an appropriate length to fit the coding scheme, to create length-compatible polar codes.
Exemplary approaches to creating length-compatible polar codes include, for example, puncturing and shortening. Both approaches achieve an arbitrary code length by reducing the code length from an original length of 2^n so that some bits are not transmitted. However, as the code length is shortened from a length of 2^n, the error-correcting performance loss of the code, as measured by, e.g., bit error rate (BER) or frame error rate (FER), increases. FIG. 1 schematically shows the relationship between the code length of a code and the performance loss of the code in the puncturing or the shortening approach. In FIG. 1, a higher degree of gray indicates a more severe performance loss. As shown in FIG. 1, when the code length equals a power of 2, there is no performance loss. When the code length decreases from a power of 2, the performance loss increases.
However, such exemplary approaches are not suitable for application in certain scenarios, such as data storage in a memory device. This is because, for example, in a memory device, data is usually stored in units each having a size that is a multiple of 8, such as 1024, and adding a small number of frozen bits to each block for coding makes the code length slightly larger than 2^n. In this scenario, the puncturing or the shortening approach must start from the next larger power of 2 and remove nearly half of the bits, which results in a severe performance loss as shown in FIG. 1.
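The arithmetic behind this scenario can be sketched as follows. The count of 64 frozen bits is a hypothetical value chosen only for illustration, since the text gives no specific number, and the helper names `mother_code_length` and `removed_bits` are likewise assumptions.

```python
def mother_code_length(target_length):
    """Smallest power of 2 that is greater than or equal to target_length."""
    if target_length < 1:
        raise ValueError("target_length must be a positive integer")
    return 1 << (target_length - 1).bit_length()


def removed_bits(target_length):
    """Bits that puncturing or shortening must drop from the mother code."""
    return mother_code_length(target_length) - target_length


data_bits = 1024    # storage unit size taken from the text
frozen_bits = 64    # hypothetical count, not given in the text
target = data_bits + frozen_bits

print(mother_code_length(target))  # 2048
print(removed_bits(target))        # 960
```

With these assumed numbers, a target length of 1088 forces a mother code of length 2048, so 960 of its bits are not transmitted or stored, which places the code far from a power of 2 in terms of FIG. 1.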