The present invention relates to encoding and decoding data in communications systems and more specifically to communications systems that encode and decode data wherein output symbols with high weights are generated using basis symbols.
In forward error correcting codes, such as erasure codes, an encoded output symbol is generated based on a set of input symbols. The size of the set of input symbols for a particular encoded output symbol is referred to as the weight of the encoded output symbol. Generally, the higher the weight of an encoded output symbol, the more computationally expensive it is to generate that encoded output symbol.
Some error correcting code systems vary the weight of encoded output symbols over a large range of weights. Examples of such systems include chain reaction coding systems, which are described in Luby I and Luby II. In one embodiment, for example, an encoder generates an output symbol from a key, I, of the output symbol, where the number of possible keys, and therefore output symbols, is much larger than the number of input symbols. The encoder determines a list, AL(I), of W(I) input symbols to be associated with the output symbol and calculates a value, B(I), for the output symbol. The weight W(I) of an output symbol may vary for different values of the key I.
Luby I and Luby II describe various methods and apparatus for calculating W(I) from I, calculating AL(I) from I and W(I), and generating B(I) from one or more of AL(I), W(I) and I. The decoder receives output symbols, and when sufficient output symbols are received, the decoder can calculate values for input symbols from the values of the output symbols. This process is referred to herein as xe2x80x9cchain reaction codingxe2x80x9d because once a decoder decodes an input symbol, that result can be used in combination with information about other received output symbols to decode more input symbols, which in turn may lead to an ability to decode even more input symbols.
Where the encoder and decoder are optimized, the decoder can entirely recover an input file of K input symbols from a received set of K+xcex1 output symbols most of the time. In one decoding process, the decoder waits for receipt of K+xcex1 output symbols. In many communications systems, the channel is not perfect, so some output symbols may have been lost. In general, the decoder does not assume that it has received a specific or contiguous K+xcex1 output symbols, but instead operates on the assumption that the particular K+xcex1 output symbols are randomly distributed among the possible output symbols. In a decoding process, the decoder uses the K+xcex1 output symbols to attempt to decode as many of the input symbols as possible. With the proper selection of xcex1, most of the time, the decoder will decode all of the input symbols before using all of the K+xcex1 output symbols. In the ideal case, the decoder just finishes recovering all of the input symbols when it runs out of output symbols, so that no output symbols are wasted and no more output symbols are needed. Of course, since the decoder cannot control which of the possible output symbols are going to be received at the decoder, some transmissions of K+xcex1 output symbols will include more output symbols than needed and some transmissions of K+xcex1 output symbols might not be enough to decode all of the input symbols, in which case the decoder would take additional output symbols from the channel to complete the decoding process.
In chain reaction coding, for example, a relatively small number of the output symbols may have a high weight. For instance, some output symbols may have weights in the hundreds or thousands of input symbols, or even as high as K. The generation of these high weight output symbols tends to slow down the overall performance of the system. If this is not acceptable, then the cost of the system may be increased in order to handle the computationally expensive generation of these relatively few high weight output symbols.
According to the invention, a method of generating output symbols is provided. Each output symbol is selected from an output alphabet and each output symbol is such that an input file, comprising an ordered plurality of input symbols each selected from an input alphabet, is recoverable from a set of such output symbols. The method comprises generating a plurality of basis elements, wherein each basis element is generated from a predetermined function of associated input symbols associated with the basis element. The method also comprises, for each output symbol, determining a set of associated basis elements associated with the output symbol, and determining a set of direct associated input symbols directly associated with the output symbol. The method further comprises, for each output symbol, generating the output symbol from a predetermined function of the associated basis elements and the direct associated input symbols. For some output symbols, the set of associated basis elements might be the null set and for some other (or even all) output symbols, the set of direct associated input symbols might be the null set.
In another aspect of the invention, a data signal is provided. The data signal comprises output symbols. Each output symbol is selected from an output alphabet and each output symbol is such that an input file, comprising an ordered plurality of input symbols each selected from an input alphabet, is recoverable from a set of such output symbols. The output symbols are generated using a method that includes generating a plurality of basis elements, wherein each basis element is generated from a predetermined function of associated input symbols associated with the basis element. The method also includes for each output symbol, determining a set of associated basis elements associated with the output symbol, and determining a set of direct associated input symbols directly associated with the output symbol. The method additionally includes, for each output symbol, generating the output symbol from a predetermined function of the associated basis elements and the direct associated input symbols.
In yet another aspect of the invention, a method of decoding an input symbol from output symbols is provided. The output symbols are selected from an output alphabet and output symbols are such that an input file, comprising an ordered plurality of input symbols each selected from an input alphabet, is recoverable from a set of such output symbols. Also, each output symbol has a value calculated from one or more direct associated input symbols and/or one or more associated basis elements. The method includes, for each output symbol, determining a set of direct associated input symbols directly associated with the output symbol, and determining a set of indirect associated input symbols indirectly associated with the output symbol. The method also includes, for each output symbol, determining a weight of the output symbol, and initializing a count of the output symbol to the weight of the output symbol. The method additionally includes recovering input symbols using output symbols that have a current count less than or equal to a recovery value, and reducing the count of an output symbol of which a recovered input symbol is an associate to reflect the recovery of the associated input symbol.
One advantage of the present invention is that high weight output symbols may be generated with less computation and with less memory access when the high weight output symbols are generated using basis elements. Another advantage of the present invention is that high weight output symbols may be used in the decoding process with less computation and less memory access when the high weight output symbols are generated using basis elements.