1. Field of the Invention
This invention relates to a method and apparatus for transmitting or storing digital data, and more particularly to an improved method and apparatus for converting digital symbols of a fixed length into symbols having a greater fixed length for an error correction code circuit.
2. Description of Related Art
Due to the widespread use of digital data, many people have attempted to devise methods for improving the reliability of digital data transfers. One well-known method for improving the reliability of digital data transfers is to create and append Error Correction Codes (ECCs) to the data that is transferred. In some error correction schemes, ECCs are associated with symbols formed from groups of data bits. Errors in the received data due to distortion during transmission of the symbols can be corrected, thereby allowing recovery of the symbols, or bits of symbols, that would otherwise be lost.
One such error correction code is known as a Reed-Solomon Code. Reed-Solomon Codes are widely used in the computer industry for transmission of digital data to and from data storage devices because of the error propagation properties of the modulation codes commonly used for such transmissions, such as run-length-limited (RLL) codes. Due to these error propagation properties, most errors come in the form of short burst errors. Symbol-based ECCs, such as Reed-Solomon Codes, are inherently good at correcting short burst errors.
Typically, the size of a symbol used in the computer industry is 8 bits of digital data, commonly referred to as a byte. Bytes of data are typically transferred in blocks equal in size to a sector of stored data. Sector sizes typically used in the computer industry are 512 bytes and 1 Kbyte. For simplicity, the size of an ECC symbol has been equal to the size of the data symbols being processed in the particular computer (i.e., 8 bits). However, using an ECC symbol size of 8 bits makes it necessary to transmit more ECC code bits to correct a sector of data corrupted by a typical burst error than would be necessary if a longer ECC symbol size were used. Therefore, valuable storage capacity and transmission efficiency are lost.
For example, a Reed-Solomon Code with a symbol size of n bits can have a codeword (which consists of data symbols and check symbols) of up to $2^n - 1$ symbols. Therefore, a Reed-Solomon Code with a symbol size of 8 can have a codeword of up to $2^8 - 1 = 255$ symbols. A sector of length 1 Kbyte would require 5 such codewords, since each codeword holds at most 255 8-bit symbols, including the check symbols (i.e., four codewords hold fewer than the 1,024 data symbols needed to transmit a 1,024-byte sector). Each codeword is interleaved with the other codewords to provide correction capability for the greatest number of consecutive errors. This is referred to as an interleaving degree of 5. The number of codewords required is determined by the number of data symbols to be protected, while the degree of interleaving is determined by the number of consecutive bits expected to be corrupted in the worst case. For example, if a relatively small number of bits are expected to be corrupted, the interleaving degree can be relatively low. However, if the total number of data symbols expected to be corrupted is large, the degree of interleaving must be greater.
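The codeword count in the example above can be checked with a short calculation. The following is a minimal sketch only; the function names and the `check_per_codeword` parameter are illustrative assumptions and do not appear in the referenced art.

```python
import math


def max_codeword_length(symbol_bits: int) -> int:
    """Maximum Reed-Solomon codeword length, in symbols, for n-bit symbols."""
    return 2 ** symbol_bits - 1


def min_codewords(data_symbols: int, symbol_bits: int,
                  check_per_codeword: int = 0) -> int:
    """Fewest codewords able to carry data_symbols data symbols.

    check_per_codeword is a hypothetical parameter: the number of check
    symbols reserved in each codeword.
    """
    usable = max_codeword_length(symbol_bits) - check_per_codeword
    if usable <= 0:
        raise ValueError("check symbols leave no room for data")
    return math.ceil(data_symbols / usable)


if __name__ == "__main__":
    print(max_codeword_length(8))   # 255
    print(min_codewords(1024, 8))   # 5
```

Even with no check symbols at all, four 255-symbol codewords hold only 1,020 symbols, so a fifth codeword is required for a 1,024-byte sector.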
Having an increased interleaving degree provides longer burst correcting capability. However, use of five codewords significantly increases the number of ECC check symbols required and decreases the reliability of the data (i.e., a higher probability of miscorrection). The length of the error burst which typically occurs in transmissions of data to a storage device is known in the art to usually be short enough that an interleaving degree of 2 is sufficient to correct all the errors which might occur. Therefore, it is inefficient to create a sector of data which has an interleaving degree greater than 2.
One way to transfer a 1 Kbyte sector of data with an interleaving degree of 2 is to increase the size of the Reed-Solomon symbol. A Reed-Solomon symbol of size 9, for example, would have a codeword of a maximum length equal to $2^9 - 1 = 511$ symbols. Therefore, a 1 Kbyte sector could be transferred with an interleaving degree of 2. The number of information symbols is $D \times (2^n - 1) - C$, where D = the interleaving degree, n = the Reed-Solomon symbol size in bits, and C = the number of check symbols. For example, if the interleaving degree is 2 and the number of bits in each symbol is 9, then the number of 9-bit data symbols is $2 \times 511 - C$. Because there are 1,022 9-bit symbols (9,198 bits), there are a sufficient number of bits to transmit the 1,024 8-bit data bytes (i.e., 1 Kbyte sector of data), and also transmit the number of 9-bit check symbols required to correct for burst errors of an expected duration (i.e., 1,024 8-bit bytes require 8,192 bits, leaving 1,006 bits available for check symbols).
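The arithmetic of this example can be reproduced as follows. This is a sketch under stated assumptions: the function names are illustrative, and the count of whole check symbols is derived here from the figures above rather than stated in the referenced art.

```python
def total_symbols(interleave: int, symbol_bits: int) -> int:
    """D * (2^n - 1): symbols available across all interleaved codewords."""
    return interleave * (2 ** symbol_bits - 1)


def information_symbols(interleave: int, symbol_bits: int, check: int) -> int:
    """D * (2^n - 1) - C: symbols left for data after C check symbols."""
    return total_symbols(interleave, symbol_bits) - check


def spare_bits(sector_bytes: int, interleave: int, symbol_bits: int) -> int:
    """Bits left for check symbols after storing the sector's 8-bit bytes."""
    return total_symbols(interleave, symbol_bits) * symbol_bits - sector_bytes * 8


if __name__ == "__main__":
    print(total_symbols(2, 9))         # 1022
    print(total_symbols(2, 9) * 9)     # 9198
    print(spare_bits(1024, 2, 9))      # 1006
    print(spare_bits(1024, 2, 9) // 9) # 111 whole 9-bit check symbols fit
```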
However, use of a Reed-Solomon Code symbol that is longer than the standard byte of data which is typically handled by a computer creates difficulties. For example, the incoming data is typically received along with a byte clock that clocks in an 8-bit byte. Therefore, a new clock must be synchronized to the new symbol size. Also, almost all standard circuitry (such as internal data buses) is physically configured to process data in 8-bit bytes. Therefore, transferring 9-bit symbols presents a problem.
One way to solve these problems is to add "dummy" bits to each 8-bit byte to increase the symbol size, and strip the dummy bits upon decoding the transmission. This solution would make the transmission of the sector substantially slower, since every byte would carry extra bits that convey no information. It should also be noted that the relative order of the bits cannot be disturbed, since it is only by maintaining the relative order of the bits within a symbol, and the symbols within the codeword, that the Reed-Solomon method of error correction can determine the particular bits in error. Therefore, solutions which distribute bits of one symbol among other symbols, and in so doing disturb the order of the bits, make it difficult to correct errors.
A method for converting a symbol of length m into a symbol of length n, where n is greater than m, is described in a copending U.S. application for patent, Ser. No. 07/942,587, entitled "Method and Apparatus for Initializing an ECC Circuit", assigned to the assignee of the present invention, and incorporated herein by reference. The method described in that application requires a staging register, a number of multiplexers, a modulo-x counter, and an output clock generator circuit. When the symbol converter is used to generate symbols having a length n to be stored on a data storage device, the staging register receives a number of input symbols of length m from a storage device controller input circuit in m-bit parallel format. The input symbols are received one at a time and are clocked into the staging register by a clock associated with the input symbol. Each bit of the input symbol is stored in a register cell of the staging register. The input symbols are also coupled from the storage device controller input circuit to a storage device controller output circuit. As the storage device controller output circuit receives each symbol from the storage device controller input circuit, the input symbol is converted to a serial format and transmitted to the storage device for storage therein.
The cells of the staging register within the symbol converter are divided into stages, each stage including m cells. The cells of the staging register are numbered 1 through y, where $y = a \times m$, m = the length of the input symbol, and a = the number of stages in the staging register. When a new input symbol is loaded into the staging register, the contents of cell x are moved to cell x-m, i.e., moved from a first stage to a second stage of the staging register. The new input symbol is loaded into the cells of the first stage of the staging register (i.e., the m cells with the highest numerical designations).
The converter includes n multiplexers, one for each of the n bits of the output symbol. Each multiplexer selects the contents of one of m cells of the staging register to form an output symbol. The particular cells from which the multiplexer can choose are a function of the lengths of the input symbol and the output symbol. The modulo-x counter counts the incoming input symbols. The output of the counter determines which cell each multiplexer selects. The modulus of the modulo-x counter is determined based upon the lengths of the input and output symbols.
The bank of multiplexers creates a moving window into the staging register. FIG. 1 illustrates how the window is moved up through the staging register by the modulo-x counter as each new input symbol is loaded into the staging register. The first output symbol consists of all the bits of the first input symbol, plus the first n-m bits of the second input symbol. The second output symbol consists of the remaining bits of the second input symbol (which are shifted into the second stage of the staging register) plus those bits of the third input symbol (which are loaded into the first stage of the staging register) which are required to create the n bits of the second output symbol. Each subsequent input symbol is loaded into the first stage of the staging register. When the input symbols are loaded, each previously loaded input symbol is shifted to the next stage of the staging register. As n cells of the staging register are filled with the bits from the input symbols, the multiplexers, under the control of the modulo-x counter, select the contents of the appropriate n cells of the staging register to generate output symbols.
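The net effect of the moving window, regrouping a stream of m-bit symbols into n-bit symbols without disturbing bit order, can be modeled in software. The sketch below is not the staging register and multiplexer circuit described above; it substitutes a simple bit accumulator, and it assumes that the first-received bits occupy the most significant positions and that any pad bits are zeros. The name `regroup_bits` is illustrative.

```python
from typing import List, Tuple


def regroup_bits(symbols: List[int], m: int, n: int) -> Tuple[List[int], int]:
    """Regroup m-bit symbols into n-bit symbols, preserving bit order.

    Returns the list of n-bit output symbols and the number of zero pad
    bits appended to complete the last output symbol.
    """
    acc = 0        # bits received but not yet emitted
    acc_bits = 0   # how many bits acc currently holds
    out: List[int] = []
    for s in symbols:
        if not 0 <= s < (1 << m):
            raise ValueError("input symbol does not fit in m bits")
        acc = (acc << m) | s
        acc_bits += m
        while acc_bits >= n:
            acc_bits -= n
            out.append((acc >> acc_bits) & ((1 << n) - 1))
        acc &= (1 << acc_bits) - 1   # keep only the bits not yet emitted
    pad = 0
    if acc_bits:
        pad = n - acc_bits
        out.append(acc << pad)       # pad the final symbol with zeros
    return out, pad


if __name__ == "__main__":
    outputs, pad = regroup_bits([0xFF] * 12, m=8, n=9)
    print(len(outputs), pad)   # 11 3
```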
In such a converter, the modulo-x counter is initialized to zero. Therefore, when the number of bits in the input block is not an integer multiple of the output symbol size, "pad bits" must be inserted in the last output symbol so that it will have the proper output symbol length. For example, if the input symbol size is 8, the output symbol size is 9, and the input block consists of 12 symbols, there are $12 \times 8 = 96$ bits of information in the input block. To transmit all 96 bits as 9-bit output symbols requires eleven 9-bit output symbols. Since eleven 9-bit output symbols have a total of 99 bits, 3 pad bits must be inserted into the output block. Since the modulo-x counter was initialized to zero, the first window will include all the bits of the first input symbol (i.e., 8 bits) and the first bit of the second input symbol. The next output symbol generated will consist of the 7 remaining bits of the second input symbol and the first 2 bits of the third input symbol, etc. When the 11th output symbol is generated, it will consist of the remaining 6 bits from the 12th input symbol. Therefore, 3 pad bits will be required to complete the 11th 9-bit output symbol. The output symbols are coupled to an ECC circuit which generates check symbols. The check symbols are appended to the data that is coupled from the storage device controller input circuit to the storage device controller output circuit.
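The composition of each output symbol in the 12-symbol example can be traced by computing which input symbols each window spans. This is an illustrative sketch only; the function name and the return layout are assumptions, and input symbols are numbered from 1 to match the description above.

```python
from typing import List, Tuple

# One row per output symbol: ([(input symbol number, bits taken), ...], pad bits)
Row = Tuple[List[Tuple[int, int]], int]


def window_composition(num_inputs: int, m: int, n: int) -> List[Row]:
    """List, for each n-bit output symbol, the input bits it contains."""
    total = num_inputs * m
    num_outputs = -(-total // n)          # ceiling division
    rows: List[Row] = []
    for k in range(num_outputs):
        start = k * n
        end = min(start + n, total)       # last window may run out of data
        parts: List[Tuple[int, int]] = []
        pos = start
        while pos < end:
            idx = pos // m                # zero-based input symbol index
            take = min(end, (idx + 1) * m) - pos
            parts.append((idx + 1, take))
            pos += take
        rows.append((parts, start + n - end))
    return rows


if __name__ == "__main__":
    rows = window_composition(12, m=8, n=9)
    print(rows[0])    # ([(1, 8), (2, 1)], 0)
    print(rows[1])    # ([(2, 7), (3, 2)], 0)
    print(rows[10])   # ([(12, 6)], 3)
```

The last row shows the 11th output symbol holding only the 6 remaining bits of the 12th input symbol, which is why 3 pad bits are needed.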
Because the output of the ECC circuit is not complete until the 3 pad bits have been coupled to the ECC circuit, and because the data coupled from the storage device controller input circuit to the storage device controller output circuit is transmitted to the data storage unit as it is received, there is a delay between the time the last bit of the last data symbol is sent from the storage device controller output circuit to the data storage device and the time the ECC check symbols are complete. This delay is equal to the time required to input the pad bits to the ECC circuit. Therefore, since the data and the check symbols must be written as a contiguous stream of data bits to the data storage device, null data is output by the storage device controller output circuit while the storage device controller output circuit waits for the ECC check symbols to be completed. The null data corresponds to the pad bits being input to the ECC circuit. Hence, the pad bits are, in effect, written to the storage device as part of the data field, thereby taking up storage space which could be used to store bits which carry information.
The amount of space lost can become significant. For example, in a disk drive storage device which is capable of storing 100 Mbytes of data, organized in 512-byte sectors, there are approximately 200,000 sectors. If each sector had the maximum number of pad bits (i.e., in a system having an 8-bit to 9-bit converter, 8 pad bits would be the maximum), approximately 200,000 bytes of storage space would be lost.
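The figures in this example can be checked with the following sketch. The function names are illustrative assumptions. The calculation also shows that a 512-byte sector passed through an 8-bit to 9-bit converter needs exactly the 8-bit maximum, since 4,096 bits is one bit more than a multiple of 9.

```python
def pad_bits(block_symbols: int, m: int = 8, n: int = 9) -> int:
    """Pad bits needed so that block_symbols m-bit symbols fill whole n-bit symbols."""
    return (-block_symbols * m) % n


def lost_bytes(num_sectors: int, pad_per_sector: int) -> int:
    """Whole bytes of storage consumed by pad bits across all sectors."""
    return num_sectors * pad_per_sector // 8


if __name__ == "__main__":
    print(pad_bits(12))                        # 3
    print(pad_bits(512))                       # 8
    print(lost_bytes(200_000, pad_bits(512)))  # 200000
```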
Therefore, there is a need for a method and apparatus for converting symbols of length m into symbols of length n, where n is greater than m, without the need to store extra pad bits which carry no useful information. The present invention provides such a method and apparatus.