1. Field of the Invention
This invention relates generally to the field of digital signal processing and more particularly to signal decoders designed to store selected sequences from which a decoded sequence is ultimately retrieved.
2. Background of the Invention
Communications systems such as High Definition Television (HDTV) employ trellis encoding to protect against interference from particular noise sources. Trellis coding requirements for HDTV are presented in sections 4.2.4–4.2.6 (Annex D), 10.2.3.9, 10.2.3.10 and other sections of the Digital Television Standards for HDTV Transmission of Apr. 12, 1995 prepared by the Advanced Television Systems Committee (ATSC). The trellis decoder selects a received symbol sequence as the most likely to be correct, that is, the survivor sequence, according to a signal processing algorithm. The most popular trellis decoding algorithm is the Viterbi algorithm, as described in the paper entitled Convolutional Codes and Their Performance in Communication Systems, by A. J. Viterbi, published in the I.E.E.E. Transactions on Communications Technology, vol. COM-19, 1971, pp. 751–772. In the Viterbi algorithm, there are two widely known techniques for the storage of the survivor sequences from which the decoded sequence is ultimately retrieved. One technique is known as register exchange and the other technique is known as traceback. The theory behind the traceback process is described in Architectural Tradeoffs for Survivor Sequence Memory Management in Viterbi Decoders by G. Feygin et al. published in the I.E.E.E. Transactions on Communications, vol. 41, no. 3, March, 1993. Although relatively simple, the register exchange method requires large power consumption and large area in VLSI implementations, and is therefore restricted to codes having small constraint length. Constraint length is defined as K=v+k, where v is the number of memory elements in the trellis encoder and the code rate is R=k/n. Thus, traceback is the preferred method in the design of moderate to large constraint length trellis decoders.
U.S. Pat. No. 5,841,478, entitled CODE SEQUENCE DETECTION IN A TRELLIS DECODER, issued Nov. 24, 1998 to Hu et al., discloses an all-path traceback network coupled to an all-path trace forward network for the selection of the survivor sequence. The described traceback process is performed to a predetermined depth T, the traceback depth or survivor memory depth, in order to identify a predetermined number of antecedent trellis states. In practice, the traceback interval T is chosen to provide a sufficient period to permit identification of a merged or converged state. The merged state identifies the data sequence with the greatest likelihood of being the true encoded data. The merged state is the trellis decoded data sequence that is selected as the final output data, chosen from among the several candidate sequences. This traceback process is performed in two stages for traceback intervals of T/2, known as epochs. The selection of such epochs or traceback subintervals is arbitrary and selectable by the system designer.
The overall memory size required in the Hu et al. scheme is 3/2*T*N, where T is the predetermined survivor memory depth and N is the number of states in the trellis. In order to achieve satisfactory decoder performance, the survivor memory depth or traceback depth (or traceback interval) T is typically four to six times the code constraint length. The value of N is equal to 2^v, where v is the number of memory elements in the encoder. The latency, or data decoding delay, associated with the Hu et al. algorithm is 3/2*T. While the Hu et al. device was implemented in an ATSC HDTV trellis decoder, which required twelve interleaved decoders, the disclosed technique can be applied to any trellis decoder. Unfortunately, the Hu et al. system is not the most efficient traceback algorithm, and is not as efficient as the register exchange technique with respect to memory size and data decoding delay, or latency. However, it is more efficient than the register exchange algorithm in power consumption and control complexity, as any traceback algorithm would be.
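The sizing relations above can be illustrated with a short Python sketch. The function name aptft_sizing, the depth_factor parameter and the example values are assumptions made for illustration only; they are not taken from Hu et al. or from the ATSC standard.

```python
from fractions import Fraction


def aptft_sizing(v, k, depth_factor=5):
    """Evaluate the sizing relations stated in the text.

    v: number of memory elements in the trellis encoder
    k: numerator of the code rate R = k/n
    depth_factor: hypothetical multiplier; the text says T is
        typically four to six times the constraint length
    """
    K = v + k                          # constraint length, K = v + k
    T = depth_factor * K               # survivor memory (traceback) depth
    N = 2 ** v                         # number of trellis states
    memory = Fraction(3, 2) * T * N    # overall memory size, 3/2*T*N
    latency = Fraction(3, 2) * T       # decoding delay in samples, 3/2*T
    return {"K": K, "T": T, "N": N, "memory": memory, "latency": latency}
```

For instance, an illustrative encoder with v = 2 and k = 1 and a depth factor of 4 gives K = 3, T = 12 and N = 4, so the relations yield a memory size of 72 and a latency of 18 samples.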
The Hu et al. all-path traceback/forward trace (APTFT) system can be described by the block diagram of FIG. 1. The data input 16 to the system consists of a trellis decoder Add-Compare-Select (ACS) unit output per trellis state and per trellis branch, that is, a pointer to the previous state in the trellis. The control inputs consist of clock, enable, reset, any sync signals, and the minimum state per trellis branch. The minimum state per trellis branch is also an ACS output which identifies the state having the minimum path metric (value) at each trellis branch. The control unit generates all of the control signals and read/write addressing of the various memory blocks.
The buffer is a Last In, First Out (LIFO) memory of size (T/2)*N, which temporarily stores the ACS output. Data is written in order of arrival, N states at a time, and is read in reverse order during the following epoch. An epoch is characterized by the size of the buffer memory in input samples (trellis branches), that is, T/2 samples. After each read operation, a new data input is written in the same location.
The all-path traceback unit is directed by the control unit to read the buffer memory from the previous epoch, in the reverse order of storage, and trace back through the trellis for an entire epoch of T/2 samples at a time. As it traces back through the trellis, the all-path traceback unit sends a decoded output to the decoded sequence memory for each of the N states in the trellis. The all-path traceback unit therefore needs N state pointers to identify the N surviving paths in the trellis. The N state pointers are updated for every branch and always point to the previous state in the corresponding path. At the same time that the all-path traceback unit is reading and processing the ACS data 16, which had been buffered on the previous epoch, the forward trace unit is tracing forward through the trellis with the ACS data 16 of the current epoch.
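The traceback over one epoch described above might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the Hu et al. implementation: the function name, the list-of-lists layout of the buffered ACS data, and the decode callable (which stands in for the code-specific mapping from a state transition to a decoded output) are all hypothetical.

```python
def all_path_traceback(epoch_prev, decode):
    """Trace back through one buffered epoch for all N survivor paths.

    epoch_prev: list of T/2 branches in order of arrival; epoch_prev[t][s]
        is the ACS pointer to the previous state of state s at branch t.
    decode: hypothetical callable (prev_state, state) -> decoded output;
        the real mapping depends on the trellis code.
    Returns (outputs, pointers). outputs[j][i] is the decoded output of
    path i for the j-th branch read (reverse order of arrival), and
    pointers[i] is the state of path i at the beginning of the epoch.
    """
    n = len(epoch_prev[0])
    pointers = list(range(n))            # path i ends the epoch in state i
    outputs = []
    for prev in reversed(epoch_prev):    # LIFO read: last branch first
        row = []
        for i in range(n):
            state = pointers[i]
            previous = prev[state]       # follow the ACS pointer one branch back
            row.append(decode(previous, state))
            pointers[i] = previous       # pointer now names the previous state
        outputs.append(row)
    return outputs, pointers
```

The N entries of pointers play the role of the N state pointers, each updated once per branch so that it always names the previous state on its own path.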
The activities of the various units during each new epoch are depicted in the timing diagram of FIG. 2. The input data is written into the buffer memory in normal, forward order and is passed to the all-path traceback unit in reverse order. The decoded output of the all-path traceback unit, for all of the trellis states, is then passed to the decoded sequence memory. This decoded information is read from the decoded sequence memory two epochs later in reverse order. The two reverse read operations cancel each other, causing the final decoded data to appear in the correct forward order. The two epoch delay in the decoded sequence memory unit necessarily requires a memory size of T*N.
At the end of each epoch, the path selection unit updates and freezes the value of the forward trace pointer, P, associated with the minimum state path sent by the ACS unit. This pointer is used for a period of one epoch until the next update occurs. At the boundary of an epoch, the forward trace pointer points to the minimum state path and provides the state associated with this path two epochs earlier. However, as the end of the epoch approaches, the forward state pointer points to the minimum state path at the previous epoch boundary and provides the state associated with this path three epochs earlier. The Hu et al. device actually has two internal pointers (P1 and P2) for each state path which are temporally offset from each other by one epoch. These two pointers will ultimately help identify the trellis decoded bit sequences. The pointer P1 for each state path is updated for every branch with forward trace information, while the pointer P2 is only updated once every epoch. Pointer P1 is the current epoch pointer and pointer P2 is the immediately prior epoch pointer.
Since N states have N survivor state paths, there are 2*N internal pointers in the forward trace unit. At the end of each epoch, each internal pointer points to the beginning state of the same epoch in the corresponding survivor path. These pointers contribute to create the main pointer P. At the end of an epoch, the pointer P2 receives the value of pointer P1 and then P1 is reset and initiates a forward trace through the trellis during the following epoch. The multiplexer unit uses the forward trace pointer, P, to select one of the N decoded sequences from the decoded sequence memory and to forward the selected decoded bit(s) as its output.
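The pointer bookkeeping described above might be sketched in Python as follows. The class and method names are hypothetical, and the way P is formed from P1 and P2 (looking up P2 at the epoch-start state that P1 gives for the minimum state path) is an assumption consistent with the description, which does not spell out the exact combination.

```python
class ForwardTrace:
    """Sketch of the forward trace unit's pointers P, P1 and P2."""

    def __init__(self, n_states):
        self.n = n_states
        self.p1 = list(range(n_states))  # current-epoch pointers, reset state
        self.p2 = list(range(n_states))  # immediately prior epoch pointers
        self.p = 0                       # main forward trace pointer P

    def step(self, prev):
        """Process one branch; prev[s] is the ACS pointer to the previous state of s."""
        # The path now ending in s inherits the epoch-start state of prev[s].
        self.p1 = [self.p1[prev[s]] for s in range(self.n)]

    def end_epoch(self, min_state):
        """Epoch boundary: update P, copy P1 into P2, then reset P1."""
        start_current = self.p1[min_state]  # start of this epoch on the minimum path
        self.p = self.p2[start_current]     # start of the prior epoch on that path
        self.p2 = self.p1
        self.p1 = list(range(self.n))
        return self.p
```

In this sketch step is called once per trellis branch and end_epoch once every T/2 branches, so P stays frozen for a full epoch, as the multiplexer requires.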
For example, at the end of epoch 3, the forward trace pointer, P, indicates the trellis state associated with the beginning of epoch 2 in the minimum path. The pointer P1 points from the ending state to the beginning state of epoch 3, and the pointer P2 points from the ending state to the beginning state of epoch 2. The pointer P2 is then updated with the value of pointer P1, and pointer P1 is then reset. During epoch 4, the value of pointer P is unchanged and points to the beginning state of epoch 2, and this value will be used by the multiplexer to select, out of the N possible sequences, the appropriate decoded sequence DD1 in FIG. 2 that is being read from the decoded sequence memory. The pointer P2 is frozen or unchanged during epoch 4 and points to the beginning state of epoch 3. The pointer P1 is continuously updated with the forward trace during epoch 4. Similarly, at the end of epoch 4, the forward trace pointer P is updated to point to the beginning state of epoch 3, pointer P2 is updated to point to the beginning state of epoch 4, and pointer P1 is reset. During the entirety of epoch 5, the pointer P selects the correct decoded sequence DD2 in FIG. 2, which is being read from the decoded sequence memory. This process continues indefinitely as long as an input signal is available to be processed.
As best understood with reference to FIG. 2, the forward trace will process data up to data D3 (data in epoch 3) in order to permit decoding of the data associated with D1, the decoded data being data DD1 (decoded data from epoch 1), which will occur during epoch 4. Therefore, three epochs are fully processed (epochs 2 and 3 by the forward trace and epoch 1 by the traceback) before the first epoch decoded data DD1 is generated as an output signal. Similarly, in order to output DD2 (decoded data from epoch 2), two epochs (epochs 3 and 4) are processed by the forward trace and one epoch (epoch 2) is processed by the traceback, the process continuing as a sliding window of three epochs at a time. This process implies that the first bit of DD1 has a corresponding survivor memory depth of three epochs or 3/2*T samples, since the first decoded bit is associated with the beginning of epoch 1 and the pointer P is associated with the end of epoch 3. In contrast, the last bit of DD1 has a survivor memory depth of two epochs, or T, since the last decoded bit is associated with the end of epoch 1 and the pointer P is associated with the end of epoch 3. Similarly, the first bit of DD2 is associated with a survivor memory depth of 3/2*T, and the last bit of DD2 is associated with a survivor memory depth of T. This guarantees that any decoded sequence block (DD1, DD2, DD3 . . . ) is associated with a survivor memory depth of at least T.
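The depth figures above can be checked with a small Python sketch. It assumes that the depth falls by one sample per branch between the two endpoints given in the text (3/2*T at the beginning of the block, T at its end); the function name and the offset convention are hypothetical.

```python
def survivor_depth(T, offset):
    """Survivor memory depth seen at a given position in a decoded block.

    T: survivor memory depth parameter (one epoch is T/2 samples)
    offset: position in samples from the beginning of the block's epoch,
        from 0 (beginning of the epoch) to T/2 (end of the epoch)
    The pointer P is fixed at the end of the epoch two epochs after the
    block's own, that is, three epochs after the block begins.
    """
    epoch = T // 2
    if not 0 <= offset <= epoch:
        raise ValueError("offset must lie within one epoch")
    return 3 * epoch - offset
```

With T = 12, the depth runs from 18 samples at the beginning of the block down to 12 at its end, so every position sees a depth of at least T.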
As the data is processed and decoded by the All-Path Traceback/Forward Trace (APTFT) processor, the memory size will consist of T/2*N in the buffer memory plus T*N in the decoded sequence memory, corresponding to a total of 3/2*T*N. Additionally, the Hu et al. algorithm needs a total of 3*N+1 state pointers (N in the all-path traceback unit and 2*N+1 in the forward trace unit, that is, pointer P, N pointers P1 and N pointers P2). The data decoding delay, or latency, in the Hu et al. device is attributable to a one epoch delay (T/2 samples) in the buffer memory, plus a two epoch delay (T samples) in the decoded sequence memory. The total latency is thus a three epoch delay, or 3/2*T samples.
In order to decode a sequence with a survivor memory depth of T, an efficient algorithm will have the characteristic that each bit has an associated survivor memory depth of T. Existing traceback algorithms need to decode entire data blocks per processing cycle with the result that unnecessarily large survivor memory depth exists for all but one bit in the data block. Thus a need exists for an improved trellis decoder memory management scheme in which both memory size and latency values are reduced.