Viterbi detectors are used in many of today's data receivers to recover digital data from samples of a data signal having a relatively low signal-to-noise ratio (SNR). For example, Viterbi detectors are used in disk-drive read channels to recover the sequence of data values read from a magnetic disk, and are used in cell phones to recover the sequence of data values from a digitized voice signal. Basically, a Viterbi detector considers all of the possible data-value sequences that the data signal can represent and determines from the samples of the data signal which of the possible sequences is most likely to be the correct, i.e., surviving, sequence. Because the complexity of the Viterbi detector is independent of the length of the recovered sequence, it has proven to be one of the most effective circuits for recovering digital-data sequences from signals having relatively low SNRs.
Unfortunately, as discussed below, the add-compare-select (ACS) algorithm that many Viterbi detectors implement often requires fast circuitry having a relatively large number of transistors so that such a Viterbi detector does not unduly limit the rate at which a receiver can process received data. Such a Viterbi detector executes the ACS algorithm for each data-signal sample or group of data-signal samples, and must finish executing the algorithm for one sample or group of samples before moving on to the next sample or sample group. Consequently, the rate at which the receiver samples the data signal and recovers data therefrom is limited to the speed at which the Viterbi detector can execute the ACS algorithm. Unfortunately, the ACS algorithm includes a relatively large number of steps that require a relatively long time for the Viterbi detector to execute. To speed up execution of the ACS algorithm, one can design the Viterbi detector to include fast circuitry that performs many of these steps in parallel. But such circuitry typically includes a relatively large number of transistors that increase the layout area, and thus the cost, of the Viterbi detector.
And although engineers have discovered a compare-select-add (CSA) algorithm that allows a Viterbi detector to have fewer transistors than or to be faster than a Viterbi detector that executes the ACS algorithm, one cannot implement the CSA algorithm in an E2PR4 Viterbi detector.
Referring to FIGS. 1-12, an E2PR4 Viterbi detector, the ACS algorithm, and the CSA algorithm are discussed in more detail. Although this discussion does not include a general overview of the operation of a Viterbi detector, U.S. patent application Ser. No. 09/409,923, entitled “PARITY-SENSITIVE VITERBI DETECTOR AND METHOD FOR RECOVERING INFORMATION FROM A READ SIGNAL”, filed Sep. 30, 1999, includes such an overview and is incorporated by reference.
FIG. 1 is a trellis diagram 10 for a conventional one-sample-at-a-time, i.e., full-rate, E2PR4 Viterbi detector (FIG. 2) that can recover a sequence of binary values from a data signal. The E2PR4 channel is represented by the following discrete-time transfer polynomial:

$$1 + 2D - 2D^3 - D^4 \qquad (1)$$

where $D$ represents a delay of one sample period, $D^3$ represents a delay of three sample periods, and $D^4$ represents a delay of four sample periods. Therefore, the sample $Y_k$ of a data signal at a sample time k has an ideal (no noise) value that is given by the following equation:

$$Y_k = X_k + 2X_{k-1} - 2X_{k-3} - X_{k-4} \qquad (2)$$

where $X_k$ is the binary value of the data signal at sample time k, $X_{k-1}$ is the binary value at sample time k−1, etc. Because each sample Y is calculated from four binary values X, the sequence of binary values X has one of $2^4 = 16$ potential states S0-S15 for each sample time k. Two respective branches 20 (e.g., 20a, 20b, 20c, and 20d) originating from two states S prior to sample time k each terminate at respective states S after the sample time k. For example, the branches 20a and 20b originate at S0 and S8 prior to time k, respectively, and terminate at S0 after time k. Table I includes the ideal sample values Y and the L2 branch metrics as a function of Y for each of the branches 20.
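As a minimal sketch of equation (2), the following Python computes the ideal (noise-free) samples for a given binary sequence. The function name `ideal_samples` and the assumption that all bits before the start of the sequence are zero are illustrative choices and do not come from the description above.

```python
def ideal_samples(bits):
    """Ideal E2PR4 samples: Y[k] = X[k] + 2*X[k-1] - 2*X[k-3] - X[k-4]."""
    def x(i):
        # Assumption: the channel carries zeros before the first bit.
        return bits[i] if i >= 0 else 0

    return [x(k) + 2 * x(k - 1) - 2 * x(k - 3) - x(k - 4)
            for k in range(len(bits))]
```

Feeding a single 1 followed by zeros returns the coefficients of polynomial (1), including the zero coefficient of the absent $D^2$ term.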
TABLE I

Branch 20     Ideal Sample Value Y    L2 Branch Metric
S0 to S0       0                      0
S0 to S1      +1                      1 − 2Y
S1 to S2      +2                      4 − 4Y
S1 to S3      +3                      9 − 6Y
S2 to S4       0                      0
S2 to S5      +1                      1 − 2Y
S3 to S6      +2                      4 − 4Y
S3 to S7      +3                      9 − 6Y
S4 to S8      −2                      4 + 4Y
S4 to S9      −1                      1 + 2Y
S5 to S10      0                      0
S5 to S11     +1                      1 − 2Y
S6 to S12     −2                      4 + 4Y
S6 to S13     −1                      1 + 2Y
S7 to S14      0                      0
S7 to S15     +1                      1 − 2Y
S8 to S0      −1                      1 + 2Y
S8 to S1       0                      0
S9 to S2      +1                      1 − 2Y
S9 to S3      +2                      4 − 4Y
S10 to S4     −1                      1 + 2Y
S10 to S5      0                      0
S11 to S6     +1                      1 − 2Y
S11 to S7     +2                      4 − 4Y
S12 to S8     −3                      9 + 6Y
S12 to S9     −2                      4 + 4Y
S13 to S10    −1                      1 + 2Y
S13 to S11     0                      0
S14 to S12    −3                      9 + 6Y
S14 to S13    −2                      4 + 4Y
S15 to S14    −1                      1 + 2Y
S15 to S15     0                      0
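The entries of Table I can be regenerated from equation (2), as the following hedged Python sketch shows. It assumes a state numbering that is inferred from the table rather than stated in the text: state index = 8·X[k−4] + 4·X[k−3] + 2·X[k−2] + X[k−1]. It also assumes that the L2 branch metric is the squared distance between Y and the ideal value with the common Y² term dropped. The names `e2pr4_branches` and `l2_branch_metric` are hypothetical.

```python
def e2pr4_branches():
    """Map (previous state, next state) -> ideal sample value, 32 branches."""
    branches = {}
    for prev in range(16):
        x1 = prev & 1          # X[k-1] (assumed least-significant state bit)
        x3 = (prev >> 2) & 1   # X[k-3]
        x4 = (prev >> 3) & 1   # X[k-4] (assumed most-significant state bit)
        for bit in (0, 1):     # candidate value of X[k]
            nxt = ((prev << 1) | bit) & 0xF
            branches[(prev, nxt)] = bit + 2 * x1 - 2 * x3 - x4
    return branches


def l2_branch_metric(ideal, y):
    """(y - ideal)**2 with the y**2 term dropped: ideal**2 - 2*ideal*y."""
    return ideal * ideal - 2 * ideal * y
```

For instance, the branch S12 to S8 has an ideal value of −3, so its metric for a sample Y is 9 + 6Y, which agrees with the table.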
FIG. 2 is a block diagram of a conventional full-rate E2PR4 Viterbi detector 40 that operates according to the trellis diagram 10 (FIG. 1) and that includes an add-compare-select unit (ACSU) 42 for implementing the ACS algorithm. In addition to the ACSU 42, the detector 40 includes a branch-metric unit (BMU) 44 and a survivor-memory unit (SMU) 46. The BMU 44 receives the samples Yk (a finite-impulse-response (FIR) filter, not shown, may process these samples before the BMU receives them) and calculates the L2 branch metrics (Table I) for the branches 20 (FIG. 1). Next, the ACSU 42 adds the branch metrics to the respective path metrics stored in the SMU 46 to update the path metrics. Then, for each potential state S0-S15, the ACSU 42 compares the updated path metrics of the two paths terminating at that state S and selects as the surviving path to S the path having the smaller updated path metric. This adding, comparing, and selecting are the general steps of the ACS algorithm discussed above. Next, for each state S0-S15, the SMU 46 stores the respective surviving path and its path metric. The Viterbi detector 40 repeats this process for each subsequent sample Yk. After a predetermined latency, the surviving paths of all the states S0-S15 converge to a single path that the SMU 46 provides as the binary values recovered from the sampled data signal.
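The add, compare, and select steps for one sample can be sketched in Python as follows. This is an illustration under stated assumptions, not the circuitry of the ACSU 42: the state numbering (newest bit in the least-significant position) is inferred from Table I, and the names `predecessors`, `ideal_sample`, and `acs_step` are hypothetical.

```python
def predecessors(state):
    """The two previous states whose branches terminate at `state`."""
    low = state >> 1         # bits shared with the previous state
    return (low, low | 8)    # oldest bit X[k-4] was 0 or 1


def ideal_sample(prev_state, bit):
    """Equation (2) for a branch leaving `prev_state` with X[k] = bit."""
    x1 = prev_state & 1
    x3 = (prev_state >> 2) & 1
    x4 = (prev_state >> 3) & 1
    return bit + 2 * x1 - 2 * x3 - x4


def acs_step(path_metrics, y):
    """One ACS update over all 16 states for the sample y."""
    new_metrics, decisions = [], []
    for state in range(16):
        bit = state & 1                      # X[k] implied by the next state
        candidates = []
        for prev in predecessors(state):
            ideal = ideal_sample(prev, bit)
            branch = ideal * ideal - 2 * ideal * y           # L2 metric
            candidates.append((path_metrics[prev] + branch, prev))  # add
        best_metric, best_prev = min(candidates)             # compare, select
        new_metrics.append(best_metric)
        decisions.append(best_prev)
    return new_metrics, decisions
```

Each call performs 32 additions and 16 comparisons, which gives a rough sense of the work that the detector must complete for every sample.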
Still referring to FIG. 2, the ACSU 42 typically includes a relatively large number of transistors, and thus occupies a significant area of the integrated circuit (not shown) that includes the E2PR4 Viterbi detector 40. Because the tasks that the BMU 44 and SMU 46 implement are relatively simple, the BMU and SMU typically include relatively few transistors, and thus occupy a relatively small area of the integrated circuit. Conversely, as discussed above, the ACS algorithm is relatively complex. Consequently, to avoid becoming the “bottleneck” of the Viterbi detector 40, the ACSU 42 typically includes relatively fast circuitry so that it can execute the ACS algorithm in the same or approximately the same amount of time that it takes the BMU 44 and the SMU 46 to perform their respective tasks. But to make the ACSU 42 fast, one typically designs the ACSU circuitry to execute operations in parallel. Unfortunately, such parallel processing typically requires a relatively large number of transistors.
FIGS. 3-12 illustrate the derivation and implementation of a compare-select-add (CSA) algorithm, which allows one to replace some Viterbi detectors' ACSU with a CSA unit (not shown) that is faster than and/or has significantly fewer transistors than the ACSU 42. The CSA algorithm is further discussed in U.S. Pat. No. 5,430,744, which is incorporated by reference.
Unfortunately, there is no such CSA unit available to replace the ACSU 42 of the full-rate E2PR4 Viterbi detector 40 (FIG. 2).
FIGS. 3 and 4 illustrate the derivation of the CSA algorithm from the distributive law of mathematics.
Referring to FIG. 3, two branches 70 and 72 terminate at state S and have path metrics M and N and branch metrics m and n, respectively. As discussed above in conjunction with FIG. 2, the ACSU 42 calculates M+m and N+n, compares M+m to N+n to determine which is smaller, and then selects the smaller as the surviving path metric and selects the corresponding path as the surviving path. Therefore, a branch 74 that originates from the state S has a path metric Q=min(M+m, N+n).
Referring to FIG. 4, the distributive law allows one to subtract the same value from each of the branch metrics m and n and add this same value back to the path metric Q to achieve the same result as in FIG. 3. For example, a modified Viterbi detector (not shown) subtracts z from the branch metrics m and n. The Viterbi detector calculates M+m−z and N+n−z to update the path metrics, compares M+m−z to N+n−z to determine which is smaller, and then selects the smaller as the surviving path metric for the state S and selects the corresponding path as the surviving path. Therefore, the branch 74 would have a path metric Q′=min(M+m−z, N+n−z). But adding z to Q′ yields Q′+z=min(M+m, N+n)=Q, which is the same result as in FIG. 3.
Still referring to FIG. 4, by choosing z appropriately, one can reduce the complexity of a Viterbi detector significantly by effectively converting its ACSU into a compare-select-add unit (CSAU) (not shown in FIG. 4). The “trick” is to select z so that the modified branch metrics m−z and n−z are constants. As long as the modified branch metrics are constant, their addition to the path metrics M and N can be hardwired into the CSAU, which simplifies the circuitry. Consequently, the CSAU can compute M+m−z and N+n−z with an implicit hardwired adding step, compare M+m−z and N+n−z, and then add z back to the minimum of M+m−z and N+n−z.
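The equivalence discussed in conjunction with FIGS. 3 and 4 can be checked numerically with a short Python sketch. The function names `acs` and `csa` and the example values are hypothetical, and the sketch models only the arithmetic, not any hardwired circuit.

```python
def acs(M, N, m, n):
    """Add first, then compare and select: min(M + m, N + n)."""
    return min(M + m, N + n)


def csa(M, N, m_const, n_const, z):
    """Compare and select with constant offsets, then add z back.

    m_const and n_const stand for the modified branch metrics m - z and
    n - z, which would be hardwired when they are constants.
    """
    return min(M + m_const, N + n_const) + z


# Example: choose z = m so that the first modified branch metric is zero.
M, N, m, n = 7, 5, 2, 6
z = m
result_acs = acs(M, N, m, n)
result_csa = csa(M, N, m - z, n - z, z)
```

With z = m, only the difference n − m remains inside the comparison, so the simplification pays off only when that difference does not depend on the sample.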
FIGS. 5-10 illustrate how one can apply the distributive law discussed above in conjunction with FIGS. 3-4 to a simple butterfly trellis so that one can simplify a corresponding Viterbi detector by replacing its ACSU with a CSAU.
FIG. 5 is a conventional butterfly trellis 80 having four branches 82 (e.g., 82a, 82b, 82c, and 82d) per sample time k. The branches 82 have respective branch metrics ak, bk, ck, and dk.
FIG. 6 is a split-state butterfly trellis 90, which, as will become more evident below, corresponds more closely to the CSA algorithm than the trellis 80 of FIG. 5. To derive the trellis 90 from the trellis 80, one first splits each state S0 and S1 into two nodes 91 (e.g., 91a, 91b, 91c, and 91d) connected by a branch 92. Then one shifts the trellis so that the branches 92 (e.g., 92a, 92b, 92c, 92d, 92e, and 92f) are aligned with the sampling times k. This splitting of the states and shifting of the trellis reflects that the addition step of the CSA algorithm occurs after the comparing and selecting steps. To distinguish the branches 82 from the branches 92, the branches 82 and 92 are called inner and outer branches, respectively.
FIGS. 7-9 illustrate the step-by-step application of the distributive law of mathematics (FIGS. 3-4) to the trellis 90 (FIG. 6) to generate modified branch metrics that allow a Viterbi detector to include a CSAU instead of an ACSU. Because application of the distributive law effectively moves branch metrics from one side of a state node to the other side, modifying the trellis 90 in such a manner is called branch shifting. For example, in FIG. 8, ak is shifted from the branches 82a and 82b to the branch 92a. To accomplish this shift, one adds ak to the branch metric of the branch 92a and subtracts ak from the branch metrics of the branches 82a and 82b. ck is shifted to the branch 92b in a similar manner.
FIG. 10 is the resulting branch-shifted trellis diagram 90. As stated above in conjunction with FIGS. 3 and 4, one can significantly simplify the CSAU if the modified branch metrics for the inner branches 82 are constants. Here, the modified branch metrics for the branches 82a-82c equal zero, so one can simplify the CSAU if the modified branch metric for the branch 82d, ak−bk−ck+dk, is a constant.
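The branch shifting that leads to FIG. 10 can be sketched in Python as follows. The figures are not reproduced here, so the mapping of the branches 82a-82d to state transitions (a: S0 to S0, b: S0 to S1, c: S1 to S0, d: S1 to S1) is an assumption inferred from the path-metric equations later in this section, and the name `shift_butterfly` and the dictionary layout are hypothetical.

```python
def shift_butterfly(a, b, c, d):
    """Return the branch-shifted metrics for one butterfly.

    leaving[s]       -- outer branch preceding the inner branches from state s
    inner[(s, t)]    -- modified inner branch from state s to state t
    entering[t]      -- outer branch following the inner branches into state t
    """
    leaving = {0: a, 1: c}                     # a and c shifted backward
    inner = {(0, 0): 0, (0, 1): 0, (1, 0): 0,  # 82a-82c become zero
             (1, 1): a - b - c + d}            # 82d keeps the residue
    entering = {0: 0, 1: b - a}                # b - a shifted forward
    return leaving, inner, entering


def transition_total(shifted, s, t):
    """Sum of the three shifted pieces along the transition s -> t."""
    leaving, inner, entering = shifted
    return leaving[s] + inner[(s, t)] + entering[t]
```

Summing the three pieces along any transition gives back the original branch metric, so the shift moves metric values around without changing their total along a path.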
Referring to FIGS. 11 and 12, the branch-shifted trellis 90 (FIG. 10) gives the same data-recovery results as the trellis 80 (FIG. 5). Specifically, the path metrics PMXn and PMYn of respective converging paths 100 and 102 through the trellis 90 of FIG. 12 have the same relationship to one another as do the path metrics PMXo and PMYo of the same paths 100 and 102 through the trellis 80. That is, if PMXo>PMYo, then PMXn>PMYn, and if PMXo<PMYo, then PMXn<PMYn. As long as this relationship is retained, PMXn need not equal PMXo, and PMYn need not equal PMYo for a Viterbi detector that includes a CSAU to operate properly. Furthermore, the branches 82 and 92 that do not lie along the paths 100 and 102 are omitted from FIGS. 11-12 for clarity.
Referring to FIG. 11, using the branch metrics of FIG. 5 and assuming that the path metrics PMXo and PMYo for the paths 100 and 102 have values of zero prior to sample time k, PMXo and PMYo at the convergence state 104 are given by the following equations:

$$\mathrm{PMXo} = b_k + c_{k+1} + a_{k+2} + b_{k+3} + d_{k+4} \qquad (4)$$

$$\mathrm{PMYo} = c_k + b_{k+1} + d_{k+2} + c_{k+3} + b_{k+4} \qquad (5)$$
Similarly, referring to FIG. 12, using the modified branch metrics of FIG. 10 and assuming that the path metrics PMXn and PMYn for the paths 100 and 102 have values of zero prior to sample time k, PMXn and PMYn at the convergence state 104 are given by the following equations:

$$\mathrm{PMXn} = a_k - a_k + b_k + c_{k+1} + a_{k+2} + a_{k+3} - a_{k+3} + b_{k+3} + c_{k+4} + a_{k+4} - b_{k+4} - c_{k+4} + d_{k+4} \qquad (6)$$

$$\mathrm{PMYn} = c_k + a_{k+1} - a_{k+1} + b_{k+1} + c_{k+2} + a_{k+2} - b_{k+2} - c_{k+2} + d_{k+2} - a_{k+2} + b_{k+2} + c_{k+3} + a_{k+4} \qquad (7)$$
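Canceling common terms, one obtains:

$$\mathrm{PMXn} = b_k + c_{k+1} + a_{k+2} + b_{k+3} + a_{k+4} - b_{k+4} + d_{k+4} \qquad (8)$$

$$\mathrm{PMYn} = c_k + b_{k+1} + d_{k+2} + c_{k+3} + a_{k+4} \qquad (9)$$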
It is well known that if A>B, then A+C>B+C, and if A<B, then A+C<B+C. Therefore, if both PMXn and PMYn respectively differ from PMXo and PMYo by the same value C, then the relationship between PMXn and PMYn is the same as the relationship between PMXo and PMYo. That is, if PMXo>PMYo, then (PMXo+C=PMXn)>(PMYo+C=PMYn). Likewise, if PMXo<PMYo, then (PMXo+C=PMXn)<(PMYo+C=PMYn). Here, referring to equations (4), (5), (8), and (9), $C = a_{k+4} - b_{k+4}$ for both paths. Consequently, the branch-shifted trellis 90 preserves the relationships between the path metrics with respect to the trellis 80, and is thus mathematically equivalent to the trellis 80.
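The offset between the old and new path metrics can be checked numerically with the Python sketch below. It evaluates the two paths with arbitrary integer branch metrics and compares the original and branch-shifted totals. The branch sequences are read from equations (4) and (5); the branch-to-transition mapping (a: S0 to S0, b: S0 to S1, c: S1 to S0, d: S1 to S1) and all names are assumptions made for illustration.

```python
import random

PATH_X = "bcabd"   # branches taken at k, k+1, ..., k+4 per equation (4)
PATH_Y = "cbdcb"   # branches taken at k, k+1, ..., k+4 per equation (5)


def original_metric(path, metrics):
    """Path metric on the unshifted trellis: sum of the branch metrics."""
    return sum(metrics[t][br] for t, br in enumerate(path))


def shifted_metric(path, metrics):
    """Path metric on the branch-shifted trellis, up to the convergence state."""
    total = 0
    last = len(path) - 1
    for t, br in enumerate(path):
        a, b, c, d = (metrics[t][name] for name in "abcd")
        total += a if br in "ab" else c   # outer branch leaving S0 or S1
        if br == "d":
            total += a - b - c + d        # the only non-zero inner branch
        if br in "bd" and t != last:
            total += b - a                # outer branch entering S1; the one
                                          # after the convergence state is
                                          # not yet counted
    return total


rng = random.Random(1)
metrics = [{name: rng.randint(-9, 9) for name in "abcd"} for _ in range(5)]
offset = metrics[4]["a"] - metrics[4]["b"]   # C = a[k+4] - b[k+4]
```

Because both paths pick up the same offset, a comparison of the two shifted path metrics gives the same decision as a comparison of the two original path metrics.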
Unfortunately, the above-described branch-shifting technique does not allow one to replace the ACSU 42 (FIG. 2) of the E2PR4 Viterbi detector 40 (FIG. 2) with a smaller and/or faster CSAU. As discussed above in conjunction with FIGS. 3-10, a smaller and/or faster CSAU is typically possible only when the modified branch metrics of the inner branches 82 (FIG. 10) are constants. U.S. Pat. No. 5,430,744 discloses a branch-shifting technique that generates constant branch metrics for the inner branches of full-rate PR4 and EPR4 trellises. But unfortunately, there is no known branch-shifting technique that generates constant branch metrics for all of the inner branches of an E2PR4 Viterbi detector.