The present invention relates to a method for estimating phase error, for example, of a timing recovery loop. More specifically, a preferred embodiment relates to efficiently selecting the starting phase for a timing recovery process in a sampled data detection system, such as a PRML or EPRML channel in a magnetic recording system.
Digital mass data storage devices, such as disk drives, record digital sequences onto media and retrieve them from an analog signal that is corrupted by noise from various sources. It is desired to achieve the highest recording density per unit area while attaining an acceptable probability of error between the signal recorded and that retrieved. To achieve this goal, a disk drive's read channel uses equalization and coding prior to processing in a digital signal processor. One solution is to use run-length limited (RLL) codes with an RLL encoder and decoder.
Digital mass data storage devices use RLL codes, applied to data only, to improve signal-to-noise ratio (SNR), to implement frequent updates to the timing recovery and automatic gain control (AGC) loops, or both. The RLL codes use two parameters, d and k, controlling the minimum and maximum number of symbol intervals between transitions in the input signal, respectively. For given d and k, the RLL code dictates at least "d+1", and at most "k+1", symbol intervals between transitions. Conventionally used codes are those with (d, k) constraints of (1, 7) and (2, 7), generally used with peak detection methods.
These asynchronous methods detect single pulses. The "k" constraint ensures that a non-zero channel output is produced with some minimum frequency to maintain robust operation of the timing recovery and AGC loops. The "d" constraint ensures acceptable SNR with peak detection.
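As an illustration of the (d, k) constraint described above, the following minimal sketch checks whether a bit sequence keeps every pair of consecutive transitions between d+1 and k+1 symbol intervals apart. It assumes, for illustration only, that a 1 marks a transition and a 0 marks no transition; the function names are hypothetical, and runs before the first and after the last transition are not checked.

```python
def transition_spacings(bits):
    """Distances, in symbol intervals, between consecutive transitions (1s)."""
    positions = [i for i, b in enumerate(bits) if b == 1]
    return [later - earlier for earlier, later in zip(positions, positions[1:])]


def satisfies_rll(bits, d, k):
    """True if every spacing between transitions lies in [d + 1, k + 1]."""
    return all(d + 1 <= s <= k + 1 for s in transition_spacings(bits))


if __name__ == "__main__":
    sequence = [1, 0, 1, 0, 0, 0, 1]  # spacings of 2 and 4 symbol intervals
    print(satisfies_rll(sequence, d=1, k=7))
    print(satisfies_rll(sequence, d=2, k=7))
```

In this sketch the sample sequence meets the (1, 7) constraint but not the (2, 7) constraint, because one spacing is only two symbol intervals.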
Conventional methods for estimating the initial phase error for zero phase restart (ZPR) have been applied to determine "zero-crossings" of analog, i.e., continuous-time, signals. Such methods are inappropriate for handling discrete, or digital, pulses.
Some conventional disk drives, for example, use continuous-time peak detection designs to recover digital data written as a series of magnetic transitions on a recording surface of a rotating magnetic disk. A voltage controlled oscillator (VCO) uses an "Enable" command for controlled starting and stopping of the oscillator. When the "Enable" command is asserted, the VCO begins oscillating in a known state. The rising edges of the clock's transitions occur at a fixed delay interval after "Enable" is asserted.
In turn, "zero phase restart" (ZPR), also sometimes referred to as zero phase start, senses a logic transition of a read gate control signal ("Rd Gate") from inactive to active, and disables the VCO. Upon arrival of a subsequent "transition edge" at the ZPR logic, "Enable" is reasserted and the timing control circuit VCO is restarted. A timing delay block compensates for the delays associated with detection and restart, so that the next transition edge and the first clock output coincide at the input to the phase-frequency detector. The starting phase error is thereby brought near zero and the acquisition time of the phase-locked loop (PLL) is reduced.
One effect that limits the recording density in mass data storage systems is intersymbol interference (ISI). ISI is endemic to the band-limited nature of the head/media combination and results in overlapping responses. That is, at a given time, the output signal contains the response due to the input signal and the responses from some previously recorded symbols. This overlap increases as recording density or disk rotation speed is increased, yielding overlap patterns that are generally very difficult to decode.
To reduce the complexity required to decode, the readback signal in the read channel is first equalized to a prescribed partial response (PR) signal. PR signals permit a controlled overlap of responses in the output signal. A priori knowledge of the "controlled overlap" significantly reduces the complexity of the required detector, compared to that required for an unequalized signal.
One commonly used PR target signal in digital magnetic recording systems is characterized by the transfer function $P(D) = 1 - D^2$, where D is the transform of the unit symbol delay operation. This PR signal is commonly referred to as "Class IV PR" or "PR4." The noise-free output response at a suitably prescribed sampling instant for PR4 is given by
$$Y(nT) = a(nT) - a[(n-2)T] \qquad (1)$$
where:
$n = 2, 3, \ldots$
$a(nT)$ = the input symbol at time instant $nT$, normally picked from a binary alphabet, $\{0, 1\}$ or $\{1, -1\}$.
That is, the output sample at time instant $nT$ involves the overlap of two input symbols, $a(nT)$ and $a[(n-2)T]$.
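Equation (1) can be sketched in a few lines. The function name below is hypothetical, and the symbol list is indexed so that entry n stands for a(nT).

```python
def pr4_output(symbols):
    """Noise-free PR4 samples Y(nT) = a(nT) - a[(n-2)T] for n = 2, 3, ..."""
    return [symbols[n] - symbols[n - 2] for n in range(2, len(symbols))]


if __name__ == "__main__":
    # Symbols from the {0, 1} alphabet give samples in {-1, 0, +1}.
    print(pr4_output([0, 1, 1, 0, 1, 1]))
    # Symbols from the {1, -1} alphabet give samples in {-2, 0, +2}.
    print(pr4_output([1, -1, -1, 1, 1]))
```

Each output sample depends on exactly two input symbols, which is the controlled overlap that the detector can exploit.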
The equalized signal is then detected using a sequence detector such as a Viterbi Detector (based on the Viterbi Algorithm). This combination of PR4 and Viterbi detection is commonly referred to as "PRML" for "partial response maximum-likelihood."
To increase storage density and throughput rate, sampling techniques, such as the above Partial Response (PR) signaling and Maximum Likelihood (ML) sequence detection, are used.
The choice of the PR target signal is dictated by the linear density of the recording (as well as additional functions that may be required of the system). A single system may require two different PR target signals, e.g., PR4 and EPR4. Many PR targets exist for magnetic recording and are now commonly referred to as the "Extended Class IV" family of PR signals. The Extended Class IV family of PR signals is defined by the polynomials $P(D) = (1 - D)(1 + D)^n$, where n is a positive integer. Note that n=1 yields the standard PR4 signal, n=2 yields EPR4, n=3 yields E²PR4, etc.
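The target polynomials of the Extended Class IV family can be expanded mechanically. The sketch below, with hypothetical function names, multiplies out (1 − D)(1 + D)^n and returns the coefficients in ascending powers of D.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (ascending powers of D)."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def extended_class4_target(n):
    """Coefficients of P(D) = (1 - D)(1 + D)^n for a positive integer n."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    coeffs = [1, -1]  # the (1 - D) factor
    for _ in range(n):
        coeffs = poly_mul(coeffs, [1, 1])  # one (1 + D) factor per iteration
    return coeffs


if __name__ == "__main__":
    for n, name in [(1, "PR4"), (2, "EPR4"), (3, "E2PR4")]:
        print(name, extended_class4_target(n))
```

For n=1 the expansion is 1 − D², matching the PR4 transfer function given earlier; for n=2 it is 1 + D − D² − D³.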
Correct operation of any PRML system depends on sampling the readback signal synchronously. Even small time shifts from the correct sampling instant act to distort the sample values. To maintain proper timing of the read data, a timing recovery circuit, often a PLL, is used to adjust the phase of a VCO based on the phase error determined by a digital phase error detector receiving input from an analog-to-digital converter (ADC).
During acquisition, the phase error is defined as being zero when the rising edge of the clock signal is aligned with the ideal sampling instants of the input signal. A non-zero phase error causes the error detector to send a signal to a loop filter that outputs a signal proportional to the phase error. This signal shifts the instantaneous frequency of the VCO in order to subsequently match the phase of the input signal.
A PRML read channel uses ML detectors to "read" data based on sampled sequences of an analog waveform read from a disk, rather than by analyzing a single peak as in conventional peak detection. These samples are obtained by using an ADC that samples and quantizes the read waveform at predetermined sampling intervals. The intervals are controlled by a clock synchronizing the ADC and the incoming signal. The clock also must be phase aligned to the incoming signal.
To achieve proper timing and phase synchronization, a conventional PRML read channel uses a timing control circuit to acquire and lock frequency and phase synchronization. The timing control circuit uses a PLL circuit to generate a phase-coherent clock so that data samples may be taken at predetermined locations along the input signal. It is necessary to first lock the PLL circuit to a reference so that the required sampling frequency can be acquired and tracked.
A phase detector processes the signal samples to calculate a phase error between the actual and desired signals. A compensation for this phase error is used to adjust the sampling frequency, which is typically the output of a VCO, with the phase error compensation value as the control input. The output of the VCO controls the sampling period of an ADC. Conventionally, long acquisition times are addressed by applying a relatively large initial correction to the VCO, thus enabling a quick phase match. Subsequent phase locking can then proceed more rapidly, while avoiding the reverse-slope null.
The amount of correction is determined by the sampling phase difference at the start of fast acquisition. The instantaneous phase and frequency of the signal are determined by digital processing as opposed to comparing signal transition edges using peak detection. A timing recovery circuit processes the samples to estimate the phase and frequency errors, the latter denoted $\hat{f}$. In one application, using part digital and part analog processing, these estimates are forwarded to a timing control DAC (not shown) and converted into analog error estimates for timing recovery circuit processing.
The time it takes a timing recovery circuit to recover a synchronous data clock signal impacts both the speed of acquisition and the amount of required disk space. In conventional disk drives, when the "READ" mode is entered, the PLL acquires the initial data clock frequency, f, and phase, φ, from a known preamble waveform, most often a sinusoid, that precedes the input signal.
By minimizing PLL acquisition time, performance and capacity are improved. Early conventional PLLs had long acquisition times, and failed to lock within a desired maximum time. This problem is identified in a paper by Floyd M. Gardner entitled "Hangup in Phase-Lock Loops", IEEE Transactions on Communications, Vol. COM-25, No. 10, October 1977. Gardner observes that acquisition may start around a "reverse-slope null", i.e., a metastable point where the initial phase difference is halfway between two stable phase-locked operating points. In this instance, acquisition may take additional cycles. Further, the presence of non-negligible noise can exacerbate this.
There are other methods for acquiring and tracking a sampling frequency. Timing recovery methods for synchronous data receivers have been investigated by K. H. Mueller and M. Müller, "Timing recovery in digital synchronous data receivers," IEEE Trans. Commun., Vol. COM-24, pp. 516-530, May 1976, incorporated herein by reference. Specifically, for PRML it has been proposed by F. Dolivo, W. Schott, and G. Ungerboeck, "Fast timing recovery for partial response signaling systems," Int. Conf. Commun. '89, ICC '89, Boston, Mass., June 1989 (incorporated herein by reference) to update the timing phase at time instant nT using the timing gradient defined as:
$$\text{(error)} = -y_n \times x_{n-1} + y_{n-1} \times x_n \qquad (2)$$
where:
$y_n$ = the sampled value at n
$x_n$ = the ideal sample value that is closest to $y_n$.
In a PRML channel, $x_n$ is restricted to the values +1, 0, or −1. Once the timing gradient is obtained, a PLL circuit is used to recover the sampling clock. The PLL operation is divided into two stages, acquisition and tracking.
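The timing gradient of equation (2) can be sketched as follows. The function names are hypothetical, and the ideal values are obtained here by a simple nearest-level decision over +1, 0 and −1, which is an assumption about how the symbol-by-symbol decisions are made.

```python
def nearest_ideal(y, levels=(-1, 0, 1)):
    """Ideal PRML sample value closest to the sampled value y."""
    return min(levels, key=lambda level: abs(y - level))


def timing_gradient(y_prev, y_curr):
    """Equation (2): -y_n * x_(n-1) + y_(n-1) * x_n."""
    x_prev = nearest_ideal(y_prev)
    x_curr = nearest_ideal(y_curr)
    return -y_curr * x_prev + y_prev * x_curr


if __name__ == "__main__":
    # Samples that sit exactly on the ideal levels give a zero gradient.
    print(timing_gradient(1.0, 1.0))
    # Samples that are off the ideal levels give a non-zero gradient.
    print(timing_gradient(0.9, 1.1))
```

The sign and size of the result would then drive the loop filter; the loop itself is outside the scope of this sketch.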
Conventionally, a sinusoid, known as the preamble, is written onto the disk, usually at the start of each track. Following the preamble on the disk is the user data. The preamble is read during the acquisition mode and the user data are read during the tracking mode. Since the sampling clock should be at the ideal sampling frequency and phase after acquisition, the PLL's bandwidth is lowered in the tracking mode to reduce timing jitter.
These designs use a timing gradient (TG) calculated using actual signal samples and estimated signal samples obtained from symbol-by-symbol decisions. See "Timing Recovery in Digital Synchronous Data Receivers" by K. H. Mueller and M. Müller, supra.
One inherent drawback of these designs is that during acquisition the sampling point may occur at the point halfway between the desired sampling times. Consequently, the method for correcting the phase may reverse its direction of adjustment several times in the vicinity of this metastable equilibrium point for an extended period of time. Although this "hang-up" effect does not frequently occur, the acquisition preamble must be sufficiently long that the system may still synchronize given this situation. A long preamble, however, reduces the total amount of storage space available for user data.
A further concern is the non-linear characteristic of the timing gradient circuit when tracking random user data. Because the method for calculating the timing gradient is based on approximating the slope of the pulses, the gain of the timing gradient circuit varies when tracking random user data due to inconsistent pulse slopes. This variation in gain results in less than optimum timing recovery.
Another approach uses a single "TG circuit" to gather a rough estimate of the ideal sampling instants. This yields ZPR samples that may be metastable. Further, if these samples are averaged to reduce noise contributions, and if "hangup" is to be avoided, a hysteresis effect must be introduced in order to reduce the probability of reversals in the once-chosen direction of timing and phase adjustment. This additional function, i.e., the introduction of hysteresis, further complicates the solution and also reduces performance by increasing latency.
A method for avoiding the "hang-up" effect in order to reduce the preamble length has been perfected. With this method, a sliding threshold, based on past estimated values around X(n), introduces a hysteresis effect that makes reversals in timing phase adjustments very unlikely. However, the estimated sample values around X(n) are reconstructed from the signal sample values, Y(n), and are therefore subject to error. Errors in the estimated sample values further increase the necessary length of the acquisition preamble. In order to minimize the initial phase error between the sampling clock and the preamble, ZPR has been used with conventional timing control circuits.
Upon obtaining an initial input-signal-to-clock phase difference, the VCO is stopped in order to adjust for any phase difference. The ZPR method applies a controlled phase delay within the timing control circuit, permitting a "restart" of the read channel in phase alignment with the incoming signal. A ZPR circuit for timing acquisition in a PRML recording channel is described in Dolivo et al., "Fast Timing Recovery for Partial-Response Signaling Systems", Proc. of ICC '89 (IEEE), Jun. 11-14, 1989.
In conventional designs, a PLL circuit controls the timing recovery in PRML recordings. A phase detector processes the signal samples to generate a phase error between the actual and desired frequency. Compensation for this phase error is used to adjust the sampling frequency, e.g., the output of a VCO, the phase error compensation being the input. The output of the VCO controls the sampling instants of an ADC. It is necessary to first lock the PLL to a reference frequency so that the required sampling frequency can be acquired. Phase lock occurs when a preamble appears under the readback head of the disk drive. A longer acquisition time requires a longer preamble, thus reducing the space available for user data. One technique for locking the PLL to a reference frequency injects into the ADC a sinusoid of one fourth the nominal sampling frequency.
The PLL must be switched synchronously between the clock signal and the input signal so that additional phase corrections are not needed. Also, the PLL must not be referenced to the clock after the VCO is re-started. Should this occur, disruptive phase corrections from the clock interfere with phase locking.
Conventional ZPR designs are susceptible to noise on both the clocking and the input signal. Noise on the input signal contributes to inaccurate phase measurement, leading to inaccurate phase correction that may increase actual acquisition time. One type of noise is termed "pulse pairing" noise. Pulse pairing causes adjacent pulses to have alternating, i.e., early and late, phase errors. Conventional ZPR designs do not detect pulse pairing since they rely on a single initial measurement.
Other designs include a pulse position detector to detect the phase difference between data pulses and clock pulses, and an averaging circuit to determine the average amount of phase difference over several pulses. This average value is then used to stop the VCO to correct for this average phase error. Preferably, averaging takes place over an even number of pulses, thereby diminishing the effect of pulse pairing noise.
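The benefit of averaging over an even number of pulses can be illustrated with a small sketch. The function name, the units and the numeric values are hypothetical; pulse pairing is modeled simply as an alternating early/late offset added to a constant underlying phase error.

```python
def average_phase_error(errors, count):
    """Average the first `count` per-pulse phase errors; `count` must be even."""
    if count <= 0 or count % 2 != 0:
        raise ValueError("count must be a positive even number")
    if len(errors) < count:
        raise ValueError("not enough measurements")
    return sum(errors[:count]) / count


if __name__ == "__main__":
    true_error = 0.10  # underlying phase error (arbitrary units, assumed)
    pairing = 0.05     # alternating early/late shift (assumed)
    measured = [true_error + (pairing if i % 2 == 0 else -pairing) for i in range(4)]
    # Over an even number of pulses the alternating shifts cancel.
    print(average_phase_error(measured, 4))
```

A single initial measurement in this model would be off by the full pairing shift, which is the weakness attributed above to conventional ZPR designs.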
Implementations of EPRML channels have been documented by R. Wood, "Turbo-PRML: A Compromise EPRML Detector," IEEE Trans. on Magnetics, vol. 29, no. 6, pp. 4018-4020, November 1993, herein incorporated by reference, and E. Eleftheriou and W. Hirt, "Improving Performance of PRML/EPRML through Noise Prediction," INTERMAG 96, Seattle, Wash., April 1996, also herein incorporated by reference. These implementations universally require a signal processing block after the conventional PR4 Viterbi detector in order to optimize EPRML performance without modifying the timing recovery circuit.
Correct gain control is important because decisions about data, samples and timing all assume that system amplification is correct. For example, for PR4, the +1 and −1 levels and for EPR4, the +2, +1, −1 and −2 levels, should be known a priori. Therefore, conventional mass data storage devices use a variable gain amplifier (VGA) to maintain a constant sampled signal amplitude. Controlling a VGA to maintain a constant amplitude is also known as automatic gain control (AGC). Implementing AGC during analog acquisition, the VGA sends the amplified sine wave to an equalizer, e.g., a digital FIR filter, to output an equalized signal. A feedback loop controls VGA operation during analog acquisition. Implementing AGC during digital tracking, the equalized signal is digitized in an ADC and forwarded to a detector where it is fed back to the VGA to adjust gain.
AGC functions for PRML read channels have been proposed that update the VGA gain by using the gradient derived from the following relationships:
$$e_n = y_n - x_n \qquad (4)$$
$$\text{gain}(e_n) = e_n \times x_n \qquad (5)$$
where:
$y_n$ = the sampled value at n,
$x_n$ = the ideal sample value that is closest to $y_n$,
$e_n$ = the decision error.
Note that $x_n$ is either +1, −1 or 0 for PR4.
Once the gain error is obtained, the AGC loop adjusts the gain of the VGA. The operation of the AGC is divided into two stages, acquisition and tracking.
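Equations (4) and (5) can be sketched as below. The function names are hypothetical. The update rule and the step size `mu` are illustrative assumptions about how a loop might use the gradient; they are not taken from the text.

```python
def nearest_ideal(y, levels=(-1, 0, 1)):
    """Ideal PR4 sample value closest to the sampled value y."""
    return min(levels, key=lambda level: abs(y - level))


def gain_gradient(y):
    """Equations (4) and (5): e_n = y_n - x_n, gain(e_n) = e_n * x_n."""
    x = nearest_ideal(y)
    e = y - x
    return e * x


def update_gain(gain, y, mu=0.1):
    """Assumed first-order update: lower the gain when samples are too large."""
    return gain - mu * gain_gradient(y)


if __name__ == "__main__":
    # Oversized samples of either polarity give a positive gradient.
    print(gain_gradient(1.2), gain_gradient(-1.2))
    # Samples decided as 0 carry no amplitude information.
    print(gain_gradient(0.1))
    print(update_gain(1.0, 1.2))
```

Multiplying the decision error by the ideal value makes the gradient independent of sample polarity, so the sign indicates only whether the amplitude is too large or too small.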
During acquisition, a sinusoid with a period equal to 4T is used to provide the signal amplitude reference. Either continuous-time or discrete-time methods can be used to implement the acquisition mode. See R. Cideciyan, F. Dolivo, R. Hermann, W. Hirt and W. Schott, "A PRML System for Digital Magnetic Recording," IEEE Journal on Selected Areas in Communications, Vol. 10, No. 1, pp. 38-56, January 1992 and R. Yamasaki, T-W. Pan, M. Palmer and D. Browning, "A 72 Mb/s PRML Disk-Drive Channel Chip with an Analog Sampled-Data Signal Processor," Proc. of IEEE ISSCC, San Francisco, 1994, pp. 278-279, incorporated herein by reference.
In the continuous-time implementation, a peak detector is used to derive the amplitude error. After acquisition, assuming that the gain of the VGA is appropriate, the bandwidth of the AGC is reduced for the tracking mode, thus reducing noise sensitivity.
EPR4 relaxes the need for equalization as compared with PR4 (for higher channel densities). Having a lower high-frequency SNR than PR4 enables EPR4 operation at higher linear densities. However, this comes at the price of added complexity, which in turn leads to longer processing times and lower throughput rates.
A further consideration is the non-linear characteristic of the timing gradient circuit when tracking random user data. Because the method for calculating the timing gradient is based on approximating the slope of the pulses, the gain of the timing gradient circuit varies when tracking random user data because of changing pulse slopes. This variation in gain induces sub-optimum timing recovery.
Therefore, what is needed is a method and system for optimizing the restart associated with a timing recovery process. Further, the system should be capable of optimizing the starting phase for timing acquisition in multiple operating modes and should be transparent to a user.
A preferred embodiment of the present invention uses an architecture having one or more pairs of timing gradient (TG) circuits with center of zero-crossing operating points of each circuit of a pair located at opposing sampling instances, i.e., orthogonal to each other. For example, the center timing instances for a Partial Response Class 4 (PR4) detector are at $\pi/4$, $3\pi/4$, $5\pi/4$, $7\pi/4$ along the ideal sinusoidal waveform provided as a preamble, while the timing instances for EPR4 are at $0$, $\pi/2$, $\pi$, $3\pi/2$, and $2\pi$.
Further, neither circuit of the pair is required to exhibit a hysteresis property, thus reducing latency and processing complexity.
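To make the two sets of timing instances concrete, the sketch below evaluates a unit-amplitude sinusoidal preamble at the PR4 and EPR4 phases listed above. The constant and function names are hypothetical, and the unit amplitude is an assumption.

```python
import math

PR4_PHASES = [math.pi / 4, 3 * math.pi / 4, 5 * math.pi / 4, 7 * math.pi / 4]
EPR4_PHASES = [0.0, math.pi / 2, math.pi, 3 * math.pi / 2, 2 * math.pi]


def preamble_samples(phases, amplitude=1.0):
    """Values of an ideal sinusoidal preamble at the given phases (radians)."""
    return [amplitude * math.sin(p) for p in phases]


if __name__ == "__main__":
    print([round(v, 4) for v in preamble_samples(PR4_PHASES)])
    print([round(v, 4) for v in preamble_samples(EPR4_PHASES)])
```

In this model the PR4 phases fall on samples of equal magnitude and alternating pairs of sign, while the EPR4 phases fall on the zero crossings and the peaks of the sinusoid.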
Before the initiation of a ZPR operation, each circuit of the installed pair(s) of timing gradient (TG) circuits is normalized, both to equalize the two TG slopes of each pair if they are not the same and to allow the use of power-of-two modulo arithmetic. This ensures the same values for the transfer characteristics. The circuits are then activated to calculate phase errors within each of their respective timing sampling instances. At the moment of ZPR activation, the one circuit of each pair whose phase error is closest to zero (i.e., the one that gives the better-quality phase error estimate) is selected via a comparator circuit. (If more than one pair of TG circuits is used, then the circuit with the lowest error value among all of the included circuits is selected.) Since the initial phase error distribution is uniform, either of the pair (or any of the circuits of multiple pairs) has an equal chance of being chosen. In the case where a non-native TG is closer to the desired timing sampling instances, a separate signal is generated indicating that a phase shift (e.g., the equivalent of 180° for one pair of TG circuits) should be added to the resultant phase error values. This equivalent of 180° could be cyclically added either in the phase detector (internally) or in the timing circuit (externally). After the initial ZPR operation, the resultant phase error should approach zero, thus "forcing" selection of the native TG thereafter. Note that FIG. 8a provides angular measurements of a sinusoid (preamble) and FIGS. 8b and 8c provide angular measurements of a bit.
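The selection step for a single pair of TG circuits might be sketched as follows. This is a simplified model under stated assumptions: phase errors are expressed in degrees, the function names are hypothetical, and the cyclic addition is modeled by wrapping the result into the range [−180°, 180°).

```python
def wrap_degrees(angle):
    """Wrap an angle in degrees into the range [-180, 180)."""
    return (angle + 180.0) % 360.0 - 180.0


def select_phase_error(native_error, shifted_error):
    """Pick the TG output closest to zero.

    Returns (phase_error_in_degrees, used_shifted_circuit). When the
    non-native (shifted) circuit wins, the equivalent of 180 degrees is
    cyclically added to its output.
    """
    if abs(native_error) <= abs(shifted_error):
        return native_error, False
    return wrap_degrees(shifted_error + 180.0), True


if __name__ == "__main__":
    # Native circuit is close to its own operating point: use it directly.
    print(select_phase_error(5.0, -175.0))
    # Native circuit is far from its operating point: use the other circuit.
    print(select_phase_error(150.0, -10.0))
```

Because the comparison is a plain minimum of absolute values, no hysteresis state has to be kept between measurements in this sketch.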
An additional benefit of using at least one pair of TG circuits is that the resulting phase error transfer curve now has at least two extremely accurate operational points that are also independent of signal amplitude. As a result, the maximum incurred error is at most half of what it would have been using only a single TG circuit.
Some of the salient advantages of the present invention are that it:
imposes no requirement for memory (e.g., a register) at each stage.
reduces latency.
reduces net overhead.
is ideally suited for use in those systems calling for dual operating schemes employing two different decoding architectures.
performs calculations in acquisition stage when there is more time available and then flips the result to the tracking stage.
does not require hysteresis effects be introduced.
accurately estimates total phase shift while incurring a minimum hardware burden.
reduces error propagation.
uses less complicated computational architecture.
includes at least two accurate points for operation on the error transfer curve.
can be made extremely accurate by simply adding circuits to operate, as needed, in pairs.
Thus, it is a general advantage of the present invention to improve the timing recovery method in synchronous partial response magnetic recording systems.
Further, an advantage of the present invention is to provide a ZPR optimization system that determines an optimal starting phase for a timing control circuit oscillator, thereby minimizing clock recovery time.
A more specific advantage of the present invention is to provide a ZPR that uses a minimum amount of hardware and subsequent silicon area on a chip.
A further advantage of the present invention is that it provides a ZPR optimization system and control algorithm that determines the optimal restart phase for an oscillator clocking signal provided to an ADC for sampling an incoming analog signal.
A further advantage of phase optimization is minimization of the frequency transient, since frequency may be derived by integration of the phase error. Accordingly, smaller phase errors result in lower frequency errors and shortened acquisition of the preamble.
In sum, the invention can achieve better noise performance by finding an average phase difference rather than an instantaneous one.
A preferred method used for the starting phase selection in a timing recovery process involves receiving a "known" (to an accurate degree) frequency sampled signal at two similar TG circuits. The received sampled signal is sent in parallel to each TG circuit and a comparator, e.g., minimum of absolute value, compares the error values produced from each TG circuit. An adjusted starting phase is selected, based on a signal representing the lesser absolute value of the two error values. The timing recovery circuit is coasting during this period until the user wants to use it. The timing gradient closest to zero is then either latched or averaged over a few cycles until ZPR is initiated.
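The comparator and the latch-or-average step described above could be modeled as in the sketch below. The class name, method names and window length are hypothetical, and the two error values are assumed to have already been normalized to a common reference before they are compared.

```python
from collections import deque


class StartingPhaseSelector:
    """Keeps the lesser-magnitude TG error per cycle until ZPR is initiated."""

    def __init__(self, window=4):
        # Only the most recent `window` selections are retained.
        self._recent = deque(maxlen=window)

    def update(self, error_a, error_b):
        """Comparator: keep whichever error value is closer to zero."""
        chosen = error_a if abs(error_a) <= abs(error_b) else error_b
        self._recent.append(chosen)
        return chosen

    def latched(self):
        """Most recent selection, as if latched at ZPR initiation."""
        return self._recent[-1]

    def averaged(self):
        """Mean of the retained selections over the last few cycles."""
        return sum(self._recent) / len(self._recent)


if __name__ == "__main__":
    selector = StartingPhaseSelector(window=4)
    selector.update(0.30, -0.10)
    selector.update(0.05, 0.40)
    print(selector.latched(), selector.averaged())
```

Latching uses the single most recent comparison, whereas averaging trades a few cycles of delay for reduced sensitivity to noise on any one sample.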