Data reliability is a major concern in hard disk drive systems. Increased storage capacity requires a continuous evolution of data protection techniques, which in each new system generation generally take the form of enhanced error correction coding schemes.
User data is normally protected with redundant information to ensure data integrity over time in spite of noisy and defective media, mechanical shocks and system aging. Each user data item is distributed across a set of independent frames of regular format, called sectors, with each sector storing up to 512 eight-bit user bytes. The distribution into sectors is generally handled by the system operating the physical drive.
Apart from the need for error correction redundancy, the data needs to be appended to a fixed format header. This ensures adequate signal amplitude and synchronization, to mention just two of the various operations that are indispensable for a proper synchronous detection system.
In the following, timing recovery will be distinguished from frame synchronization. Timing recovery denotes the process through which the optimal sampling phase and frequency are achieved. Frame synchronization denotes the process used to identify the starting position of the payload (data) field within the frame. Since frame synchronization relies on the synchronous detection of a known pattern, referred to as a sync mark, it is apparent that it cannot happen unless timing has been recovered.
The general structure of a data sector is shown in FIG. 1. The illustrated field sizes are not drawn to scale. The four distinct fields are the 4T preamble, sync mark, data and pad fields.
The 4T preamble field is a known magnetization pattern: 1100 (with 1 and 0 denoting the two elementary tiles, equal in size and of opposite magnetization, used to record any data pattern) repeated several times and with the same phase. This field is used to acquire phase and frequency lock, and to recover the proper signal amplitude.
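The phase-acquisition role of the 4T preamble can be illustrated with a minimal sketch. The function names and the single-bin DFT estimator below are illustrative assumptions, one simple way of recovering a sampling phase from the periodic 1100 tone, not necessarily what an actual drive implements:

```python
import math

def preamble_samples(n_bits, phase_offset=0):
    """Ideal 4T preamble read-back: the repeated 1100 magnetization
    pattern yields a tone of period 4 bits; model it as +1,+1,-1,-1."""
    base = [+1, +1, -1, -1]
    return [base[(i + phase_offset) % 4] for i in range(n_bits)]

def estimate_phase(samples):
    """Single-bin DFT at the 4T tone frequency (1/4 cycle per sample);
    the bin's angle gives the sampling phase, converted to bit periods."""
    re = sum(s * math.cos(2 * math.pi * i / 4) for i, s in enumerate(samples))
    im = sum(s * math.sin(2 * math.pi * i / 4) for i, s in enumerate(samples))
    return math.atan2(im, re) * 4 / (2 * math.pi)  # radians -> bit periods
```

Shifting the recorded pattern by one bit shifts the estimated phase by exactly one bit period, which is the property the acquisition loop exploits.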
The sync mark field is a pattern known a priori by the system and is generally not sector specific. It is written immediately after the preamble to mark the onset of the data field.
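Since the sync mark is known a priori, its detection is essentially a pattern-matching problem. A minimal sketch (with an illustrative correlation detector; real detectors operate on soft samples and are more elaborate) locates the onset of the data field:

```python
def find_sync(bits, sync_mark, threshold=0):
    """Slide the known sync mark over the detected +/-1 bit stream and
    return the index just past the best match, i.e. the start of the
    data field. The correlation peaks at len(sync_mark) for an exact
    match. Returns None if no window exceeds the threshold."""
    best_pos, best_score = None, threshold
    for i in range(len(bits) - len(sync_mark) + 1):
        score = sum(b * s for b, s in zip(bits[i:i + len(sync_mark)], sync_mark))
        if score > best_score:
            best_pos, best_score = i + len(sync_mark), score
    return best_pos
```

The sync pattern is normally chosen to correlate poorly with the preamble, so the peak at the true position is unambiguous.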
The data field stores the sector payload, and is generally protected by the error correction code.
The pad field is an appendix generally used for data flush through the signal processing pipeline and inter-sector separation.
With every new HDD system generation, signal processing needs to evolve to compensate for the signal-to-noise ratio reduction caused by increased storage density and the demand for faster data access. From a signal processing standpoint, data reliability is strengthened by evolving the error correction code (ECC) properties of the data field. ECC will be referred to throughout the rest of the description as a general coding and detection strategy, without specifying further details.
This approach assumes, however, that header recovery does not fail, and thus does not jeopardize system performance, even in the noisier scenario. In general, it is assumed that the header can be lengthened for increased robustness; even though this is generally true, it bears a data format penalty which should be weighed with the same importance as the ECC redundancy budget.
To better analyze data irrecoverability, the main problems experienced by the HDD industry will be taken into account, expressed as the conditions under which a sector can be successfully recovered:
1) a synchronous lock to the read-back signal needs to be reliably achieved, and kept;
2) if lock is achieved, at least over the sector onset, sync mark detection can succeed; and
3) with lock maintained across the entire frame, and the sync mark correctly identified, data is recovered up to the ECC recovery capability.
The above items are summarized using the following formulas:

P(no recovery) = P(lock lost) + P(sync lost, locked) + P(ECC overwhelmed, locked and synchronized)

Using conditional probabilities, and exploiting previous observations 1-3:

P(no recovery) = P(lock lost) + P(sync lost | locked)*{1 − P(lock lost)} + P(ECC overwhelmed | synchronized)*P(synchronized | locked)*{1 − P(lock lost)}

and finally:

P(no recovery) = P(lock lost) + P(sync lost | locked)*{1 − P(lock lost)} + P(ECC overwhelmed | synchronized)*{1 − P(sync lost)}*{1 − P(lock lost)}

which simplifies to:

P(no recovery) = P(lock lost) + P(sync lost | locked) + P(ECC overwhelmed | locked and synchronized) = PLOL + PSYNC + PECC  [1]

provided that each one of the three terms is well under unity. This requirement is easily satisfied in these applications. From this analysis it is apparent that any coding and detection breakthrough that improves PECC is practically useless unless both PLOL and PSYNC are improved as well.
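The accuracy of the first-order simplification can be checked numerically. The sketch below (with illustrative function names) compares the exact conditional expansion against the sum-of-terms approximation, assuming each component probability is well under unity:

```python
def p_no_recovery_exact(p_lol, p_sync, p_ecc):
    """Exact decomposition by conditioning: lock lost, OR sync lost
    given locked, OR ECC overwhelmed given locked and synchronized."""
    return (p_lol
            + p_sync * (1 - p_lol)
            + p_ecc * (1 - p_sync) * (1 - p_lol))

def p_no_recovery_approx(p_lol, p_sync, p_ecc):
    """First-order approximation: simply PLOL + PSYNC + PECC."""
    return p_lol + p_sync + p_ecc
```

For component probabilities of order 1e-4 and below, the two agree to well within a part in a thousand, and the approximation is always an upper bound.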
Referring now to the first two terms PLOL and PSYNC of equation [1], each will be discussed separately. Timing synchronization is achieved through a digital second-order phase locked loop. The dilemma is how to properly balance the 4T preamble field length against an additional hardware complexity investment. Given the system trend of fading SNR enabled by increasingly efficient ECC protection, more complex timing lock algorithms are required, without, however, degrading hardware speed performance.
Any addition to the timing gradient estimation complexity almost unavoidably increases loop latency, which bears severe consequences on system performance. The latency increase can only be partially compensated by altering the open-loop PLL response.
As shown for instance in the reference “Effect of Loop Delay on Stability of Discrete-Time PLL” (J. W. M. Bergmans, IEEE Trans. Circuits and Systems, vol. 42, no. 4, April 1995), the acquisition speed degrades severely with additional latency in PLLs, thus reducing the practical advantage of improved algorithms on the frame format.
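The latency effect can be reproduced with a toy discrete-time second-order PLL. This is a simplified model under assumed, illustrative loop gains and an error-path FIFO standing in for pipelined gradient estimation; it is not an actual drive architecture:

```python
def peak_timing_error(delay, kp=0.1, ki=0.01, freq_offset=0.001, n=2000):
    """Toy second-order PLL tracking a phase ramp (constant frequency
    offset). 'delay' extra samples of latency are inserted in the error
    path to model pipelined timing gradient estimation. Returns the
    worst-case absolute phase error over the run."""
    phase_in = phase_out = freq_est = 0.0
    err_fifo = [0.0] * delay              # latency model for the error path
    peak = 0.0
    for _ in range(n):
        phase_in += freq_offset           # input phase ramp
        err = phase_in - phase_out
        peak = max(peak, abs(err))
        err_fifo.append(err)
        e = err_fifo.pop(0)               # error seen 'delay' samples late
        freq_est += ki * e                # integral (frequency) branch
        phase_out += freq_est + kp * e    # proportional branch + phase update
    return peak
```

With zero latency the loop converges with a small transient error; adding a few tens of samples of latency with the same gains drives the loop toward instability, which is the degradation the Bergmans reference quantifies.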
Open loop approaches, such as phase restart techniques, are used for precise phase estimation, but not for frequency mismatch tracking. For instance, to reliably estimate a frequency offset of 0.1%, around 10^3 samples are needed. Assuming a 10% ECC redundancy, there are 512×8×1.1≈4506 samples for the data field. A practical goal for the entire header is to be around 5% of the data section, which is approximately 225 samples. This is less than 25% of the 10^3 figure.
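The sample-budget arithmetic can be restated compactly. The phase resolution of one bit period is an assumed figure, used only to reproduce the roughly thousand-sample estimate quoted above:

```python
# Header budget arithmetic (phase_resolution is an assumed figure:
# roughly one bit period of accumulated slip must be observed to
# resolve the frequency offset in open loop).
freq_offset = 0.001                       # 0.1% frequency mismatch
phase_resolution = 1.0                    # assumed resolution, bit periods
samples_needed = phase_resolution / freq_offset           # ~1000 samples

data_field = round(512 * 8 * 1.1)         # 512 bytes + 10% ECC -> 4506 samples
header_budget = round(0.05 * data_field)  # ~5% of the data field -> ~225 samples
```

The header budget thus covers well under a quarter of the samples an open-loop frequency estimate would require, which is why frequency tracking is left to the closed loop.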
For the second PSYNC term in equation [1], a known approach is to modify the FIG. 1 format as shown in FIG. 2.
As previously mentioned, the illustrated field sizes are not drawn to scale. A practical ratio between the two data field sizes is length(data1)/length(data2), which is approximately 20/390, i.e., about 0.05. The two sync fields are generally of comparable length.
There are two sync mark fields, separated by a data section. The length of each sync mark field is approximately the same as the sync mark field of FIG. 1.
Lengthening a sync mark field generally makes it possible to exceed comfortable target specifications for PSYNC in normal noisy conditions with patterns shorter than 30 bits. Moreover, denoting by sync0 the sync pattern used in a FIG. 1 format, choosing length(sync0) = length(sync1) + length(sync2) makes it possible for the FIG. 2 scheme to at least match the frame synchronization performance of the FIG. 1 format.
Generally, a catastrophic loss of the sync mark is due to an undetected media defect localized over the sync mark area. This event can have a likelihood PDefect greater than PSYNC. Splitting the sync mark field with a chunk of data ensures that only extremely long, and extremely unlikely, media defects can simultaneously destroy both synchronization features. As long as PDefect > PSYNC > (PDefect)^2, the scheme is effective.
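The failure-probability argument can be sketched directly. The function below is illustrative and assumes defects short enough not to span the separating data chunk, independent defect events, and a residual noise-induced miss probability p_sync_noise:

```python
def p_sync_failure(p_defect, p_sync_noise, split=True):
    """Probability of losing frame synchronization.
    Single mark (FIG. 1): one defect over the mark is catastrophic.
    Split marks (FIG. 2): both marks must be hit, an event of order
    p_defect**2 under the independence assumption stated above."""
    if split:
        return p_defect ** 2 + p_sync_noise
    return p_defect + p_sync_noise
```

With, say, p_defect = 1e-4 and p_sync_noise = 1e-6, the split scheme improves the synchronization loss rate by roughly two orders of magnitude.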
In general, PDefect is a function of the defect length. A ‘defect scan’ performed over each disk surface at least at manufacturing time ensures that defective disk areas are not used for data storage. The longer the defect, generally the more effective is its reliable location. In fact, a long defect will yield the largest energy fluctuation with respect to a correctly magnetized media.
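The energy-fluctuation idea behind the defect scan can be sketched with a sliding-window detector. This is illustrative only (window size and threshold are assumptions; production defect scans are more elaborate):

```python
def defect_scan(samples, window=16, threshold=0.5):
    """Flag window start positions whose mean signal energy drops well
    below the nominal level of correctly magnetized media (|sample|~1).
    Longer defects produce wider, deeper energy dips and are therefore
    easier to locate reliably."""
    flagged = []
    for i in range(len(samples) - window + 1):
        energy = sum(s * s for s in samples[i:i + window]) / window
        if energy < threshold:
            flagged.append(i)
    return flagged
```

A long amplitude dropout in an otherwise full-amplitude read-back signal is flagged over its whole extent, while healthy regions stay untouched.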
The distance between the sync1 and sync2 patterns depends on the longest undetected defect that can be assumed to be as rare as PSYNC, and on the maximum ECC recovery capability, given that in the case of a sync1 loss the data chunk data1 has to be totally inferred from the redundancy over the data2 field. Using just this second argument, a maximum separation of around 20×8=160 bits between the two could be assumed. There is no preamble section dedicated to the sync2 field, in order to reduce the format penalization.
Still using the defect argument, it is possible, however, that the defect that caused the loss of the first sync mark pattern in the frame also eroded the last preamble section, thus weakening the timing acquisition. In this case it cannot be said that the second sync mark pattern is unaffected by the defect over the first one, since correct synchronization cannot be guaranteed over this section, which still lies in the first part of the frame.
Furthermore, burdening the ECC system with the task of recovering the data section data1 from scratch prevents optimal allocation of ECC protection along the rest of the payload field data2. Moreover, current loop architectures cannot cope with increasing data rates without degrading performance. For instance, a known approach disclosed in European Patent Application No. 0898373 provides an improvement margin that is not substantial, despite a long research effort.