On Mar. 29, 1984, the FCC Report and Order in Docket No. 21323 adopted and authorized a standard for multichannel television sound (MTS). Bulletin No. 60 of the Office of Science and Technology, "Multichannel Television Sound Transmission and Audio Processing Requirement for the BTSC System" (OST 60), contains the technical specifications for the MTS standard adopted by the FCC.
The MTS system, known variously as MTS, BTSC stereo, or broadcast TV stereo, consists of two parts: an overall transmission system, and a noise reduction, or companding system. The "BTSC System Multichannel Television Sound Recommended Practices" (BTSC Standard) defines companding as: "a noise reduction process used in the stereophonic subchannel and in the second audio program subchannel consisting of compression (encoding) before transmission and complementary expansion (decoding) after reception. (This definition conforms to "companding" in OST 60 Section A)."
Without companding, the transmission system is capable of delivering high-quality stereophonic audio. However, FM transmission systems exhibit a squared relationship between noise and frequency (Parabolic Noise Characteristic), resulting in higher noise at the higher frequencies. In addition, to avoid causing interference, the BTSC Standard limits the amount of modulation that can be applied to the stereophonic signal. Thus, even under ideal conditions, the addition of the subcarrier adds approximately 15 dB of noise to stereophonic (stereo) reception compared to monophonic (mono) reception. To make matters worse, under certain impaired transmission/reception conditions, such as a weak received signal, transmitter ICPM, and multipath effects, buzz or hum can be introduced onto the transmitted audio. Thus, without companding, the service area for stereophonic TV (stereo-TV) reception is smaller than for monophonic TV (mono-TV) reception.
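As a rough illustration of the Parabolic Noise Characteristic, the following sketch assumes an idealized FM discriminator whose output noise power density grows as the square of frequency. The function names, the 15 KHz audio bandwidth used for the band edges, and the 15.734 KHz line rate used to place the subchannel are assumptions made for illustration only. The calculation ignores deemphasis, deviation limits and noise weighting, and is not offered as a derivation of the approximately 15 dB figure.

```python
import math

def density_rise_db(f_low_hz, f_high_hz):
    """Rise in noise power density between two frequencies for an f**2 law."""
    return 20.0 * math.log10(f_high_hz / f_low_hz)

def band_noise_power(f1_hz, f2_hz):
    """Relative noise power in [f1, f2]: the integral of f**2 df (arbitrary units)."""
    return (f2_hz ** 3 - f1_hz ** 3) / 3.0

F_H = 15734.0        # assumed horizontal line rate in Hz
AUDIO_BW = 15000.0   # assumed audio bandwidth in Hz

# Baseband (mono) region versus a double-sideband region centered on 2*F_H.
mono = band_noise_power(0.0, AUDIO_BW)
stereo_sub = band_noise_power(2 * F_H - AUDIO_BW, 2 * F_H + AUDIO_BW)
excess_db = 10.0 * math.log10(stereo_sub / mono)

print(f"density rise per octave: {density_rise_db(1000.0, 2000.0):.2f} dB")
print(f"subchannel band vs. baseband noise: {excess_db:.1f} dB")
```

Under these simplifying assumptions the noise density rises by about 6 dB per octave, which is why a subcarrier placed higher in the baseband, such as the SAP subcarrier, is exposed to more noise.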
The situation is worse for the Second Audio Program (SAP) channel. The SAP subcarrier frequency is 78.7 KHz, much higher than for the stereo signal; therefore, the Parabolic Noise Characteristic results in even more noise in the signal. Furthermore, this subcarrier uses FM, which is additionally subject to picture-to-audio intermodulation (buzz beat), causing a particularly annoying distortion.
The BTSC noise-reduction system was designed to be a cost-effective aid to the MTS transmission system in delivering a clean, noise-free audio signal into the home. Specifically, the system was designed to: provide significant noise reduction even in poor reception areas while preserving input-signal dynamic range; prevent the stereo-subcarrier from interfering with overall transmitted power levels (AM-interleave effects); ensure reliable performance even in the face of manmade noise and transmission/reception-system impairments; and to provide these benefits at a commercially reasonable cost.
To achieve the above-stated goals, the BTSC Standard departed from previous design approaches where the dynamic range of the impaired channel was very low. In the case of the stereo-subcarrier in Grade B reception, the available dynamic range was about 43 dB, while in the SAP channel the dynamic range was about 26 dB. These compare unfavorably with the typical dynamic range of a compact cassette at about 60 dB. The significance of these figures, the operation of prior art compander systems, and the operation of the current invention will be better understood after a discussion of the psychoacoustic phenomenon of masking.
All audio noise-reduction systems work on the principle of masking: a listener will be oblivious to the noise on a transmission when the program signal, whether music or speech, is loud enough and its spectral content is broad enough to mask the noise.
For example, if the program consists of low-frequency sounds, it must be transmitted at a high level relative to the background noise of the stereo-subcarrier channel to capture the listener's attention and for the listener to be unaware of the background noise. On the other hand, a broadband signal does not need to be much higher in amplitude than the background noise for the noise to fade below the listener's perception threshold. See, for example, I. M. Young, and C. H. Wenner, Masking of White Noise by Pure Tone, Frequency-Modulated Tone, and Narrow-Band Noise, J. Acoust. Soc. Am. 41, pp. 700-705, 1967.
The noise reduction system must compress and encode the audio signal such that it will consistently mask the channel noise during transmission, and then decode and expand the transmitted signal to recover the original audio signal. In passing through the encode/decode (companding) cycle, distortion or other degradation of the audio signal must be kept to a minimum. And in the decoding process, all the audible noise should be eliminated. Thus, not only must the level of the transmitted audio be high relative to the background noise, but the frequency of the signal and noise must be considered when selecting or designing an effective noise masking and companding scheme. Other characteristics of the signal and noise will affect the design of an ideal companding system. Thus, the amplitude of the signal, the rate at which the signal changes amplitude, and even whether the signal is decreasing or increasing are also important parameters in the design of an ideal companding system.
The stereo-subcarrier's background-noise spectrum is white, rising at 3 dB-per-octave. By comparison, the SAP subcarrier noise rises at 9 dB-per-octave. The noise can be masked if the transmitted signal's spectrum contains substantial high-frequency content, especially in the case of the SAP channel. If so, the compander would only have to keep the signal amplitude levels high through the transmission medium. However, most TV program materials have their dominant energy at low frequencies. Alternatively, if the program consistently lacked high-frequency content, one could simply apply a constant rising preemphasis characteristic to the entire audio spectrum.
However, today's TV program content, especially music and movie effects, is neither consistently high frequency nor consistently low frequency, so neither approach works on its own. Inevitably, the signal will have instances of high-level, high-frequency energy in the audio signal, where a constant rising preemphasis would cause headroom (overload distortion) problems.
Existing solutions to the problem of preserving psychoacoustic masking and simultaneously preserving headroom at all frequencies use spectral companding, a preemphasis scheme which adapts its characteristic to suit the signal. The spectral compressor in the encoder measures the spectral balance of the input signal and varies the high-frequency preemphasis accordingly, thereby increasing the potential for masking and reducing the possibility of high-frequency overload. The resulting encoded signal is, therefore, dynamically adjusted to consistently contain a substantial proportion of high frequencies before transmission, thereby masking the channel noise.
During reception, the spectral expander (in the decoder) restores the high frequencies to their proper amplitude. If the original input signal contains predominantly low frequencies, the decoder attenuates the high-frequency background noise, leaving the low-frequency signal and low-frequency background noise, the latter of which is masked by the signal itself. If the original input signal contains predominantly high frequencies, the decoder does not attenuate the high frequencies to restore correct frequency response, since the signal itself masks the noise.
The spectral compressor achieves two simultaneous requirements: the system is forgiving of high-background-noise environments because the spectral shaping of the input signal is adjusted according to the needs of the input signal to provide high masking at all times, and headroom is maintained throughout the frequency range because extreme preemphasis is used only when it is really needed.
While the spectral compressor continuously adapts its characteristics to the specific nature of the program material, typically a significant overall preemphasis is needed during encoding to provide adequate masking. Instead of designing the spectral compressor to operate only as a preemphasis network, the BTSC system lets the spectral compressor, and hence the spectral expander, operate with a symmetric boost and cut, while obtaining the bias toward preemphasis from fixed preemphasis networks.
The spectral compander works in conjunction with the wideband compressor. The 2:1 wideband compressor senses the input signal level and adjusts a variable-gain stage to reduce the amplitude of large input signals and boost the amplitude of small input signals. This way, the amplitude dynamic range (in dB) at the output of the wideband compressor is half that at the input.
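The static behavior of a 2:1 compressor can be sketched in decibel terms as follows. The function name, the expression of levels relative to an unaffected level of 0 dB, and the sample input levels are assumptions for illustration. The sketch models only the steady-state level mapping, not the level-detector timing or the spectral compressor.

```python
def compress_db(level_db, ratio=2.0, unaffected_db=0.0):
    """Map an input level (dB) to an output level (dB) for a ratio:1 compressor.

    Levels above unaffected_db are reduced; levels below it are boosted.
    """
    return unaffected_db + (level_db - unaffected_db) / ratio

# Hypothetical input levels, in dB relative to the unaffected level.
inputs_db = [-60.0, -40.0, -20.0, 0.0, 10.0]
outputs_db = [compress_db(x) for x in inputs_db]

in_range = max(inputs_db) - min(inputs_db)
out_range = max(outputs_db) - min(outputs_db)
print(outputs_db)
print(in_range, out_range)
```

With these inputs, a 70 dB spread of input levels maps to a 35 dB spread at the compressor output, and only a signal at the unaffected level passes through unchanged.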
The wideband compressor produces an output signal of relatively constant level. The choice of this level affects how well the noise is masked, higher levels being better, and how cleanly transients are reproduced, lower levels being better. The BTSC Standard has set this level at about 17 dB below 100% modulation at 300 Hz. This allows room for transients to overshoot in the compressor without causing excessive distortion through a complete companding cycle and ensures the program signal has a sufficiently high amplitude to mask the background noise.
On decode, the wideband expander restores the original program dynamics. During quiet passages, the expander attenuates the background noise, while during passages of significant signal amplitude level, the program itself masks the noise.
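The action of the expander on channel noise can be sketched in the same decibel terms. The function names are assumptions, levels are expressed relative to an unaffected level of 0 dB, the channel noise level of -40 dB is hypothetical, and the level detector is assumed to respond to the program signal alone. The sketch is a steady-state illustration, not a model of any particular decoder.

```python
def expander_gain_db(received_db, ratio=2.0, unaffected_db=0.0):
    """Gain (dB) that a 1:ratio expander applies for a detected received level."""
    return (received_db - unaffected_db) * (ratio - 1.0)

def decode(received_db, noise_db, ratio=2.0):
    """Return (program level, noise level) in dB at the expander output."""
    gain = expander_gain_db(received_db, ratio)
    return received_db + gain, noise_db + gain

CHANNEL_NOISE_DB = -40.0  # hypothetical noise level in the transmission channel

loud = decode(-5.0, CHANNEL_NOISE_DB)    # passage with significant level
quiet = decode(-30.0, CHANNEL_NOISE_DB)  # quiet passage
print(loud, quiet)
```

In the quiet passage the expander gain is much lower, so the channel noise is pushed down along with the program; in the loud passage the noise is attenuated only slightly and relies on masking by the program.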
Another important concern of the BTSC design is to protect against large transients causing excessive modulation of the transmission system. In a peak-limited medium, such as the MTS transmission system, the peak excursion of the compressor output is an important parameter to control. This requirement can most easily be met by a clipper or limiter.
Clippers are relatively inaudible in operation provided that the clipper clips only transients that last under a few milliseconds. To accomplish this, the unaffected-level point of the wideband compressor (Unaffected Level) is set below the 100% modulation point, and the wideband and spectral compressors, which precede the clipper, are set fast enough to allow only brief overloads to reach the clipper. Furthermore, the static preemphasis precedes the clipper, further reducing the transient overload duration. Aiming at a signal level for the output of the encoder that is substantially below 100% modulation provides room for the typical peak excursions that occur when normal musical and other program transients are input to the compressor.
An important benefit of the Unaffected Level alignment is that the signal level broadcast over the stereo-subcarrier has its amplitude distribution between 10% and 30% modulation. This allows for large amplitudes in the monophonic carrier without exceeding the allowable modulation limits of monophonic plus stereophonic carriers. Since the stereo-subcarrier tends to average below 30% modulation, the mono carrier is allowed to stay around 70% modulation.
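The relationship between the level of about 17 dB below 100% modulation (specified at 300 Hz) and the 10% to 30% modulation range can be checked with a short calculation. The helper names are assumptions, and the conversion treats modulation percentage as a linear amplitude quantity.

```python
import math

def db_below_full_to_percent(db_below):
    """Convert a level in dB below 100% modulation to percent modulation."""
    return 100.0 * 10.0 ** (-db_below / 20.0)

def percent_to_db_below_full(percent):
    """Convert percent modulation to dB below 100% modulation."""
    return -20.0 * math.log10(percent / 100.0)

unaffected_percent = db_below_full_to_percent(17.0)
low_db = percent_to_db_below_full(10.0)
high_db = percent_to_db_below_full(30.0)
print(f"17 dB below full modulation is about {unaffected_percent:.1f}%")
print(f"10% to 30% spans {low_db:.1f} dB to {high_db:.1f} dB below full modulation")
```

On this reading, the 17 dB figure corresponds to roughly 14% modulation, which falls inside the stated 10% to 30% range.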
To protect the 15.734 KHz pilot signal and to prevent any spillover of signals into the sum channel, the difference channel must not contain information above about 15 KHz. This is accomplished by a lowpass filter in the encoder. For example, an 11-pole Cauer filter has been used for such an application. This filtering process causes a phase shift in the difference channel (L-R signal) which must be compensated for in the sum (L+R) channel. This lowpass filter is located at the encoder output, but within the feedback loop, so that the RMS-level detectors in both the compressor and expander sense the same bandlimited signal. This is a limitation in prior art BTSC encoding, as it often requires high order filters and, as discussed below, often requires the use of matching filter components.
The ideal low-pass filter should pass the frequencies from 30 Hz to 15 KHz with minimal attenuation, have peak attenuation at 15.734 KHz, and maintain substantial attenuation above 15.734 KHz. Specific requirements for this filter can be found in the EIA document "BTSC System Television Multichannel Sound Recommended Practices". Note that the phase shift introduced by this filtering must be compensated for by an identical phase shift in the sum (L+R) channel, or stereo separation will be significantly degraded. This is usually accomplished by using identical, matched filters in the L-R and L+R channels. This is a limitation in prior art BTSC encoding, since filter matching increases manufacturing costs, and errors in filter matching degrade performance.
The received demodulated difference (L-R) signal has high-frequency components caused by the other audio channels in the system, which could interfere with proper decoding if they are not attenuated before reaching the expander detectors. Therefore, filtering is necessary to prevent decoder mistracking. Pilot cancellation techniques will reduce the amount of 15.734 KHz in the L-R signal, and therefore reduce the required attenuation of the filter. There are many possible circuit variations which call for different alignments for this filter. For example, the filter may be located in the L-R signal path. However, this requires a compensating network, which in turn usually requires a matching filter to be located in the L+R signal path to maintain separation. Again, this exemplifies a limitation in prior art BTSC encoding, by requiring filter matching and additional hardware components.
OST 60 describes a theoretically perfect compressor. Due to bandwidth limitations of presently available components, it may not be practical to conform perfectly to the ideal design of OST 60. This means that small deviations from perfect amplitude and phase response may exist in practical encoders, especially at the edges of the audio band. Such deviations may be compensated for by amplitude and phase errors introduced in the sum channel. Note that any deviations from perfect amplitude response in the encoder will be exaggerated by the expansion action of the decoder. For this reason, somewhat more amplitude response error must be placed in the sum channel than in the difference channel if the sum-channel response is to match that of the difference channel, including an encode/decode cycle. The amount of exaggeration of the error varies with frequency because the effective compression ratio varies with frequency. However, encoder phase errors should be compensated for with identical errors in the sum channel, since the decoder will not exaggerate phase errors.
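The asymmetry between amplitude and phase compensation can be illustrated as follows. The sketch assumes that an ideal expander with an effective expansion ratio R multiplies a small encoder level error, expressed in dB, by R, and leaves phase errors unchanged. The function name, the ratio values and the error magnitudes are hypothetical and are not taken from OST 60.

```python
def sum_channel_compensation(encoder_amp_error_db, encoder_phase_error_deg,
                             effective_ratio):
    """Return the (amplitude dB, phase deg) error to place in the sum channel.

    The amplitude error is scaled by the effective expansion ratio because the
    decoder exaggerates it; the phase error is copied unchanged.
    """
    amp_db = encoder_amp_error_db * effective_ratio
    phase_deg = encoder_phase_error_deg
    return amp_db, phase_deg

# Hypothetical encoder errors: 0.1 dB in amplitude and 2 degrees in phase.
mid_band = sum_channel_compensation(0.1, 2.0, effective_ratio=2.0)
band_edge = sum_channel_compensation(0.1, 2.0, effective_ratio=3.0)
print(mid_band, band_edge)
```

Because the effective ratio varies with frequency, the amplitude compensation in such a scheme would also vary with frequency, while the phase compensation would simply track the encoder.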
The above discussion illustrates a significant limitation in prior art BTSC encoding. The hardware components introduce amplitude and phase errors, which errors, if not properly corrected or compensated for, will be magnified upon decoding; thus necessitating a frequency dependent overcompensation of the error. Furthermore, in prior art practice, it is an already accepted fact that due to bandwidth limitations of presently available components, it is not practical to conform perfectly to the ideal design of OST 60. Being unable to achieve the ideal requirements, the industry deviates from the ideal design during the signal encoding and compression. The signal decoding and decompression does not always compensate for such deviations, resulting in less than ideal noise reduction techniques.
The sum-channel compensation at the transmission end must correct only for phase and amplitude errors introduced by the encoder, while errors introduced by the decoder must be corrected only at the reception end. This allows freedom for future improvements in the state of the art. Furthermore, by compensating for transmission errors only in the transmitter, receiver manufacturers are free to build more nearly ideal receivers, rather than being forced to build errors into the receiver which compensate for errors in the transmitter. Also, by having the receiver end and the transmitter end compensate for their errors and limitations, this maintains receiver and transmitter independence and it does not impose a burden on either end to correct or compensate for errors or limitations of the other end.
As is typical with the adoption of a standard, the standard constrains its users to work with that standard alone, which often stagnates technological growth in the relevant field or otherwise limits the use of technology that may become superior by virtue of technical improvements or reductions in manufacturing costs. In the case of stereo-TV, because of compatibility concerns, a new and modern television set must be compatible with a 10 or 20 year old standard, effectively denying a consumer the enjoyment that could otherwise be available through improvements in technology.
Prior art BTSC systems are effective in compressing the signal and expanding it upon reception. However, prior art systems often are not very sensitive and responsive to transients where the signal amplitude quickly increases or decreases. Specifically, many expanders are unable to follow rapid envelope changes without producing undesirable acoustic effects, such as pumping and breathing. Many expanders try to overcome the above-mentioned problems by slowing down their response to sudden amplitude changes. This, however, results in a response that is too slow to follow fast changing musical envelopes. Attempts to speed up the response to such sudden amplitude changes often result in increased low frequency distortion and the above-mentioned pumping and breathing.
One existing system attempts to cure the above-mentioned problems by varying the amplitude of the control signal as a function of the time derivative of the control signal. To achieve this, one system employs a lead-lag circuit including a diode. The diode is forward biased, resulting in fast reaction time to positive changing signals. However, the proportional derivative and lead-lag circuits are bi-polar and are unable to distinguish between positively increasing and negatively increasing signals. As a result, the response to negatively changing signals typically results in psychoacoustic distortion. In addition, the release behavior of the gain control module disclosed therein is non-linear, which also affects the envelope and results in distortion. Furthermore, the circuit requires the selection of analog components to set a precise time constant for a particular type of release, which time constant may be satisfactory for some types of programs, but not for others.
Another system attempts to overcome the diode limitation problems and non-linear circuit behavior. However, this is accomplished again via the use of diodes and other non-linear components, thereby requiring more complex circuit systems, extensive compensation and component selection.
Accordingly, there is a need for a means to achieve BTSC encoding and decoding without having to contend with the non-linearity of components, independent of component tolerances and independent of variations in environmental conditions.
Also, there is a need to encode and decode a signal that is fully compatible and in compliance with current Multichannel Television Sound BTSC systems that does not deviate from the ideal low-pass filter recommended in the EIA document "BTSC System Television Multichannel Sound Recommended Practices," that does not introduce undesirable phase shift or distortions, and that does not require the matching of filters or other components.
Also, there is a need for an encoder/decoder system, method and apparatus that exhibits greater signal to noise separation, improved performance characteristics throughout the spectrum of the signal, and improved response time and performance characteristics to sudden changes in the amplitude and/or frequency of the signal.
There is also a need to provide for the above-mentioned improvements at a reduced cost, while providing the service providers with means to adjust the encoder/decoder performance characteristics according to the type of program being transmitted, and preferably being adjustable on a real-time basis according to the instantaneous needs of the program.
There is also a need for an encoding/decoding system that can grow with and change with the changing needs in the market and advances in the technology, and is not bound or limited to perform according to a fixed standard.
There is further a need to provide for the above-mentioned improvements without sacrificing compatibility with current BTSC standards and without sacrificing performance on prior art television sets.
The present invention includes a digital stereo modulator having a digital left channel input and a digital right channel input. These inputs go into a Digital BTSC Stereo Generator that contains a Sine Table. The Digital BTSC Stereo Generator generates and outputs the following signals: 2*F.sub.H, compressed BTSC L-R, preemphasized BTSC L+R, and F.sub.H. A digital multiplier amplitude modulates the 2*F.sub.H carrier with the compressed BTSC L-R output, and the resultant signal is a Stereo Subchannel AM-DSB-SC BTSC Compressed L-R, which is then summed together with the preemphasized BTSC L+R, and F.sub.H by a digital summer. The output of the digital summer is the composite BTSC output without the SAP or Professional channels.
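The multiply-and-sum structure described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function name, the 12*F.sub.H sample rate, the numerical value of F.sub.H, the pilot level and the absence of any level scaling are assumptions, and the inputs are taken to be already-compressed L-R samples and already-preemphasized L+R samples.

```python
import math

F_H = 15734.0      # assumed horizontal line rate in Hz
FS = 12 * F_H      # assumed sample rate of the composite signal

def composite_btsc(sum_ch, diff_ch, pilot_level=0.1):
    """Combine L+R, L-R on a 2*F_H suppressed carrier, and an F_H pilot.

    sum_ch:  preemphasized L+R samples at FS
    diff_ch: compressed L-R samples at FS
    pilot_level: hypothetical pilot amplitude
    """
    out = []
    for n, (s, d) in enumerate(zip(sum_ch, diff_ch)):
        t = n / FS
        pilot = pilot_level * math.sin(2.0 * math.pi * F_H * t)
        carrier = math.sin(2.0 * math.pi * 2.0 * F_H * t)
        # Digital multiplier (AM-DSB-SC) followed by the digital summer.
        out.append(s + d * carrier + pilot)
    return out

# With no difference signal, no 2*F_H carrier energy appears at the output.
mono_only = composite_btsc([0.5] * 12, [0.0] * 12, pilot_level=0.0)
# A constant difference signal produces a pure carrier with no DC term.
carrier_only = composite_btsc([0.0] * 12, [1.0] * 12, pilot_level=0.0)
print(mono_only[:3], sum(carrier_only))
```

Because the carrier is produced by multiplication rather than by an analog mixer, a zero difference signal yields exactly zero subcarrier output in this sketch, which is the suppressed-carrier behavior the text attributes to the digital multiplier.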
The Digital BTSC Stereo Generator transforms left and right inputs into the outputs described above by utilizing clock inputs 3*F.sub.H and 12*F.sub.H. A Digital PLL phase locks the 3*F.sub.H and 12*F.sub.H clocks to a Horizontal Synch (F.sub.H) reference, for input to the Digital BTSC Stereo Generator. The 3*F.sub.H clock represents the sample rate of the digital left and right inputs. This sample rate is interpolated by a factor of 2, to a rate of 6*F.sub.H, in the Digital BTSC Stereo Generator. The sample rate is additionally interpolated to 12*F.sub.H to allow digital representation of the composite BTSC output, which has an upper 3 dB frequency component at 46.468 KHz.
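The clock relationships above can be tabulated with a short calculation. The value of F.sub.H is taken as the 15.734 KHz pilot frequency and the audio bandwidth as the 15 KHz difference-channel limit, both quoted earlier; the variable names are assumptions.

```python
F_H = 15734.0        # horizontal line rate in Hz (15.734 KHz)
AUDIO_BW = 15000.0   # difference-channel bandwidth in Hz (about 15 KHz)

rates = {
    "3*F_H": 3 * F_H,    # sample rate of the digital left and right inputs
    "6*F_H": 6 * F_H,    # after interpolation by a factor of 2
    "12*F_H": 12 * F_H,  # rate used to represent the composite output
}

# Upper edge of the stereo subchannel: the 2*F_H carrier plus the audio bandwidth.
upper_edge = 2 * F_H + AUDIO_BW
nyquist_12 = rates["12*F_H"] / 2

for name, hz in rates.items():
    print(f"{name}: {hz / 1000:.3f} KHz")
print(f"upper edge: {upper_edge / 1000:.3f} KHz, Nyquist at 12*F_H: {nyquist_12 / 1000:.3f} KHz")
```

The upper edge evaluates to 46.468 KHz, matching the figure in the text, and lies well below half of the 12*F.sub.H sample rate.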
The fact that all clocks in the Digital BTSC Stereo Generator are integer multiples of F.sub.H allows simple and accurate synthesis of the 2*F.sub.H carrier by a sine look up table method. If the clocks are not integer multiples of F.sub.H, due to a fixed left and right sample rate of 44.1 or 48 KHz for example, more complex techniques can be utilized in accordance with the present invention to generate F.sub.H and 2*F.sub.H.
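The sine look up table method can be sketched as follows for the case where the sample clock is 12*F.sub.H. The table length of 12 entries, the function names and the zero phase offset are assumptions made for illustration; the text does not specify the table size or the phase relationship used.

```python
import math

TABLE_LEN = 12  # assumed: one F_H cycle spans exactly 12 samples at 12*F_H
SINE_TABLE = [math.sin(2.0 * math.pi * k / TABLE_LEN) for k in range(TABLE_LEN)]

def pilot_sample(n):
    """F_H pilot at sample index n: step through the table one entry at a time."""
    return SINE_TABLE[n % TABLE_LEN]

def subcarrier_sample(n):
    """2*F_H carrier at sample index n: step through the table two entries at a time."""
    return SINE_TABLE[(2 * n) % TABLE_LEN]

# Both tones come from one table and one index, so their phase relationship
# is fixed by construction and repeats exactly every 12 samples.
pilot = [pilot_sample(n) for n in range(24)]
carrier = [subcarrier_sample(n) for n in range(24)]
print(carrier[:6])
```

Because every clock is an integer multiple of F.sub.H, the table index wraps exactly and no phase error accumulates. With a 44.1 or 48 KHz sample rate the index would not wrap on a whole sample, which is why the text notes that more complex techniques would then be needed.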
The present invention has advantages over prior art analog implementations. Analog implementations which derive analog F.sub.H and 2*F.sub.H signals often require considerable calibration. In analog implementations, the F.sub.H and 2*F.sub.H signals traverse different filtering paths and invariably end up out of phase. An additional phase adjustment stage is needed to compensate. If expensive matched components are not used, this technique often requires manual calibration during the manufacturing process. The phase relationship between F.sub.H and 2*F.sub.H is more easily controlled in the digital environment of the present invention. Also, traditional analog mixers are extremely sensitive to input DC and require careful design and calibration to limit carrier energy in a suppressed carrier system. Traditional analog mixers are also extremely sensitive to input levels which, if not carefully adjusted, can cause considerable harmonic distortion at the output. The impact of these analog mixer degradations varies as a function of temperature. The digital multiplier of the present invention is relatively free of these degradations and is relatively immune to temperature variation. Similarly, the digital summer of the present invention is implemented by a digital adder and exhibits similar immunity to environmental effects that affect analog summers.
These and other aspects of the invention will become apparent upon reading the following specification in conjunction with the accompanying drawings.