The MPEG standard specifies the representation of compressed and coded audio and video data, so that compatible terminals can exchange data, and it offers normalized decoding methodologies. The standard organizes the compressed and coded data in a manner oriented to packet transmission. The organization is hierarchical: a higher level (system layer) entails the transmission of a sequence of so-called audio-visual "packs", each marked by a pack start code and a pack end code; the sequence ends with the transmission of a sequence end code (ISO11172 end code). An immediately lower level (pack layer) determines the organization of the packs and prescribes that each of them comprises, after the start code, timing information, the so-called system header and a number of audio and video packets for one or more channels; each packet comprises a header with service information, and the actual data. During decoding, the different types of packets present in a pack are demultiplexed and then decoded separately, by exploiting the service information present in the packs (start code, synchronization information and system header) and in the packet headers.
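By way of illustration only, the layered organization described above can be sketched as a scan for the 32-bit start codes that delimit the system layer. The start-code values and stream-identifier ranges are those of ISO/IEC 11172-1; the function names classify and scan_start_codes are hypothetical, and the naive byte search is an assumption of this sketch (it does not skip packet payloads, so it may report patterns that merely resemble start codes).

```python
# Start-code values defined by ISO/IEC 11172-1 (MPEG-1 Systems).
PACK_START = 0x000001BA
SYSTEM_HEADER_START = 0x000001BB
ISO_11172_END = 0x000001B9


def classify(code):
    """Map a 32-bit start code to a coarse label (hypothetical helper)."""
    if code == PACK_START:
        return "pack"
    if code == SYSTEM_HEADER_START:
        return "system_header"
    if code == ISO_11172_END:
        return "end"
    stream_id = code & 0xFF
    if 0xC0 <= stream_id <= 0xDF:
        return "audio_packet"
    if 0xE0 <= stream_id <= 0xEF:
        return "video_packet"
    return "other"


def scan_start_codes(data):
    """Yield (offset, label) for every 0x000001xx prefix found in data.

    Simplification: payload bytes are not skipped, so this is only a
    sketch of demultiplexing, not a conforming system-layer parser.
    """
    prefix = b"\x00\x00\x01"
    pos = data.find(prefix)
    while pos != -1 and pos + 4 <= len(data):
        code = int.from_bytes(data[pos:pos + 4], "big")
        yield pos, classify(code)
        pos = data.find(prefix, pos + 3)
```

A demultiplexer built along these lines would route the packets labelled audio_packet to the audio decoder and the packets labelled video_packet to the video decoder.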
In the case of audio signals, which is the one of interest for the present invention, the data inserted into the packets are organized into audio frames comprising a fixed number of samples. Coding is a sub-band coding, the bit allocation to the different sub-bands being determined on the basis of suitable human perception models. During the decoding phase, in addition to recovering the original audio signal, it is also necessary to solve the problem of synchronization with pictures belonging to the same transmission. The problem is made particularly arduous by the fact that, according to the standard, audio data can be sampled at a certain number of rates, in particular 32 kHz, 44.1 kHz and 48 kHz, and the 44.1 kHz rate has no practically usable multiple in common with the other two rates.
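The difficulty with the 44.1 kHz rate can be checked with a short calculation, given here as an illustration. The three sampling rates are those named above; the oversampling factor of 256 used for the master clocks is an assumption of this sketch, not a value taken from the text. The calculation relies on math.lcm, available from Python 3.9.

```python
import math
from fractions import Fraction

RATES_HZ = (32_000, 44_100, 48_000)

# 32 kHz and 48 kHz share a low common multiple.
common_32_48 = math.lcm(32_000, 48_000)       # 96_000 Hz

# Adding 44.1 kHz pushes the common multiple up by a factor of 147.
common_all = math.lcm(*RATES_HZ)              # 14_112_000 Hz

# The two rates stand in the irreducible ratio 147/160.
ratio = Fraction(44_100, 48_000)

# Assumption: master clocks run at 256 times the sampling rate.
OVERSAMPLING = 256
common_master = math.lcm(OVERSAMPLING * 44_100, OVERSAMPLING * 48_000)

print(common_32_48, common_all, ratio, common_master)
```

Under the assumed factor of 256, a single oscillator serving both clock families by integer division would have to run at about 1.8 GHz, which is why two separate master frequencies are the practical choice.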
A commercially available MPEG audio decoder directly generates the clock signal corresponding to the sampling rates of 32 and 48 kHz and obtains, from the latter, a second clock signal, related to the 44.1 kHz sampling rate, through the use of an accumulator which loads a fractional, user-programmable value, at each end-of-count of the counter generating said clock signal and which adds 1 to the count in progress when the accumulated value is more than one. This solution is not satisfactory because the correction is very abrupt and it cannot be tolerated by the output digital-to-analog converter, especially if the latter is of high quality. Moreover, the known device does not include any means for recovering possible phase shifts between the timing indications associated with the data stream (based on the clock signals generated by the encoder) and the clock signal generated by the decoder.
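The behaviour of such an accumulator can be illustrated with the following sketch, which is a simplified model and not a description of the commercial device. The function name accumulator_periods, the base count of 2 and the fractional value 26/147 are hypothetical; they correspond to deriving 147 output periods from 320 input cycles. The sketch also assumes that the extra cycle is added when the accumulated value reaches one.

```python
from fractions import Fraction


def accumulator_periods(base_count, fraction, n_periods):
    """Return the length, in input clock cycles, of n_periods output periods.

    At each end-of-count the accumulator adds `fraction`; when the
    accumulated value reaches one, the count in progress is stretched
    by a single input cycle (assumption: carry on >= 1).
    """
    acc = Fraction(0)
    periods = []
    for _ in range(n_periods):
        acc += fraction
        if acc >= 1:
            acc -= 1
            periods.append(base_count + 1)
        else:
            periods.append(base_count)
    return periods


# Hypothetical example: 320 input cycles spread over 147 output periods.
periods = accumulator_periods(base_count=2, fraction=Fraction(26, 147), n_periods=147)
average = Fraction(sum(periods), len(periods))
```

The average period is exact (320/147 input cycles), but every individual period is either 2 or 3 cycles long, so each correction is a full-cycle step. This is the abrupt correction that a high-quality digital-to-analog converter cannot tolerate.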
According to the invention, an audio decoder is provided instead wherein the correction of the second clock signal, too, is managed directly by the decoder, with no need to use external devices, and is performed in a smooth manner, and wherein, moreover, means are provided to recover any possible phase shift between the timing indications associated with the data stream and the clock signals generated by the decoder.
The invention provides a decoder for audio signals belonging to audio-visual streams digitally coded in accordance with standard ISO/IEC 11172, such audio signals being inserted into packets comprising a packet header with a first group of service words, and data words composed of audio signal samples inserted into frames comprising a pre-set number of audio samples and a frame header with a second group of service words. The decoder (DA) comprises:
interface means (IS) for receiving audio packets and programming and synchronization information from external units (DS, CN), which manage the system layer of the standard;
a parser (AS) of the audio packets, which receives the packets from the interface means (IS), recognizes the correctness of the configuration and of the sequence of the service words in the first group, and forwards the data contained in the packets to subsequent units when a presentation time stamp (PTS) for those data is recognized in the first group of service words;
means (DFA) for decoding the audio stream, which receive from the parser (AS) the content of the data words of the packets and decode it by exploiting the service words in the second group;
means (RS) for searching and checking the audio data synchronism, on the basis of information supplied by the parser (AS) and by the means (DFA) for decoding the audio stream; and
a presentation unit (UP) for supplying the decoded data to digital-to-analog conversion means, data presentation being possible with different sampling rates which can be derived from at least a first and a second master frequency, the first master frequency being also utilized to generate an internal clock signal (CLK24) for the components of the decoder (DA).
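As an illustration of the presentation time stamp that the parser (AS) looks for in the first group of service words, the following sketch decodes the five-byte time stamp field defined by ISO/IEC 11172-1, which carries a 33-bit value in units of a 90 kHz clock interleaved with marker bits. The function name decode_pts is hypothetical, and the sketch assumes the caller has already located the field inside the packet header.

```python
def decode_pts(field):
    """Decode a 5-byte MPEG-1 time stamp field into a 33-bit integer.

    Layout: 4-bit prefix, PTS[32..30], marker, PTS[29..15], marker,
    PTS[14..0], marker. Raises ValueError on a malformed field.
    """
    if len(field) != 5:
        raise ValueError("time stamp field must be 5 bytes long")
    b0, b1, b2, b3, b4 = field
    # '0010' announces a PTS alone, '0011' a PTS followed by a DTS.
    if (b0 >> 4) not in (0b0010, 0b0011):
        raise ValueError("no presentation time stamp prefix")
    if not (b0 & 1 and b2 & 1 and b4 & 1):
        raise ValueError("marker bit missing")
    return ((((b0 >> 1) & 0x07) << 30)
            | (b1 << 22)
            | ((b2 >> 1) << 15)
            | (b3 << 7)
            | (b4 >> 1))


# One second of presentation time equals 90 000 ticks.
one_second = decode_pts(bytes.fromhex("210005bf21"))
```

A parser along these lines would hold back the packet data until decode_pts succeeds on a packet header, and would then forward the data together with the decoded value.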
The decoder (DA) can further comprise means (SAV) managing audio-video synchronization, which are arranged to:
start the presentation of the audio signals, by comparing a first timing signal (SCR), supplied by the interface means (IS) and representative of a system clock which also times decoding and presentation of the video signals, and a second timing signal (PTS), taken from the stream of audio samples and consisting of said presentation time stamp, and
generate, independently, a first or a second clock signal (CLK24, CLK22) for the correct presentation of the audio signals with a sampling rate derived from the first or respectively from the second master frequency, and control these clock signals by using a feedback circuit which comprises a digital filter (FN) and operates in such a way as to minimize the difference between the first timing signal (SCR) and the second one (PTS), the first clock signal for the presentation of the audio signals coinciding with the internal clock signal of the device.
The means (SAV) managing audio-video synchronization can comprise:
means (ST1) for carrying out the comparison between the first and the second timing signals (SCR, PTS) and for providing a signal (DIFF) representative of the difference between said signals;
the digital filter (FN), which is a low-pass filter whose poles, zeros and gain can be programmed through the interface means (IS) and which is arranged to filter the difference signal (DIFF) supplied by the comparison means (ST1), if the value of this signal is within a pre-set interval, and to supply an error signal, when enabled by the data presentation unit (UP); and
a first and a second phase locked loop, comprising respectively a first and a second voltage-controlled oscillator (VCO1, VCO2), which are controlled by said error signal through respective digital-to-analog converters (DAC1, DAC2) and are arranged to generate and send to the presentation unit (UP), respectively the first or the second clock signal (CLK24, CLK22) for data presentation, depending on the required sampling rate.
In an initialization phase of the decoder (DA), the filter (FN) provides the converters (DAC1, DAC2) with an error signal corresponding to the central value of the pre-set interval.
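The synchronization mechanism described above can be pictured with the following behavioural sketch, which is a software model under stated assumptions and not the circuit itself. The class name SyncLoop, the one-pole low-pass filter, the coefficient values, the symmetric interval of plus or minus 4500 ticks, the 8-bit converter code and the sign convention (a positive difference raises the oscillator control value) are all hypothetical. The sketch also assumes that an out-of-interval difference leaves the previous error signal unchanged and that time stamp wrap-around can be ignored.

```python
def should_start(scr, pts):
    """Assumed start rule: present audio once the system clock reaches PTS."""
    return scr >= pts


class SyncLoop:
    """Toy model: comparison (ST1) -> low-pass filter (FN) -> converter code."""

    def __init__(self, alpha=0.05, gain=0.5, window=(-4500, 4500), dac_bits=8):
        self.alpha = alpha            # smoothing coefficient of the one-pole filter
        self.gain = gain              # converter codes per tick of filtered error
        self.window = window          # pre-set interval for the difference signal
        self.dac_mid = 1 << (dac_bits - 1)
        self.dac_max = (1 << dac_bits) - 1
        self.state = 0.0              # filter memory
        self.code = self.dac_mid      # initialization: central value

    def update(self, scr, pts):
        """Process one SCR/PTS pair and return the converter code."""
        diff = scr - pts              # signal DIFF
        low, high = self.window
        if low <= diff <= high:
            # One-pole low-pass: move a fraction alpha toward the new value.
            self.state += self.alpha * (diff - self.state)
            raw = self.dac_mid + round(self.gain * self.state)
            self.code = max(0, min(self.dac_max, raw))
        # Out-of-interval differences are not filtered; the code is held.
        return self.code
```

With a constant offset of 100 ticks between the two timing signals, the returned code in this model climbs from the central value toward its final value in steps of a few codes at most, in contrast with the single-cycle jumps of the accumulator approach.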
The presentation unit (UP) can comprise:
a data presentation register (RPD) for the serial emission of the decoded samples on a decoder output;
a first logic network (LC3) for controlling sample loading and emission by said register (RPD) and for generating synchronism signals (BCLK) for sample reading by utilization devices; and
a second logic network (LC4) which generates and supplies to the first logic network (LC3), on the basis of information on a data oversampling factor contained in the second group of service words, signals (LD, SHIFT, TWS) controlling data loading and shifting and the switching of the presentation channel, this second logic network (LC4) deriving said signals by processing the output signal of a counter (DCNT) whose counting capacity is equal to the capacity of said register (RPD) multiplied by a maximum value of the oversampling factor.
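One possible way of deriving the three control signals from a single counter is sketched below, purely as an illustration. The text does not specify the decoding of the counter output, so everything in this sketch is an assumption: the function name control_signals, the rule that a shift occurs every (maximum factor / current factor) counter ticks, the rule that a load replaces the shift at each word boundary, and the toggling of TWS at each load. The register width of 16 bits and the maximum factor of 8 in the example are likewise hypothetical.

```python
def control_signals(register_bits, max_oversampling, oversampling, n_ticks):
    """Derive (DCNT, LD, SHIFT, TWS) per tick from one free-running counter.

    The counter capacity equals the register capacity multiplied by the
    maximum oversampling factor, as stated in the text; the decoding of
    the count into LD, SHIFT and TWS is an assumption of this sketch.
    """
    if oversampling <= 0 or max_oversampling % oversampling:
        raise ValueError("oversampling factor must divide the maximum factor")
    capacity = register_bits * max_oversampling
    ticks_per_shift = max_oversampling // oversampling
    ticks_per_word = register_bits * ticks_per_shift
    tws = 0
    events = []
    for tick in range(n_ticks):
        dcnt = tick % capacity
        load = dcnt % ticks_per_word == 0
        # A load presents the first bit, so no shift is issued on that tick.
        shift = (not load) and dcnt % ticks_per_shift == 0
        if load:
            tws ^= 1                  # switch the presentation channel
        events.append((dcnt, load, shift, tws))
    return events


# Hypothetical example: 16-bit register, maximum factor 8, current factor 2.
events = control_signals(register_bits=16, max_oversampling=8, oversampling=2, n_ticks=128)
```

In this model a lower oversampling factor simply spaces the shift pulses further apart, so the same counter serves every factor without being reprogrammed.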