1. Field of the Invention
The invention relates to a digital transmission system for transmitting a wide-band digital signal having a transmitter and a receiver, the digital transmission system transmitting a wide-band digital audio signal comprising at least a first and a second audio signal component via a transmission medium, the transmitter comprising an encoder having analysis filter means for filtering the signal components so as to obtain a number of n sub-signals for each of the at least two signal components, data reducing means for carrying out a data reduction step on the n sub-signals of each of the at least two signal components, and transmission means for transmitting data-reduced sub-signals via the transmission medium, and the receiver comprising receiving means for receiving the data-reduced sub-signals, data expansion means for carrying out a data expansion step on the received data-reduced sub-signals so as to obtain non-data-reduced replicas of the sub-signals, and a decoder having synthesis filter means for combining the replicas of the sub-signals for each of the at least two signal components so as to obtain a replica of the at least first and second signal components.
The invention further relates to a transmitter and a receiver in the transmission system, and to a record carrier obtained by means of the transmitter being in the form of a device for recording a signal on a record carrier.
2. Description of the Related Art
A transmission system of the type defined in the opening paragraphs is known from the article "The Critical Band Coder - Digital Encoding of Speech Signals Based on the Perceptual Requirements of the Auditory System" by M. E. Krasner, in Proc. IEEE ICASSP 80, Vol. 1, pp. 327-331, Apr. 9-11, 1980. This article relates to a transmission system in which the transmitter employs a sub-band coding system and the receiver employs a corresponding sub-band decoding system, but the invention is not limited to such a coding system, as will become apparent hereinafter.
In the system known from this publication, the speech signal band is divided into a plurality of sub-bands having bandwidths approximately corresponding to the bandwidths of the critical bands of the human ear in the respective frequency ranges (cf. FIG. 2 in the article by Krasner). This division has been selected because, on the basis of psycho-acoustic experiments, it is foreseeable that the quantization noise in such a sub-band will be masked to an optimum extent by the signals in this sub-band if, in the quantization, allowance is made for the noise-masking curve of the human ear (this curve giving the threshold value for noise masking in a critical band by a single tone in the center of the critical band, cf. FIG. 3 in the article by Krasner).
It should, however, be noted that the invention is not restricted to an encoding into sub-band signals. It is equally well possible to apply transform coding in the encoder, a transform coding being described in the publication "Low bit-rate coding of high-quality audio signals. An introduction to the MASCAM system" by G. Theile et al., in EBU Technical Review, No. 230 (August 1988).
In the case of a high-quality digital music signal, which, in conformity with the Compact Disc Standard, is represented by 16 bits per signal sample in the case of a sample frequency of 1/T=44.1 kHz, it is found that with a suitably selected bandwidth and a suitably selected quantization for the respective sub-bands, the use of this known sub-band coding system yields quantized output signals of the coder which can be represented by an average number of approximately 2.5 bits per signal sample, the quality of the replica of the music signal not differing perceptibly from that of the original music signal in substantially all passages of substantially all kinds of music signals.
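The figures quoted above can be put into bit-rate terms with a short calculation. The following is only an illustrative sketch for a single audio channel; the variable names are chosen here for illustration and do not appear in the cited publications.

```python
# Bit-rate arithmetic for the figures quoted above (one audio channel).
SAMPLE_RATE_HZ = 44_100       # Compact Disc sample frequency 1/T
PCM_BITS_PER_SAMPLE = 16      # Compact Disc word length
CODED_BITS_PER_SAMPLE = 2.5   # approximate average after sub-band coding

pcm_rate = SAMPLE_RATE_HZ * PCM_BITS_PER_SAMPLE       # bits per second
coded_rate = SAMPLE_RATE_HZ * CODED_BITS_PER_SAMPLE   # bits per second
reduction = PCM_BITS_PER_SAMPLE / CODED_BITS_PER_SAMPLE

print(f"PCM:   {pcm_rate} bit/s")
print(f"Coded: {coded_rate:.0f} bit/s")
print(f"Reduction factor: {reduction:.1f}")
```

Under these figures the coded signal occupies roughly one sixth of the original bit rate, which is the capacity gain the remainder of the description builds on.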
The sub-bands need not necessarily correspond to the bandwidths of the critical bands of the human ear. Alternatively, the sub-bands may have other bandwidths, for example, they may all have the same bandwidth, provided that allowance is made for this in determining the masking threshold.
It is an object of the invention to further improve the transmission characteristics of the transmission system.
This object is achieved in the digital transmission system described in the opening paragraphs, characterized in that the transmitter is further adapted to transmit at least a first auxiliary digital signal component, the transmitter comprising means for generating a composite transmission signal comprising frames, a frame including a first frame portion comprising synchronization information, a second frame portion comprising a block of data of the data-reduced sub-signals and a third frame portion located after the second frame portion in said frame, said third frame portion comprising a block of data of the auxiliary digital signal, and the receiver further comprising derivation means for deriving said at least one auxiliary digital signal component from third frame portions in frames of the composite transmission signal received.
The invention is based on the recognition that data reducing the first and second digital audio signal components results in space becoming available on the transmission medium; this space can then be used for transmitting at least one auxiliary signal component.
The transmission system may be further characterized in that the transmitter comprises control signal generator means for generating a signal type control signal indicating the type of said at least one auxiliary digital signal component, the transmitter further being adapted to transmit said signal type control signal, and the receiver further comprising detection means for detecting said signal type control signal, the derivation means being adapted to derive, in response to said signal type control signal, said at least one auxiliary digital signal component from the signal received. This enables the receiver to retrieve the auxiliary signal component of a specific type from the transmission signal received.
The transmission system may be further characterized in that the transmitter comprises control signal generator means for generating a signal presence control signal indicating the presence or absence of said at least one auxiliary digital signal component, the transmitter further being adapted to transmit said signal presence control signal, and the receiver further comprising detection means for detecting said signal presence control signal, the derivation means being adapted to derive, in response to said signal presence control signal, said at least one auxiliary digital signal component from the signal received.
This enables the transmission of packets of auxiliary information in only those frames of the transmission signal that have sufficient space to store these packets in them. The receiver is now capable of identifying those frames so that the retrieval of the auxiliary signal component can be realized.
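The frame structure and the presence control signal described above may be pictured with the following minimal sketch. It is not the claimed implementation: the synchronization pattern, the byte-oriented capacity, the placement of the presence flag and all names (`Frame`, `build_frame`, `derive_aux`) are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import List

SYNC = b"\xff\xf0"  # hypothetical synchronization pattern


@dataclass
class Frame:
    sync: bytes          # first frame portion: synchronization information
    aux_present: bool    # signal presence control signal (placement assumed)
    audio_block: bytes   # second frame portion: data-reduced sub-signal data
    aux_block: bytes     # third frame portion: auxiliary data, may be empty


def build_frame(audio_block: bytes, aux_queue: List[bytes], capacity: int) -> Frame:
    """Insert the next auxiliary packet only if the frame has room for it."""
    free = capacity - len(SYNC) - len(audio_block)
    if aux_queue and len(aux_queue[0]) <= free:
        return Frame(SYNC, True, audio_block, aux_queue.pop(0))
    return Frame(SYNC, False, audio_block, b"")


def derive_aux(frames: List[Frame]) -> bytes:
    """Receiver side: collect auxiliary data from frames flagged as carrying it."""
    return b"".join(f.aux_block for f in frames if f.aux_present)


queue = [b"AUX1", b"AUX2"]
frames = [
    build_frame(b"a" * 10, queue, capacity=16),  # 4 bytes free: packet fits
    build_frame(b"a" * 14, queue, capacity=16),  # no free space: no packet
    build_frame(b"a" * 8, queue, capacity=16),   # 6 bytes free: packet fits
]
print([f.aux_present for f in frames], derive_aux(frames))
```

The presence flag lets the receiver skip frames in which the data-reduced audio left no room for an auxiliary packet.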
The auxiliary signal component can also be split into sub-signals or sub-band signals prior to encoding into the transmission signal. In the case of an encoding into a sub-band signal, allocation information and scale factor information corresponding to time equivalent signal blocks of the auxiliary signal component should be transmitted as well.
The auxiliary signal component can be an additional audio signal component, such as a surround sound signal component.
It is a further object of the invention to provide a number of steps for the transmission system, in particular, a very specific choice for the format with which the digital wide-band signal, after conversion into the second digital signal, can be transmitted via the transmission medium, in such a way that a flexible and highly versatile transmission system is obtained. This is to be understood to mean that the transmitter should be capable of converting wide-band digital signals of different formats (these formats differing, inter alia, with respect to the sample frequency Fs of the wide-band digital signal, which may have different values, such as 32 kHz, 44.1 kHz and 48 kHz, as laid down in the digital audio interface standard of the AES and the EBU) into the second digital signal. Similarly, the receiver should be capable of deriving a wide-band signal of the correct format from said second digital signal. To this end, the transmission system in accordance with the invention is characterized in that if P in the formula
P = (BR × ns)/(N × Fs)
is an integer, where BR is the bit rate of the second digital signal, N is the number of bits in an information packet, and ns is the number of samples of the wide-band digital signal whose corresponding information, which belongs to the second digital signal, is included in one frame of the second digital signal, the number of information packets B in one frame is P, and in that, if P is not an integer, the number of information packets in a number of the frames is P′, P′ being the next lower integer following P, and the number of information packets in the other frames is equal to P′+1 so as to exactly comply with the requirement that the average frame rate of the second digital signal should be substantially equal to Fs/ns, and that a frame should comprise at least a first frame portion including the synchronizing information.
The purpose of dividing the frames into B information packets is that, for a wide-band digital signal of an arbitrary sample frequency Fs, the average frame rate of the second digital signal transmitted by the transmitter is now such that the duration of a frame in the second digital signal corresponds to the duration occupied by ns samples of the wide-band signal. Moreover, this enables the synchronization to be maintained on an information-packet basis, which is simpler and more reliable than maintaining the synchronization on a bit basis. Thus, in those cases where P is not an integer, the transmitter is capable, at instants at which this is possible and also necessary, of providing a frame with P′+1 instead of P′ information packets, so that the average frame rate of the second digital signal can be maintained equal to Fs/ns. Since, in this case, the spacing between the synchronizing information (synchronizing signals or synchronizing words) included in the first frame portion of succeeding frames is also an integral multiple of the length of an information packet, it remains possible to maintain the synchronization on an information-packet basis.
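The rule for the number of information packets per frame can be illustrated with a short sketch. It assumes that N denotes the number of bits in an information packet, and it uses example values (BR = 128 kbit/s, ns = 384, N = 32) that are chosen here only for illustration. The accumulator used to decide which frames receive the extra packet is likewise one possible scheduling strategy, not necessarily the one used by the transmitter.

```python
from fractions import Fraction
from math import floor
from typing import List


def packets_per_frame(bit_rate: int, ns: int, n_bits: int, fs: int) -> Fraction:
    """P = (BR x ns) / (N x Fs), kept exact as a fraction."""
    return Fraction(bit_rate * ns, n_bits * fs)


def frame_lengths(p: Fraction, count: int) -> List[int]:
    """Return packet counts for `count` frames, using P' or P'+1 per frame
    so that the running average stays equal to P."""
    base = floor(p)            # P', the next lower integer
    frac = p - base            # fractional part still to be made up
    acc = Fraction(0)
    lengths = []
    for _ in range(count):
        acc += frac
        if acc >= 1:           # enough deficit accumulated: add one packet
            lengths.append(base + 1)
            acc -= 1
        else:
            lengths.append(base)
    return lengths


p_441 = packets_per_frame(128_000, 384, 32, 44_100)   # not an integer
p_48 = packets_per_frame(128_000, 384, 32, 48_000)    # an integer
lengths = frame_lengths(p_441, 147)
print(p_441, sorted(set(lengths)), sum(lengths))
print(p_48, frame_lengths(p_48, 3))
```

With these example values, Fs = 48 kHz yields an integer P, so every frame has the same length, whereas Fs = 44.1 kHz yields a mixture of frames with P′ and P′+1 packets whose average equals P exactly over one full cycle.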
Preferably, the first frame portion further contains information related to the number of information packets in a frame. In a frame comprising B information packets, this information may be equal to the value B. This means that this information corresponds to P′ for frames comprising P′ information packets, and to P′+1 for frames comprising P′+1 information packets. Another possibility is that this information corresponds to P′ for all frames, regardless of whether a frame comprises P′ or P′+1 information packets. The additionally inserted (P′+1)th information packet may comprise, for example, merely "zeros". In that case, this information packet does not contain any useful information. Of course, the additional information packet may also be filled with useful information.
The first frame portion may further comprise system information. This may include the sample frequency Fs of the wide-band digital signal applied to the transmitter, copy-protection codes, the type of wide-band digital signal applied to the transmitter, such as a stereo-audio signal or a mono-audio signal, or a digital signal comprising two substantially independent audio signals. However, other system information is also possible, as will become apparent hereinafter. Including the system information makes it possible for the receiver to be also flexible and enables the received second digital signal to be correctly reconverted into the wide-band digital signal.
The second and the third frame portions of a frame contain signal information. The transmitter may comprise a coder comprising signal-splitting means responsive to the wide-band digital signal to generate a second digital signal in the form of a number of M sub-signals, M being larger than 1, and comprising means for quantizing the respective sub-signals. For this purpose, an arbitrary transform coding, such as the fast Fourier transform (FFT), may be used.
In that case, the transmission system is characterized in that the second frame portion of a frame contains allocation information which, for at least a number of sub-signals, indicates the number of bits representing the samples of the quantized sub-signals derived from said sub-signals, and in that the third frame portion contains the samples of at least said quantized sub-signals (if present). At the receiving end, it is then necessary to apply an inverse transform coding, for example, an inverse Fourier transform (IFFT), to recover the wide-band digital signal. The transmission system, in which the signal-splitting means takes the form of analysis-filter means responsive to the wide-band digital signal to generate a number of M sub-band signals, this analysis-filter means dividing the signal band of the wide-band digital signal, using a sample-frequency reduction, into successive sub-bands having band numbers m increasing with the frequency, and in which the quantization means is adapted to quantize the respective sub-band signals block by block, is a system employing sub-band coding as described above. Such a transmission system is characterized further in that, for at least a number of the sub-band signals, the allocation information in the second frame portion of a frame specifies the number of bits representing the samples of the quantized sub-band signals derived from said sub-band signals, and in that the third frame portion contains the samples of at least said quantized sub-band signals (if present). This means, in fact, that the allocation information is inserted in a frame before the samples. This allocation information is needed to enable the continuous serial bit stream of the samples in the third frame portion to be subdivided into the various individual samples of the correct number of bits at the receiving end. The allocation information may require that all samples are represented by a fixed number of bits per sub-band per frame. 
This is referred to as a transmitter based on fixed or static bit allocation. The allocation information may also imply that a number of bits variable in time is used for the samples in a sub-band. This is referred to as a transmitter based on the system of adaptive or dynamic bit allocation. Fixed and adaptive bit allocation are described, inter alia, in the publication "Low bit-rate coding of high-quality audio signals. An introduction to the MASCAM system" by G. Theile et al., EBU Technical Review, No. 230 (August 1988).
Inserting the allocation information in a frame before the samples has the advantage that, at the receiving end, a simpler decoding becomes possible, which can be carried out in real time and which produces only a slight signal delay. As a result of this sequence, it is no longer necessary to first store all the information in the third frame portion in a memory in the receiver. Upon arrival of the second digital signal, the allocation information is stored in a memory in the receiver. The information content of the allocation information is much smaller than the information content of the samples in the third frame portion, so that a substantially smaller storage capacity is needed than in the case that all the samples would have to be stored in the receiver. Immediately upon arrival of the serial data stream of the samples in the third frame portion, this data stream can be divided into the various samples having the number of bits specified by the allocation information, so that no previous storage of the signal information is necessary. The allocation information for all the sub-bands can be included in a frame. However, this is not necessary, as will become apparent hereinafter.
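How allocation information received first allows the serial sample stream to be subdivided on the fly may be sketched as follows. This is a simplified illustration under several assumptions: samples are treated as unsigned integers, the stream is ordered sample by sample across the sub-bands, and the names `bits_of` and `split_samples` are hypothetical.

```python
from typing import Iterator, List


def bits_of(data: bytes) -> Iterator[int]:
    """Yield the bits of a byte string, most significant bit first."""
    for byte in data:
        for shift in range(7, -1, -1):
            yield (byte >> shift) & 1


def split_samples(data: bytes, allocation: List[int],
                  samples_per_band: int) -> List[List[int]]:
    """Cut the serial stream into samples using the per-band bit widths.

    `allocation[m]` is the number of bits per sample in sub-band m;
    a width of 0 means that no samples are present for that sub-band.
    """
    stream = bits_of(data)
    bands: List[List[int]] = [[] for _ in allocation]
    for _sample in range(samples_per_band):
        for band, width in enumerate(allocation):
            if width == 0:
                continue
            value = 0
            for _bit in range(width):
                value = (value << 1) | next(stream)
            bands[band].append(value)
    return bands


# Bits 1010 11 0001 10 followed by four padding bits.
decoded = split_samples(bytes([0xAC, 0x60]), allocation=[4, 0, 2],
                        samples_per_band=2)
print(decoded)
```

Only the small allocation table has to be held in memory; each sample is complete as soon as its last bit arrives.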
The transmission system may be characterized further in that, in addition, the third frame portion includes information related to scale factors, a scale factor being associated with at least one of the quantized sub-band signals contained in the third frame portion, and in that the scale factor information is included in the third frame portion before the quantized sub-band signals. The samples can be coded in the transmitter without being normalized, i.e., without the amplitudes of a block of samples in a sub-band having been divided by the amplitude of the sample having the largest amplitude in this block. In that case, no scale factors have to be transmitted. If the samples are normalized during coding, scale factor information has to be transmitted to provide a measure of said largest amplitude. If, in this case, the scale factor information is also inserted in the third frame portion before the samples, it is possible that during reception, the scale factors to be derived from said scale information are first stored in a memory and the samples are multiplied immediately upon arrival, i.e., without a time delay, by the inverse values of said scale factors. The scale factor information may be constituted by the scale factors themselves. It is obvious that a scale factor as inserted in the third frame portion may also be the inverse of the amplitude of the largest sample in a block, so that in the receiver, it is not necessary to determine the inverse value and, consequently, decoding can be faster. Alternatively, the values of the scale factors may be encoded prior to insertion in the third frame portion as scale factor information and subsequent transmission. Moreover, it is evident that if, after quantization in the transmitter, the sub-band signal in a sub-band is zero, which obviously will be apparent from the allocation information for the sub-band, no scale factor information for this sub-band has to be transmitted. 
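The normalization step can be illustrated with a minimal sketch. It assumes, purely for illustration, that the transmitted scale factor is the largest amplitude in the block itself, so that the receiver restores the samples by a single multiplication; the text above also allows the inverse or an encoded value to be transmitted instead. The function names are hypothetical.

```python
from typing import List, Tuple


def normalize_block(block: List[float]) -> Tuple[float, List[float]]:
    """Transmitter side: divide a block by its largest amplitude."""
    scale = max(abs(x) for x in block)
    if scale == 0.0:
        # An all-zero block needs no scale factor to be transmitted.
        return 0.0, list(block)
    return scale, [x / scale for x in block]


def denormalize_block(scale: float, normalized: List[float]) -> List[float]:
    """Receiver side: restore amplitudes as soon as each sample arrives."""
    return [x * scale for x in normalized]


scale, normalized = normalize_block([0.5, -2.0, 1.0])
restored = denormalize_block(scale, normalized)
print(scale, normalized, restored)
```

Because the scale factor precedes the samples in the frame, the receiver can apply it to each sample on arrival without buffering the block.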
The transmission system, in which the receiver comprises a decoder comprising synthesis filter means responsive to the respective quantized sub-band signals to construct a replica of the wide-band digital signal, this synthesis filter means combining the sub-bands applying sample-frequency increase to form the signal band of the wide-band digital signal, may be characterized in that the samples of the sub-band signals (if present) are inserted in the third frame portion in a sequence corresponding to the sequence in which said samples are applied to the synthesis filter means upon reception in the receiver. Inserting the samples in the third frame portion in the same sequence as that in which they are applied to the synthesis filter means in the receiver also results in fast decoding, which again does not require additional storage of the samples in the receiver before they can be further processed. Consequently, the storage capacity required in the receiver can be limited substantially to the storage capacity needed for the storage of the system information, the allocation information and, if applicable, the scale factor information. Moreover, a limited signal delay is produced, which is mainly the result of the signal processing performed upon the samples. The allocation information for the various quantized sub-band signals is suitably inserted in the second frame portion in the same sequence as that in which the samples of the sub-band signals are included in the third frame portion. The same applies to the sequence of the scale factors. If desired, the frames may also be divided into four portions, the first, the second and the third frame portions being as described hereinbefore. The last (fourth) frame portion in the frame may then contain error-detection and/or error-correction information. Upon reception of this information in the receiver, it is possible to apply a correction for errors produced in the second digital signal during transmission. 
As already stated, the wide-band digital signal may be a monophonic signal. Alternatively, the wide-band digital signal may be a stereo audio signal made up of a first (left) channel component and a second (right) channel component. If the transmission system is based on a sub-band coding system, the transmitter will supply sub-band signals each comprising a first and a second sub-band signal component, which, after quantization in the quantization means, are converted to form first and second quantized sub-band signal components. In this case, the frames should also include allocation information and scale-factor information (if the samples have been scaled in the transmitter). The sequence is also important here. It is obvious that the system can be extended to handle a wide-band digital signal comprising more than two signal components.
The inventive steps may be applied to digital transmission systems, for example, systems for the transmission of digital audio signals (digital audio broadcast) via the ether. However, other uses are also conceivable. An example of this is a transmission via optical or magnetic media. Optical-media transmissions may be, for example, transmissions via glass fibers or by means of optical discs or tapes. Magnetic-media transmissions are possible, for example, by means of a magnetic disc or a magnetic tape. The second digital signal is then stored in the format as proposed by the invention in one or more tracks of a record carrier, such as an optical or magnetic disc or a magnetic tape.
The versatility and flexibility of the transmission system thus resides in the special format with which the information in the form of the second digital signal is transmitted, for example, via a record carrier. This is combined with the special construction of the transmitter which is capable of generating this special format for various types of input signals. The transmitter generates the system information required for every type of signal and inserts this information in the data stream to be transmitted. At the receiving end, this is achieved by means of a specific receiver, which extracts said system information from the data stream and employs it for a correct decoding.
The information packets then constitute a kind of fictitious unit, which is used to define the length of a frame. This means that they need not be explicitly discernible in the information stream of the second digital signal. Moreover, the information packets can be related to the existing digital audio interface standard as defined in IEC Standard No. 958. This standard, as normally applied to consumer products, defines frames containing one sample of both the left-hand and the right-hand channels of a stereo signal. These samples are represented by means of 16-bit two's complement words.
If N=32 is selected, one frame of this digital audio interface standard can transmit exactly one information packet of the second digital signal. In the digital audio interface standard, the frame rate is equal to the sample rate. For the present purpose, the frame rate should be selected to be equal to BR/N. This enables the ICs currently employed in standard digital audio interface equipment to be used.
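The choice N=32 and the resulting interface frame rate can be checked with a few lines of arithmetic. The bit rates used below are example values assumed for illustration only, and the names are hypothetical.

```python
# One interface frame carries one 16-bit sample for each of two channels.
BITS_PER_SAMPLE = 16
CHANNELS = 2
N = BITS_PER_SAMPLE * CHANNELS   # payload bits per interface frame


def interface_frame_rate(bit_rate: int, packet_bits: int = N) -> float:
    """Frames per second needed to carry the second digital signal: BR/N."""
    return bit_rate / packet_bits


for bit_rate in (128_000, 384_000):   # example bit rates, assumed here
    print(bit_rate, interface_frame_rate(bit_rate))
```

Each interface frame then carries exactly one information packet, and the interface is simply clocked at BR/N frames per second instead of at the audio sample rate.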