1. Technical Field
The present invention relates to signal processing devices for decoding a coded signal that is generated by coding a downmixed signal of a plurality of signals and information for dividing the downmixed signal into the original signals. The present invention particularly relates to techniques of decoding a coded signal that is generated by coding a phase difference and a level ratio between signals to realize coding of multichannel realism with a small amount of information.
2. Background Art
A technique called a spatial codec (spatial coding) has been developed in recent years. This technique aims for compression coding of multichannel realism with a very small amount of information. For example, while AAC, which is a multichannel codec already widely used as a digital television audio format, requires a bit rate of 512 kbps or 384 kbps for 5.1 channels, the spatial codec is intended for compression coding of multichannel signals at a very low bit rate such as 128 kbps, 64 kbps, or even 48 kbps.
As a technique for achieving this aim, for instance, a technique disclosed in Parametric Coding for High Quality Audio (Non-patent Document 1) standardized in MPEG Audio has been put to use. Non-patent Document 1 describes a process of decoding a signal that is generated by coding a phase difference and a level ratio between channels so as to realize compression coding of realism with a small amount of information.
FIG. 1 is a diagram showing a process of a conventional signal processing device disclosed in Non-patent Document 1.
Input signal S is the result of downmixing original signals of 2 channels into a monaural signal. Input signal S is fed to a processing module called decorrelation, which produces output signal D.
Decorrelation is described in detail in section 8.6.4.5.2 “Calculate decorrelated signal” in Non-patent Document 1, so a detailed explanation is omitted here. Roughly, decorrelation is made up of two processes.
A first process is delaying. This is a process of delaying an input signal by a predetermined time period. The delayed signal is then subjected to a second process called all pass filtering. All pass filtering is a process of decorrelating an input signal and also adding a reverberation component to the input signal.
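The two stages described above can be sketched as follows. This is only a minimal illustration, not the filter specified in Non-patent Document 1: the function names, the delay length, the all pass gain, and the all pass lag are all hypothetical values chosen for readability, and a single first-order Schroeder all pass section stands in for the many-tap filter of the actual standard.

```python
def delay(signal, num_samples):
    """Delay a signal by num_samples, keeping the original length."""
    padded = [0.0] * num_samples + list(signal)
    return padded[:len(signal)]


def allpass(signal, gain, lag):
    """Schroeder all pass section: y[n] = -g*x[n] + x[n-lag] + g*y[n-lag].

    The magnitude response is flat, but the energy of each input sample
    is spread over later output samples, which adds a reverberation tail.
    """
    out = []
    for n, x in enumerate(signal):
        x_lag = signal[n - lag] if n >= lag else 0.0
        y_lag = out[n - lag] if n >= lag else 0.0
        out.append(-gain * x + x_lag + gain * y_lag)
    return out


def decorrelate(signal, delay_samples=2, gain=0.5, lag=1):
    """Delay followed by all pass filtering (parameter values are assumptions)."""
    return allpass(delay(signal, delay_samples), gain, lag)
```

Calling `decorrelate` on a unit impulse such as `[1.0, 0, 0, 0, 0, 0, 0, 0]` returns a signal whose first nonzero sample appears only after the delay and whose energy then decays over the following samples, rather than staying concentrated in one sample.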
Signal D generated in this way and input signal S are then subjected to a process called mixing. This process, too, is described in detail in section 8.6.4.6.2 “Mixing” in Non-patent Document 1, so a detailed explanation is omitted here. In outline, the two signals S and D are multiplied by coefficients h11, h12, h21, and h22 and the multiplication results are added, as a result of which an L channel signal and an R channel signal are output. Expressions for this calculation are shown in the drawing.
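The mixing step can be illustrated with the short sketch below. Because the drawing is not reproduced in this text, the assignment of coefficients to channels is an assumption: h11 and h21 are taken to weight S and D for the L channel, and h12 and h22 to weight S and D for the R channel. The function name `mix` is hypothetical.

```python
def mix(s, d, h11, h12, h21, h22):
    """Mix downmixed signal s and decorrelated signal d into two channels.

    Assumed form (the drawing is not reproduced here):
        left[n]  = h11 * s[n] + h21 * d[n]
        right[n] = h12 * s[n] + h22 * d[n]
    """
    left = [h11 * si + h21 * di for si, di in zip(s, d)]
    right = [h12 * si + h22 * di for si, di in zip(s, d)]
    return left, right
```

Each output sample costs two multiplications and one addition per channel, so the mixing step itself is inexpensive compared with the decorrelation that precedes it.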
Here, coefficients h11, h12, h21, and h22 are determined by level ratio L and phase difference θ between the original signals of 2 channels from which the input monaural signal is derived. According to a method currently under standardization in MPEG, coefficients h11, h12, h21, and h22 are obtained according to the following expressions.
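Let θ be
$$\theta = \arccos(r)$$
where r denotes a correlation between the original signals of 2 channels.
Also, let δ be
$$\delta = \arctan\!\left(\frac{1-L}{1+L}\,\tan\frac{\theta}{2}\right).$$
Then
$$h_{11} = \frac{L}{\sqrt{1+L^{2}}}\,\cos\!\left(\delta+\frac{\theta}{2}\right)$$
$$h_{21} = \frac{L}{\sqrt{1+L^{2}}}\,\sin\!\left(\delta+\frac{\theta}{2}\right)$$
$$h_{12} = \frac{1}{\sqrt{1+L^{2}}}\,\cos\!\left(\delta-\frac{\theta}{2}\right)$$
$$h_{22} = \frac{1}{\sqrt{1+L^{2}}}\,\sin\!\left(\delta-\frac{\theta}{2}\right).$$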
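As an illustration of how much trigonometric work these expressions involve, the following sketch evaluates them directly. It assumes that the factor written as (1+L*L)0.5 denotes the square root of 1+L*L; the function and parameter names are hypothetical.

```python
import math


def mixing_coefficients(level_ratio, correlation):
    """Return (h11, h12, h21, h22) from level ratio L and correlation r.

    Assumes (1+L*L)0.5 in the text means sqrt(1 + L*L).
    """
    theta = math.acos(correlation)
    delta = math.atan(
        (1 - level_ratio) / (1 + level_ratio) * math.tan(theta / 2)
    )
    norm = math.sqrt(1 + level_ratio * level_ratio)
    h11 = level_ratio / norm * math.cos(delta + theta / 2)
    h21 = level_ratio / norm * math.sin(delta + theta / 2)
    h12 = 1 / norm * math.cos(delta - theta / 2)
    h22 = 1 / norm * math.sin(delta - theta / 2)
    return h11, h12, h21, h22
```

Under this reading, one set of coefficients requires one arc cos, one arc tan, one tan, two sin, and two cos evaluations plus a square root. For equal levels and fully correlated channels (L = 1, r = 1), θ and δ are both zero, so h21 and h22 vanish and no decorrelated signal is mixed in.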
The above expressions correspond to a method that has evolved from a mixing coefficient calculation method described in Non-patent Document 1. That is, the above expressions correspond to a mixing coefficient calculation method in a spatial codec, which is currently under standardization in MPEG.
As a result of the above process, when signals of 2 channels are generated from a monaural signal, the delay and the reverberation addition in decorrelation provide a sense of spaciousness and deliver favorable stereo signals.
Non-patent Document 1: ISO/IEC 14496-3: 2001/FDAM 2: 2004(E)
However, the above method has the following problems.
In a case where the input signal has an extremely sharp time variation (such as an instant at which a metal percussion instrument is struck), due to the effect of the delay and reverberation addition in the decorrelation process, the decorrelated signal loses the sharpness of the input signal. Since this decorrelated signal and input signal S are added in the mixing process that follows the decorrelation process, the resulting output signals will end up losing the sharpness of the input signal.
Likewise, in a case where frequency components of the input signal unevenly concentrate in a specific frequency band (such as when a timbre of one type of instrument continues), although a sound image of highly precise localization must be created, the effect of the delay and reverberation addition in the decorrelation process causes the sound image of precise localization to be blurred in the decorrelated signal. Since this decorrelated signal and input signal S are added in the mixing process that follows the decorrelation process, the resulting output signals will end up having a blurred sound image.
Also, the decorrelation process is implemented with a filter having a large number of taps in order to add a reverberation component. This requires an extremely large amount of computation.
Furthermore, the process of obtaining coefficients h11, h12, h21, and h22 from the information about the level ratio and the phase difference involves a complicated combination of a plurality of trigonometric functions, namely arc cos( ), arc tan( ), tan( ), sin( ), and cos( ), as mentioned above. This, too, requires a significantly large amount of computation.
The present invention was conceived in view of the above conventional problems. A first object of the present invention is to provide a signal processing device that can, when generating signals of 2 channels from a monaural signal, realize sharpness of a time variation of a sound and precise localization of a sound image, while providing a sense of spaciousness and producing favorable stereo signals.
A second object of the present invention is to reduce the amount of computation for the decorrelation process.
A third object of the present invention is to reduce the amount of computation for the process of obtaining coefficients h11, h12, h21, and h22.