1. Field of the Invention
The present invention generally relates to a sound image localization method and apparatus, and also to a sound image control apparatus. More specifically, the present invention is directed to a sound image localization apparatus and a sound image localization method capable of localizing a sound image at an arbitrary position within a three-dimensional space, which are used in, for instance, electronic musical instruments, game machines, and acoustic appliances (e.g. mixers). Also, the present invention is directed to a delay amount control apparatus that simulates, by varying a delay amount, the inter aural time difference that changes as a sound image moves, and also to a sound image control apparatus for moving a sound image by employing this delay amount control apparatus.
2. Description of the Related Art
Conventionally, such a technical idea is known in the field that 2-channel stereophonic signals are produced, and these stereophonic signals are supplied to right/left speakers so as to simultaneously produce stereophonic sounds, so that sound images may be localized. In accordance with this sound image localization technique, the sound images are localized by changing the balance in the right/left sound volume, so that the sound images could be localized only between the right/left speakers.
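The volume-balance technique described above may be sketched as follows. This is only an illustration: the function name pan_balance and the linear weighting are assumptions and are not taken from any particular apparatus.

```python
def pan_balance(sample, balance):
    """Split one monaural sample into left/right speaker signals.

    balance is an assumed control value: 0.0 is fully left, 1.0 is fully right.
    A linear weighting is assumed here purely for illustration.
    """
    left = sample * (1.0 - balance)
    right = sample * balance
    return left, right
```

Because only the relative volume of the two channels changes, the perceived position in such a sketch can only move along the line between the two speakers.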
In contrast, several techniques have recently been developed by which sound images can be localized at an arbitrary position within a three-dimensional space. One conventional sound image localization apparatus using such a technique processes an input signal by employing a head related acoustic transfer function so as to localize a sound image. Here, a head related acoustic transfer function is a function describing the transfer system in which a sound wave produced from a sound source is subjected to effects such as reflection, diffraction, and resonance caused by the head, the external ear, the shoulders, and so on, and then reaches an ear (tympanic membrane) of a human body.
In this conventional sound image localization apparatus, when sounds are heard by using a headphone, first to fourth head related acoustic transfer functions are measured in advance. That is, the first head related acoustic transfer function of a path defined from the sound source to a left ear of an audience is previously measured. The second head related acoustic transfer function of a path defined from the sound source to a right ear of the audience is previously measured. The third head related acoustic transfer function of a path defined from a left headphone speaker to the left ear of the audience is previously measured, and the fourth head related acoustic transfer function of a path defined from the right headphone speaker to the right ear of this audience is previously measured. Then, the signals supplied to the left headphone speaker are controlled in such a manner that the sounds processed by employing the first head related acoustic transfer function and the third head related acoustic transfer function are made equal to each other near the left external ear of the audience. Also, the signals supplied to the right headphone speaker are controlled in such a manner that the sounds processed by employing the second head related acoustic transfer function and the fourth head related acoustic transfer function are made equal to each other near the right external ear of the audience. As a consequence, the sound image can be localized at the sound source position.
When the sounds are heard by using speakers, head related acoustic transfer functions of paths defined from the left speaker to the right ear and from the right speaker to the left ear are furthermore measured. While employing these head related acoustic transfer functions, the sounds which pass through these paths and then reach the audience (will be referred to as "crosstalk sounds" hereinafter) are removed from the sounds produced by using the speakers. As a consequence, since a similar sound condition to that of the headphone can be established, the sound image can be localized at the sound source position.
One example of the above-described conventional sound image localization apparatus is shown in FIG. 1. In FIG. 1, a data memory 50 stores a plurality of coefficient sets. Each coefficient set is constructed of a delay coefficient, a filter coefficient, and an amplification coefficient. Each of these coefficient sets corresponds to a direction of a sound source as viewed from an audience, namely a direction (angle) along which a sound image is localized. For instance, in such a sound image localization apparatus for controlling the sound image localization direction every 10 degrees, 36 coefficient sets are stored in this data memory. The externally supplied sound image localization direction data may determine which coefficient set is read out from this data memory. Then, the delay coefficient contained in the read coefficient set is supplied to a time difference signal producing device 51, the filter coefficient is supplied to a left head related acoustic transfer function processor 52 and also to a right head related acoustic transfer function processor 53, and further the amplification coefficient is supplied to a left amplifier 54 and a right amplifier 55.
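The selection of a coefficient set from the data memory according to the sound image localization direction data may be sketched as follows. The names used here (STEP_DEG, data_memory, coefficient_set_index, read_coefficient_set), the placeholder coefficient values, and the rounding of an arbitrary angle to the nearest stored direction are all assumptions made for illustration.

```python
STEP_DEG = 10                  # direction resolution taken from the 10-degree example
NUM_SETS = 360 // STEP_DEG     # 36 coefficient sets

# Hypothetical data memory: one set per direction, filled with placeholder values.
data_memory = [
    {"direction_deg": i * STEP_DEG, "delay": 0, "filter": [0.0] * 7, "gain": 1.0}
    for i in range(NUM_SETS)
]

def coefficient_set_index(direction_deg):
    """Map an angle to the index of the nearest stored direction (assumed policy)."""
    wrapped = direction_deg % 360
    return int((wrapped + STEP_DEG / 2) // STEP_DEG) % NUM_SETS

def read_coefficient_set(direction_deg):
    """Read out the delay, filter, and amplification coefficients for a direction."""
    return data_memory[coefficient_set_index(direction_deg)]
```

With a 10-degree resolution the table holds 36 entries; halving the step to 5 degrees would double the table to 72 entries.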
The time difference signal producing device 51 is arranged by, for example, a delay device, and may simulate a difference between a time when a sound produced from a sound source reaches a left ear of an audience, and another time when this sound reaches a right ear of this audience (will be referred to as an "inter aural time difference" hereinafter). For example, both a monaural input signal and a delay coefficient are inputted into this time difference signal producing device 51.
Here, the direction of a sound source as viewed from an audience, namely the direction (angle) along which a sound image is localized, is defined as illustrated in FIG. 2. It is assumed that the direction directly in front of the audience is zero (0) degrees. In general, an inter aural time difference becomes minimum when the sound source is located in the zero-degree direction, increases as the sound source moves from this zero-degree direction toward the 90-degree direction, and becomes maximum in the 90-degree direction. Furthermore, the inter aural time difference decreases as the sound source moves from the 90-degree direction toward the 180-degree direction, and becomes minimum in the 180-degree direction. Similarly, the inter aural time difference increases as the sound source moves from the 180-degree direction toward the 270-degree direction, and becomes maximum in the 270-degree direction. The inter aural time difference then decreases as the sound source moves from the 270-degree direction toward the zero-degree (360-degree) direction, and becomes minimum in the zero-degree direction again. The delay coefficients supplied to the time difference signal producing device 51 have values corresponding to the respective angles.
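The angular dependence described above (minimum at 0 and 180 degrees, maximum at 90 and 270 degrees) may be illustrated with a simple model. The sinusoidal shape, the constant MAX_ITD_SECONDS, and the function names are assumptions chosen only to reproduce that qualitative behavior; they are not measured values.

```python
import math

# Assumed placeholder for the largest inter aural time difference, in seconds.
MAX_ITD_SECONDS = 0.00065

def itd_seconds(direction_deg):
    """Hypothetical model: the time difference follows |sin(angle)|."""
    return MAX_ITD_SECONDS * abs(math.sin(math.radians(direction_deg)))

def itd_samples(direction_deg, sample_rate_hz):
    """Convert the modeled time difference into a whole number of delay samples."""
    return round(itd_seconds(direction_deg) * sample_rate_hz)
```

A delay coefficient for each stored direction could then be derived from a model of this kind, or from measurement.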
When sound image localization direction data indicating an angle of 0 degrees or more and less than 180 degrees is inputted, the time difference signal producing device 51 outputs the input signal directly (or after delaying it by a predetermined time) as a first time difference signal, and also outputs a second time difference signal that is delayed from this first time difference signal by the inter aural time difference corresponding to the delay coefficient. Similarly, when sound image localization direction data indicating an angle of 180 degrees or more and less than 360 degrees is inputted, the time difference signal producing device 51 outputs the input signal directly (or after delaying it by a predetermined time) as the second time difference signal, and also outputs the first time difference signal that is delayed from this second time difference signal by the inter aural time difference corresponding to the delay coefficient. The first time difference signal produced from the time difference signal producing device 51 is supplied to the left head related acoustic transfer function processor 52, and the second time difference signal produced therefrom is supplied to the right head related acoustic transfer function processor 53.
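The routing performed by the time difference signal producing device 51 may be sketched as follows. The function name, the representation of signals as lists of samples, and the expression of the delay coefficient as a whole number of samples are assumptions made for illustration; the optional predetermined delay is omitted.

```python
def time_difference_signals(signal, direction_deg, delay_samples):
    """Return (first, second) time difference signals for a monaural input.

    For 0 <= direction < 180 the first signal is undelayed and the second is delayed;
    for 180 <= direction < 360 the roles are swapped.
    Zero padding keeps both outputs the same length.
    """
    direct = list(signal) + [0.0] * delay_samples
    delayed = [0.0] * delay_samples + list(signal)
    if direction_deg % 360 < 180:
        return direct, delayed
    return delayed, direct
```

For example, with a one-sample delay and a direction of 90 degrees, the second signal lags the first by one sample.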
The left head related acoustic transfer function processor 52 is constituted by, for instance, a sixth-order FIR filter, and simulates a head related acoustic transfer function of a sound entered into the left ear of the audience. The above-described first time difference signal and a filter coefficient for a left channel are entered into this left head related acoustic transfer function processor 52. The left head related acoustic transfer function processor 52 convolves the impulse response of the head related acoustic transfer function with the input signal by employing the filter coefficient for the left channel as the coefficient of the FIR filter. The signal processed by this left head related acoustic transfer function processor 52 is supplied to an amplifier 54 for the left channel.
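The convolution carried out by such an FIR filter may be sketched as follows. The function name and the coefficient values used in any example are placeholders, and it is assumed here that a sixth-order filter has seven coefficients (taps).

```python
def fir_filter(signal, coeffs):
    """Direct-form FIR filter: y[n] = sum over k of coeffs[k] * x[n - k].

    Samples before the start of the signal are treated as zero.
    """
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, c in enumerate(coeffs):
            if n - k >= 0:
                acc += c * signal[n - k]
        out.append(acc)
    return out
```

Feeding a unit impulse into this sketch returns the coefficients themselves, which is why the filter coefficients can be regarded as the sampled impulse response of the head related acoustic transfer function. Each output sample costs one multiplication and one addition per coefficient.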
The right head related acoustic transfer function processor 53 simulates a head related acoustic transfer function of a sound entered into the right ear of the audience. Unlike the left head related acoustic transfer function processor 52, this right head related acoustic transfer function processor 53 receives the above-described second time difference signal and a filter coefficient for a right channel. Other arrangements and operations of this right head related acoustic transfer function processor 53 are similar to those of the above-explained left head related acoustic transfer function processor 52. A signal processed by this right head related acoustic transfer function processor 53 is supplied to an amplifier 55 for a right channel.
The amplifier 54 for the left channel simulates a sound pressure level of a sound entered into the left ear of the audience, and outputs the simulated sound pressure level as the left channel signal. Similarly, the amplifier 55 for the right channel simulates a sound pressure level of a sound entered into the right ear, and outputs the simulated sound pressure level as the right channel signal. With employment of this arrangement, for instance, when the sound source is directed along the 90-degree direction, the sound pressure level of the sound entered into the left ear becomes maximum, whereas the sound pressure level of the sound entered into the right ear becomes minimum.
In accordance with the sound image localization apparatus with employment of above-explained arrangement, when the sounds are heard by using the headphone, no extra device is additionally required, whereas when the sounds are heard by using the speakers, the means for canceling the crosstalk sounds is further provided, so that the sound image can be localized at an arbitrary position within the three-dimensional space.
However, since the left head related acoustic transfer function processor and the right head related acoustic transfer function processor are separately provided in this conventional sound image localization apparatus, filters amounting to 12 orders in total are required. As a result, in the case that these right/left head related acoustic transfer function processors are constituted by hardware, a large number of delay elements and amplifiers is required, resulting in a high-cost and bulky sound image localization apparatus. In the case that the right/left head related acoustic transfer function processors are constituted by executing software programs on a digital signal processor (will be referred to as a "DSP" hereinafter), a very large amount of processing is required. As a consequence, since a DSP operable at high speed is required so as to process the data in real time, the sound image localization apparatus becomes high in cost.
Furthermore, since a coefficient set must be stored for every sound image localization direction, a memory having a large memory capacity is required. To control the direction (angle) along which the sound image is localized more finely in order to improve the precision of the sound image localization, a memory having an even larger memory capacity is needed. There is another problem that the real time data processing operation is deteriorated, because the coefficient sets must be replaced every time the direction along which the sound image is localized is changed.
On the other hand, another conventional sound image localization apparatus capable of not only localizing the sound image, but also moving the sound image has been developed. As an apparatus to which the technique for moving the sound image has been applied, for instance, Japanese Laid-open Patent Application (JP-A-Heisei 04-30700) discloses a sound image localization apparatus. This disclosed sound image localization apparatus is equipped with sound image localizing means constituted by delay devices and higher-order filters. The head related acoustic transfer function is simulated by externally setting parameters made up of the delay coefficient and the filter coefficient. This head related acoustic transfer function differs depending upon the localization position of the sound image as viewed from the audience. Therefore, in order that the sound image can be localized at a large number of positions, this conventional sound image localization apparatus holds a large quantity of parameters corresponding to the respective localization positions.
In general, when a localization position of a sound image is moved from a present position to a new position, a parameter corresponding to this new position may be set to the sound image localization means. However, if the parameter is simply set to the sound image localization means while producing the signal, then discontinuous points are produced in the signal under production, which causes noise. To avoid this problem, this conventional sound image localization apparatus is equipped with first sound image localization means and second sound image localization means, and further means for weighting the output signals from the respective sound image localization means by way of the cross-fade system.
It is now assumed that the sound image is localized at the first position in response to the first localization signal derived from the first sound image localization means. When this sound image is moved to the second position, the weight of "1" is applied to the first localization signal derived from the first sound image localization means, and also the weight of "0" is applied to the sound localization signal derived from the second sound image localization means. Under these conditions, the parameter used to localize the sound image to the second position is set to the second sound image localization means. Since the second localization signal is weighted by "0", there is no possibility that noise is produced in the second localization signal when the parameter is set.
The weight of the first localization signal is gradually decreased from this state, and further the weight of the second localization signal is gradually increased. Then, after a predetermined time has elapsed, the weight to be applied to the first localization signal is set to "0", and the weight to be applied to the second localization signal is set to "1". As a result, moving of the sound image from the first position to the second position is completed without producing the noise.
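The cross-fade weighting described above may be sketched as follows. The function name, the linear ramp, and the parameter fade_samples (the assumed length of the transition, which must be greater than zero) are illustrative choices rather than details of the disclosed apparatus.

```python
def crossfade(first, second, fade_samples):
    """Mix two localization signals while moving the weight from the first to the second.

    The first signal starts with weight 1 and ends with weight 0;
    the second signal starts with weight 0 and ends with weight 1.
    """
    out = []
    for n, (a, b) in enumerate(zip(first, second)):
        w2 = min(n / fade_samples, 1.0)   # weight of the second localization signal
        w1 = 1.0 - w2                     # weight of the first localization signal
        out.append(w1 * a + w2 * b)
    return out
```

Because the two weights always sum to 1 in this sketch, identical input signals pass through unchanged, and no abrupt step appears in the output when the parameters of the second means are set while its weight is 0. Both localization signals must, however, be computed for every sample during the transition.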
The above-described sound image moving process is normally carried out by employing, for example, a DSP. In this case, the digital input signal is entered into the first and second sound image localization means every sampling time period. As a result, this DSP must finish processing each digital sample within a single sampling time period. For example, if the input signal is sampled at a frequency of 48 kHz, the sampling time period is approximately 21 microseconds. Therefore, this DSP must perform the following processing every approximately 21 microseconds: the first localization signal is produced and weighted, and the second localization signal is produced and weighted. Consequently, there is another problem that a high-cost DSP operable at high speed is required in this conventional sound image localization apparatus.
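The sampling time period quoted above follows directly from the sampling frequency, as the following short calculation illustrates. The function name is chosen here for illustration only.

```python
def sampling_period_us(sample_rate_hz):
    """Return the time available per sample, in microseconds."""
    return 1_000_000.0 / sample_rate_hz

# At 48 kHz the period is 1 / 48000 s, i.e. roughly 20.8 microseconds.
period_48k = sampling_period_us(48_000)
```

All filtering and weighting for both sound image localization means must fit within this interval for each input sample.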