1. Field of the Invention
The present invention relates to three-dimensional sound processing systems, and more specifically, to a three-dimensional sound processing system which provides a listener with three-dimensional sound effects by reproducing a sound image properly positioned in a reproduced sound field.
2. Description of the Related Art
To precisely recreate sound images, or to achieve accurate acoustic image positioning, it is necessary in general for sound processing systems to acquire acoustic characteristics both in the original sound field, where original sound signals are recorded, and in a reproduced sound field reproduced from the recorded sound signals. The characteristics of an original sound field are expressed by what is known as a head-related transfer function (HRTF), which represents relationships between sound signals produced by a sound source and those heard by a listener. The reproduced sound field involves some audio output devices such as speakers and headphones, which have some specific acoustic characteristics. Those characteristics of the original and reproduced sound fields are measured in advance with an appropriate procedure and programmed into the sound processing systems.
When outputting the recorded source sound signals in the reproduced sound field, the sound processing system adds the acoustic characteristics measured in the original sound field to those source sound signals. The system also subtracts, in advance, the acoustic characteristics of the reproduced sound field from the source sound signals. Using speakers or headphones, listeners can hear the processed sound, where the recreated sound images are positioned right at the sound source locations in the original sound field.
FIG. 14 shows an example of an original sound field, in which a single sound source (S) 101 and a listener 102 are involved. As shown in FIG. 14, there are two spatial sound paths from the sound source (S) 101, one to the tympanic membrane of each of the left (L) and right (R) ears of the listener 102, whose acoustic characteristics are expressed by their respective head-related transfer functions S.sub.L and S.sub.R.
FIG. 15 shows an example of a reproduced sound field which is produced by a conventional sound processing system using a headphone consisting of a pair of earphones. Two filters 103 and 104 with a transfer function (S.sub.L, S.sub.R) will add to the entered sound signals some acoustic characteristics concerning the sound paths from the sound source 101 to the listener 102, which are previously measured in the original sound field. The other two filters 105 and 106, on the other hand, will subtract from the sound signals the acoustic characteristics of sound paths from earphones 107a and 107b to both ears of a listener 108, which are represented by a transfer function (h, h). Thus the filters 105 and 106 have the inverse transfer function of (h, h), namely, (h.sup.-1, h.sup.-1).
Input signals, carrying sound information identical to the original sound from the sound source 101, are separated into the left and right channels and fed to the above-described filters 103-106. A sound image 109 reproduced by the earphones 107a and 107b will sound to the listener 108 as if it were placed at exactly the same location as the sound source 101 shown in FIG. 14.
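The signal chain of FIG. 15 can be illustrated with a minimal sketch, in which each channel is filtered first by the original-field response (S.sub.L or S.sub.R) and then by the inverse earphone response (h.sup.-1). The function names and the toy impulse responses below are hypothetical values chosen only for illustration; they are not measured data, and the inverse earphone response is assumed to be an identity (an ideal earphone).

```python
def convolve(signal, taps):
    """Linear convolution: out[n] = sum over k of taps[k] * signal[n - k]."""
    out = [0.0] * (len(signal) + len(taps) - 1)
    for n, x in enumerate(signal):
        for k, a in enumerate(taps):
            out[n + k] += a * x
    return out


def render_headphone(source, s_left, s_right, h_inv):
    """Apply S_L / S_R (filters 103, 104), then h^-1 (filters 105, 106)."""
    left = convolve(convolve(source, s_left), h_inv)
    right = convolve(convolve(source, s_right), h_inv)
    return left, right


# Hypothetical toy responses (assumptions, not measurements).
S_L = [1.0, 0.5]        # path to the left ear: direct and early
S_R = [0.0, 0.8, 0.4]   # path to the right ear: delayed and attenuated
H_INV = [1.0]           # identity, i.e. an ideal earphone is assumed

left, right = render_headphone([1.0, 0.0, 0.0], S_L, S_R, H_INV)
```

With these toy values the right channel arrives one sample later and at a lower level than the left channel, which is the kind of interaural difference that places a sound image toward the left of the listener.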
The filters 103-106 are implemented as finite impulse response (FIR) filters, each comprising, as shown in FIG. 16, a plurality of delay units (Z.sup.-1) 110-112 each made up of several flip-flops or the like, a plurality of multipliers 113-116, a summation unit 117, and an adder 118. Multiplier coefficients a0-an given to the respective multipliers 113-116 are obtained from the acoustic characteristics, or impulse response, of each spatial sound path. To obtain the coefficients for the filters (S.sub.L, S.sub.R) 103 and 104, the impulse responses should be measured for the two spatial sound paths in the original sound field as illustrated in FIG. 14. To determine the coefficients for the FIR filters (h.sup.-1, h.sup.-1) 105 and 106, it is necessary to measure the impulse responses of the two spatial sound paths from the earphones 107a and 107b to both tympanic membranes of the listener 108, and then to compute their respective inverse responses. More specifically, the measured impulse responses are transformed into the frequency domain, where their respective inverse functions are calculated. The calculated inverse functions are then converted back into the time domain to yield the filter coefficients.
Such conventional three-dimensional sound processing systems, however, have some shortcomings in their ability to position the sound image, as explained below.
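The inverse-filter procedure described above (transform to the frequency domain, invert, transform back) can be sketched as follows. This is an illustrative sketch only: the helper names, the 32-tap length, and the toy response `h` are assumptions, and the sketch further assumes that no frequency bin of the response is zero. A measured earphone response would generally need additional care, such as regularization, before it could be inverted this way.

```python
import cmath


def dft(x):
    """Discrete Fourier transform (direct O(n^2) form, for clarity)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Inverse discrete Fourier transform."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]


def inverse_fir(h, n_taps):
    """Zero-pad h, invert every frequency bin, return to the time domain."""
    padded = list(h) + [0.0] * (n_taps - len(h))
    inverse_spectrum = [1.0 / v for v in dft(padded)]  # assumes no zero bin
    return [c.real for c in idft(inverse_spectrum)]


def fir(signal, coeffs):
    """Tapped delay line as in FIG. 16: y[n] = a0*x[n] + a1*x[n-1] + ..."""
    delay = [0.0] * len(coeffs)
    out = []
    for x in signal:
        delay = [x] + delay[:-1]  # shift the new sample into the delay units
        out.append(sum(a * d for a, d in zip(coeffs, delay)))
    return out


h = [1.0, 0.5]                 # hypothetical earphone-to-eardrum response
h_inv = inverse_fir(h, 32)     # coefficients for a filter such as 105 or 106

impulse = [1.0] + [0.0] * 31
equalized = fir(fir(impulse, h), h_inv)  # approximately a unit impulse
```

Passing an impulse through `h` and then through `h_inv` returns approximately a unit impulse, which indicates that the earphone path has been cancelled for this toy response.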
The human hearing system generally shows low sensitivity in locating a sound source in the vertical and front-to-rear directions, while exhibiting excellent ability in the side-to-side direction. Therefore, the listener would use visual information to locate a sound source in the front-to-rear direction or attempt to detect it by turning his/her head to the right or left to cause some difference in sound perception.
In the case where the listener is not in the original sound field but in a reproduced sound field, it is not possible to use visual information because there is no visual image of the original sound source. Even if the listener turns his/her head while wearing a headphone, it will cause no change in the acoustic characteristics of the reproduced sound field. Also, when speakers are used to recreate a sound field, the reproduced sound field is programmed assuming that a listener's head is oriented at a prescribed azimuth angle, and thus the rotation of his/her head will violate this assumption.
Therefore, in conventional three-dimensional sound processing systems, it is difficult to achieve effective positioning of a sound image in the front-to-rear direction with respect to a listener.
The applicant of the present invention proposed a three-dimensional sound processing system in Japanese Patent Application No. Hei 7-231705 (1995). According to this patent application, the system computes appropriate filter coefficients that approximately represent poles (or peaks) and zeros (or dips) in an amplitude spectrum as part of the frequency-domain representation of an impulse response measured in the original sound field. Using such coefficients, it is possible to form infinite impulse response (IIR) filters and FIR filters with fewer taps to add the acoustic characteristics of the original sound field to the reproduced sound field. This filter design technique will reduce the amount of data to be processed by the filters and also enable miniaturization of memory circuits required in the filters. The use of such reduced-tap filters, however, does not always provide sufficient sound image positioning capability in the front-to-rear direction.
Meanwhile, conventional sound processing systems adjust the amplitude and reverberation of sounds to control the distance perspective of a sound image. To adjust reverberation, the systems are equipped with FIR filters having coefficients corresponding to an impulse response representing reverberation. Those FIR filters, however, have to process a large amount of data, which consumes a lot of memory, in order to achieve a desired performance.
Conventional sound processing systems also vary the loudness and pitch of a sound to allow the listener to feel the motion of a sound image. They simulate the Doppler effect by appropriately controlling the pitch of the sound. That is, a raised pitch expresses a sound source that is coming close to the listener, while a lowered pitch represents a sound source that is leaving the listener. To change the pitch of the sound, conventional sound processing systems employ a ring buffer 119 as illustrated in FIG. 17, which provides a predetermined amount of memory to temporarily store the sound data. The ring buffer 119 is equipped with a write pointer to generate a new memory address at a constant operating rate, thereby writing sound data into consecutive memory addresses. The ring buffer 119 also has a read pointer to provide a memory address for reading out the sound data, whose operating rate is controlled according to the required pitch of the sound. That is, the read pointer must operate faster to obtain a higher pitch, and slower to yield a lower pitch, thus changing the frequency of a sound signal.
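The ring-buffer pitch control described above can be sketched as follows. The class name, the buffer size, and the use of linear interpolation between adjacent samples are assumptions made for illustration; the description of FIG. 17 states only that the write pointer advances at a constant rate while the rate of the read pointer is varied.

```python
class PitchRingBuffer:
    """Ring buffer with a constant-rate write pointer and a variable-rate
    read pointer. Linear interpolation is an assumption of this sketch."""

    def __init__(self, size):
        self.size = size
        self.data = [0.0] * size
        self.write_pos = 0    # advances by one per written sample
        self.read_pos = 0.0   # advances by `rate` per read sample

    def write(self, sample):
        self.data[self.write_pos % self.size] = sample
        self.write_pos += 1

    def read(self, rate):
        """Return one output sample; rate > 1 raises pitch, rate < 1 lowers it."""
        i = int(self.read_pos)
        frac = self.read_pos - i
        a = self.data[i % self.size]
        b = self.data[(i + 1) % self.size]
        self.read_pos += rate
        return a + frac * (b - a)


def play(rate, count):
    """Write a 16-sample ramp, then read `count` samples at the given rate."""
    buf = PitchRingBuffer(64)
    for n in range(16):
        buf.write(float(n))
    return [buf.read(rate) for _ in range(count)]


fast = play(2.0, 4)   # read pointer twice as fast: every other sample
slow = play(0.5, 4)   # read pointer half as fast: interpolated samples
```

Reading at twice the write rate skips every other stored sample, so a stored waveform is traversed in half the time and its frequency doubles. Reading at half the rate has the opposite effect.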
This ring buffer 119, however, has a potential problem of overflowing or underflowing. When the sound image is rapidly approaching the listener, the read pointer will move much faster than the write pointer, to create a higher pitch to simulate the Doppler effect. Similarly, when the sound image is rapidly leaving the listener, the read pointer will move much slower than the write pointer. As a result, the read pointer will overtake the write pointer, or vice versa. To prevent this from happening, the ring buffer 119 must have sufficient memory capacity, which increases the cost of sound processing systems.
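The relation between pointer rates and buffer capacity can be illustrated with a small simulation. The distance between the two pointers drifts by roughly |rate - 1| samples per written sample, so a sustained pitch change eventually exhausts any finite buffer. The function name, the initial lead of the write pointer, and the labels "underflow" and "overflow" are assumptions of this sketch, not terms defined for FIG. 17.

```python
def ticks_until_collision(size, lead, rate, max_ticks=1_000_000):
    """Step both pointers once per tick and report when they collide.

    size : ring buffer capacity in samples
    lead : samples by which the write pointer starts ahead (assumed prefill)
    rate : read pointer advance per tick (the write pointer advances by 1)
    """
    write, read = float(lead), 0.0
    for tick in range(1, max_ticks + 1):
        write += 1.0
        read += rate
        gap = write - read
        if gap <= 0.0:
            return tick, "underflow"   # read pointer caught the write pointer
        if gap > size:
            return tick, "overflow"    # write pointer lapped the read pointer
    return None, "ok"


approaching = ticks_until_collision(size=1000, lead=500, rate=1.5)
receding = ticks_until_collision(size=1000, lead=500, rate=0.5)
steady = ticks_until_collision(size=1000, lead=500, rate=1.0, max_ticks=10_000)
```

In this toy setting a 1000-sample buffer survives only about 1000 ticks of a sustained 50 percent pitch change in either direction. Doubling the capacity and the initial lead doubles that time, which is the cost trade-off noted above.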