1. Field of the Invention
The present invention relates to a method and apparatus for providing and controlling sound in a spatial environment.
2. Background Art
When hearing sounds, people perceive information not only about the sounds themselves, but also about the locations of the sound sources producing them. Often, however, systems for producing or reproducing sound cannot accurately convey the same senses of direction and position. Thus, a system is needed that can create the perception of sound sources being located in places other than those of the loudspeakers used to provide the sound, as well as the perception that the sound sources are moving with respect to the listener.
Humans derive much information about their environment from their sense of hearing. Even when one is deprived of other sensory inputs, one can obtain a good understanding of one's surroundings through hearing alone. One can sense direction, distance, amplitude, spectral content, and timing of sound sources. Such parameters are meaningful when they relate to an actual sound source that is physically located at a specific location.
However, it is often desirable to simulate sound sources and their locations. One example involves reproduction of sound. A recording may be made of an orchestra playing music. Since different instruments are located at different locations within the orchestra, it is important to convey the same sense of position during playback in order to achieve faithful reproduction.
Another example involves synthesis of sound that is intended to convey the same parameters that are sensed from actual sound sources. An attempt may be made to create the sense that a bumble bee or a helicopter is flying by the listener. Ideally, it would be possible to convey the sense that sound is originating from one or more locations around a listener. Since sound sources would be perceived to exist but would not require actual sound generating means to be at the same place the sound seems to be originating, such sound sources are referred to as apparent sound sources.
Initial attempts at synthesizing and reproducing sound were monophonic. Since sound was generated from a single loudspeaker located at a single location, the sound was perceived as coming from that single location, not the potentially many locations of the original sound sources. Thus, monophonic techniques lost all information relating to the position of the original sound sources.
To provide some sense of position, stereophonic reproduction was developed. For a stereophonic recording, two microphones were spaced horizontally and positioned in front of the original sound sources, for example, the instruments in an orchestra. Two separate recordings were made, one from each microphone. These two separate recordings were then played back over separate loudspeakers separated horizontally by some distance. When adjusted properly, the perceived locations of the individual sound sources would lie along a line segment between the two loudspeakers. As long as the locations of the original sound sources were along such a line segment, and assuming the two microphones had been positioned correctly, the positions of the perceived sound sources would roughly correspond to the positions of the original sound sources. However, actual sound sources are rarely constrained to such specific configurations. Thus, stereophonic techniques cannot accurately convey the position information associated with most configurations of original sound sources.
Attempts were also made to reproduce sound using quadraphonic techniques. Quadraphonic techniques record sound signals received from four microphones. The four sound signals are denoted left front, right front, left rear, and right rear. As with stereophonic techniques, two separate channels are used for the left front and right front sound signals. The left rear and right rear sound signals are modulated onto the left front and right front sound signals, respectively, and recorded on the same two channels used to record the left front and right front signals. By using only two recording channels, the quadraphonic techniques maintain backwards compatibility with stereophonic techniques. To reproduce sound, the two channels are played back, and the left rear and right rear signals are demodulated. The resulting four sound signals are fed to four loudspeakers to attempt to convey meaningful position information. However, quadraphonic recording techniques do not provide interactive control of the position information that the techniques are attempting to convey. There is no provision for interactive control in conjunction with synthesized images or computer-based game controls. Moreover, the requirement for four separate sound signals has traditionally hindered synthesis of sounds with controllable position information.
Another sound reproduction technique was developed to provide improved position perception without requiring more than the two (left and right) channels of traditional stereophonic techniques. The technique involves the use of three loudspeakers. Two loudspeakers are positioned in front of a listener and reproduce the left and right sound signals as described in relation to stereophonic techniques. A third loudspeaker is positioned behind the listener. A sound signal that is obtained either by subtracting the right sound signal from the left sound signal or by subtracting the left sound signal from the right sound signal is applied to the third loudspeaker. This sound signal, which represents the difference between the left and right sound signals, helps enhance position perception. As with quadraphonic techniques, this technique does not provide interactive control of the position information it attempts to convey. There is also no provision for interactive control in conjunction with synthesized images or computer-based game controls. Furthermore, additional circuitry is needed to generate the sound signal representing the difference of the left and right sound signals.
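The difference signal applied to the third loudspeaker can be illustrated with a minimal sketch. The sketch assumes the left and right sound signals are available as equal-length sequences of digital samples; the function name `rear_difference_signal` and the sample values are hypothetical and serve only as illustration, since the technique described above relies on circuitry rather than software.

```python
def rear_difference_signal(left, right):
    """Return the sample-by-sample difference (left minus right).

    Assumption: both signals are equal-length sequences of numeric samples.
    Swapping the arguments yields the right-minus-left alternative.
    """
    if len(left) != len(right):
        raise ValueError("left and right must have the same number of samples")
    return [l - r for l, r in zip(left, right)]


# Hypothetical sample values, chosen only for illustration.
left = [0.5, 0.25, -0.5]
right = [0.5, -0.25, 0.0]
rear = rear_difference_signal(left, right)
print(rear)  # [0.0, 0.5, -0.5]
```

In this sketch, the first sample is identical in both channels and therefore cancels in the difference signal, whereas content that differs between the channels remains.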
Another attempt at providing sound that creates the impression of surrounding the listener (i.e., surround sound) was made using the three loudspeakers of the above technique in combination with one additional loudspeaker. The additional loudspeaker was a center loudspeaker between the left and right loudspeakers. The center loudspeaker helps reinforce the perception of sound coming from directly in front of a listener. Otherwise, this technique suffers from the same disadvantages as the three loudspeaker technique. A variation on this technique split the left-minus-right difference signal between two loudspeakers positioned behind the listener. This arguably offered some improvement in position perception, but did not otherwise overcome the disadvantages of the previous technique. Another variation on this technique used frequency dependent masking to convey information. Frequency dependent masking alters the natural audio frequency spectrum and distorts the sound signals being reproduced.
Attempts have also been made to widen the perceived positioning of stereophonic techniques. Some of these attempts have filtered the sound signals and introduced delays in the sound signals. Most of these techniques suffer from the disadvantage that they cannot create the perception that sound is emanating from a source behind the listener. Even those that can arguably cause some sensation that the sound is behind the listener cannot overcome an ambiguity in perceived location for sounds intended to be perceived as originating directly in front of and directly behind the listener (i.e., at azimuth bearings of 0° and 180°). Such techniques generally have the disadvantage of requiring the listener to be perfectly positioned with respect to the loudspeakers in order to achieve the desired perception. Some of these techniques even require specific parameters of the anatomy of the listener's ears to be determined before attempting to provide accurate position perception. These techniques are listener dependent, do not support multiple simultaneous listeners, and must be reconfigured to accommodate different listeners.
Some audio processing equipment utilizes panning controls to set the balance between sound levels generated by multiple loudspeakers. For instance, stereophonic systems often use a panning potentiometer commonly referred to as a balance control to control the relative levels of left and right loudspeakers. Some systems provide a panning potentiometer commonly referred to as a fader to control the relative levels of front and rear loudspeakers. Some sound studio equipment combines these panning controls into a single control mechanism. Nevertheless, such controls do not allow individual control of perceived position information for multiple simultaneous sound sources. Furthermore, such controls do not provide dynamic interaction with graphic images or computer-based game controls. Also, the relationship between gain and potentiometer position is generally fixed, preventing accurate dynamic control of perceived position, particularly for sound signals of widely varying amplitudes.
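The balance and fader controls described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a description of any particular equipment: the linear gain law, the normalized 0.0 to 1.0 control range, and the names `pan_gains` and `four_speaker_gains` are all hypothetical, and actual potentiometers may follow other gain curves.

```python
def pan_gains(position):
    """Map a control position in [0.0, 1.0] to a pair of gains.

    Assumption: a fixed linear law, where 0.0 sends all level to the
    first loudspeaker group and 1.0 sends all level to the second.
    """
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must be between 0.0 and 1.0")
    return 1.0 - position, position


def four_speaker_gains(balance, fader):
    """Combine a balance (left/right) and a fader (front/rear) setting."""
    left, right = pan_gains(balance)
    front, rear = pan_gains(fader)
    return {
        "left_front": left * front,
        "right_front": right * front,
        "left_rear": left * rear,
        "right_rear": right * rear,
    }


# Both controls centered: every loudspeaker receives the same gain.
print(four_speaker_gains(0.5, 0.5))
```

In this sketch a single pair of control settings yields one set of gains that is applied to the entire signal, which is consistent with the observation that such controls do not position multiple simultaneous sound sources individually.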
Thus, disadvantages abound with prior art techniques for attempting to control the perceived position of sound sources.