1. Field of the Invention
The present invention relates to an apparatus for and a method of processing an audio signal, for use with video game machines, personal computers and the like, in which a sound image of a sound source signal is virtually localized.
2. Description of the Related Art
In general, when virtual reality is realized with sound, a method is known in which a monaural audio signal is subjected to suitable signal processing such as filtering, so that, using only two speakers, a sound image can be localized for a listener not only between the two speakers but also at any position in three-dimensional space.
When a monaural audio signal is filtered appropriately, based on the transfer functions (HRTF: Head Related Transfer Function) from the position at which a sound image of the inputted monaural audio signal is to be localized to the listener's ears and on the transfer functions from a pair of speakers located in front of the listener to the listener's ears, the sound image can be localized at places other than the positions of the pair of speakers, such as behind or beside the listener. In the specification of the present invention, this technique will be referred to as “virtual sound image localization”. The reproducing devices may be speakers, or headphones or earphones worn by the listener. When a listener listens through headphones to reproduced sounds of an audio signal which has not been subjected to this signal processing, so-called “in-head localization” of the reproduced sound image occurs. If the above processing is effected on the audio signal, the reproduced sound image provides “out-head localization” similar to the sound image localization obtained with speakers. Moreover, it becomes possible to localize a sound image at an arbitrary position around the listener, as in virtual sound image localization with speakers.
Although the contents of the signal processing differ slightly depending on the reproducing device, the resulting outputs are in each case a pair of audio signals (stereo audio signals). When these stereo audio signals are reproduced by a pair of appropriate transducers (speakers or headphones), a sound image can be localized at an arbitrary position. Of course, the inputted signal is not limited to a monaural audio signal. As will be described later on, a plurality of sound source signals may be filtered in accordance with their respective localization positions and added together, so that sound images can be localized at arbitrary positions.
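By way of illustration only, the filtering and addition described above can be sketched for the headphone case as follows. This is a minimal sketch under stated assumptions, not the processing of any particular apparatus: the function names `convolve`, `localize` and `mix` are hypothetical, the impulse responses are short placeholder values rather than measured HRTF data, and the additional processing needed for speaker reproduction (accounting for the transfer functions from the speakers to the ears) is omitted.

```python
def convolve(signal, impulse_response):
    """Direct-form FIR convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out


def localize(mono, hrir_left, hrir_right):
    """Filter one monaural source with a left/right impulse response pair.

    hrir_left and hrir_right stand in for the time-domain form of the HRTFs
    for the desired position (assumed to be supplied by the caller).
    """
    return convolve(mono, hrir_left), convolve(mono, hrir_right)


def mix(stereo_signals):
    """Add several (left, right) pairs sample by sample into one stereo pair."""
    length = max(max(len(left), len(right)) for left, right in stereo_signals)
    left_sum = [0.0] * length
    right_sum = [0.0] * length
    for left, right in stereo_signals:
        for i, value in enumerate(left):
            left_sum[i] += value
        for i, value in enumerate(right):
            right_sum[i] += value
    return left_sum, right_sum


# Two toy sources, each with its own placeholder impulse response pair.
source_a = localize([1.0, 0.0], [0.5, 0.25], [0.25])
source_b = localize([0.0, 1.0], [0.25], [0.5, 0.25])
stereo_left, stereo_right = mix([source_a, source_b])
```

Each source requires its own pair of filters, while the mixing stage is a simple sample-wise addition; the output is always one stereo pair regardless of the number of sources.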
Furthermore, when multi-channel speakers are located around the listener and sound source signals are properly assigned to these channels, desired sound images can be localized.
On the other hand, a method is also known in which images and sound images are localized by using the above technique while the user operates the reproducing device.
As the throughput of recent processors has improved, and as producers seek to reproduce more complex and more realistic virtual reality, the processing itself has become increasingly advanced and complex.
Since the virtual sound image localization method underlying the above technology assumes that an original monaural sound signal is a point sound source, when the producer intends to localize a set of sound sources with a complex arrangement, or to express near the listener a sound source so large that it cannot be reproduced by a single point sound source, the set of sound sources is divided and held beforehand as a plurality of point sound sources T1, T2, T3, T4, and these point sound sources are virtually localized separately. Then, as shown in FIG. 1, a sound signal is produced by effecting synthesizing processing such as mixing on these point sound sources.
Let us assume, for example, a set of sound sources composed of four point sound sources T1, T2, T3, T4 as shown in FIG. 2. When the position of this set is moved or rotated, virtual sound images of all of the point sound sources T1, T2, T3, T4 are localized, and sound images are localized for a listener M at the positions shown by T11, T21, T31, T41.
When the positional relationships among the respective sound sources constituting this set are changed, virtual sound images of all of the point sound sources T1, T2, T3, T4 are similarly localized, whereby sound images are localized for the listener M at the positions shown by T12, T22, T32, T42 in FIG. 2.
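The movement and rotation of such a set can be illustrated with the following sketch, which recomputes the position of every point sound source and then derives the direction and distance at which each one would have to be localized. It is a hedged, two-dimensional example: the coordinates given for T1 to T4, the function names, and the convention that the listener M sits at the origin facing the +y direction (azimuth measured clockwise from the front) are all assumptions made for illustration and are not taken from FIG. 2.

```python
import math


def transform_set(points, angle_deg, offset):
    """Rotate points counterclockwise about their centroid, then translate."""
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    angle = math.radians(angle_deg)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    moved = []
    for x, y in points:
        dx, dy = x - cx, y - cy
        moved.append((cx + dx * cos_a - dy * sin_a + offset[0],
                      cy + dx * sin_a + dy * cos_a + offset[1]))
    return moved


def azimuth_and_distance(point):
    """Direction (degrees, clockwise from front) and distance from the listener.

    Assumes the listener is at the origin and faces the +y direction.
    """
    x, y = point
    return math.degrees(math.atan2(x, y)), math.hypot(x, y)


# Hypothetical positions of T1..T4, in metres, in front of the listener.
sources = [(-1.0, 2.0), (1.0, 2.0), (1.0, 4.0), (-1.0, 4.0)]

# Rotate the whole set by 90 degrees and shift it 0.5 m to the right.
moved_sources = transform_set(sources, 90.0, (0.5, 0.0))

# Every point source needs its own localization parameters after the move.
localization_params = [azimuth_and_distance(p) for p in moved_sources]
```

Note that a single movement of the set results in as many new localization directions as there are point sound sources, each of which must then be filtered separately.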
However, according to the above method, as virtual sound image localization of a sound source object to be realized (a sound source having position information and the like) becomes more complex and the number of point sound sources increases, the amount of signal processing becomes so large that it crowds out other processing, or it exceeds the allowable signal processing amount so that the audio signal processing apparatus becomes unable to reproduce the audio signal.
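The growth in processing amount can be made concrete with a rough estimate. The sketch below assumes, purely for illustration, that each point sound source is filtered by one direct-form FIR filter per ear, so that the cost in multiply-accumulate operations grows linearly with the number of sources; the filter length, sample rate and processing budget used in the example are hypothetical figures, not values taken from this specification.

```python
def macs_per_second(num_sources, taps, sample_rate_hz, ears=2):
    """Multiply-accumulates per second for direct-form FIR localization."""
    return num_sources * ears * taps * sample_rate_hz


def fits_budget(num_sources, taps, sample_rate_hz, budget_macs_per_second):
    """True if the localization cost stays within the available budget."""
    return macs_per_second(num_sources, taps, sample_rate_hz) <= budget_macs_per_second


# Hypothetical figures: 128-tap filters at 48 kHz, budget of 100 million MACs/s.
TAPS = 128
SAMPLE_RATE_HZ = 48_000
BUDGET = 100_000_000

cost_four_sources = macs_per_second(4, TAPS, SAMPLE_RATE_HZ)
four_sources_ok = fits_budget(4, TAPS, SAMPLE_RATE_HZ, BUDGET)
sixteen_sources_ok = fits_budget(16, TAPS, SAMPLE_RATE_HZ, BUDGET)
```

Under these assumed figures, four point sound sources fit within the budget while sixteen do not, which mirrors the situation described above in which an increase in the number of point sound sources exceeds the allowable signal processing amount.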