1. Field of the Invention
The invention relates to circuits and methods for processing binaural signals, and more particularly to a method and apparatus for converting a plurality of signals having no localization information into binaural signals, and further, for providing selective shifting of the localization position of the sound.
2. Description of the Prior Art
Human beings are capable of detecting and localizing sound source origins in three-dimensional space by means of their binaural sound localization ability. Although binaural sound localization provides orders of magnitude less information in terms of absolute three-dimensional dissemination and resolution than the human binocular sensory system, it does possess unique advantages in terms of complete, three-dimensional, spherical, spatial orientation perception and associated environmental cognition. Observing a blind individual take advantage of his environmental cognition through the complex, three-dimensional spatial perception constructed by means of his binaural sound localization system, is convincing evidence in terms of exploiting the sensory pathway in order to construct an artificial, sensory-enhanced, three-dimensional auditory display system.
The most common form of sound display technology employed today is known as stereophonic or "stereo" technology. Stereo was an attempt at providing sound localization display, whether real or artificial, by utilizing only one of the many binaural cues needed for human binaural sound localization--interaural amplitude differences. Simply stated, by providing the human listener with a coherent sound independently reproduced on each side of the head, be it by loudspeakers or headphones, any amplitude difference, artificially or naturally generated between the two sides, will tend to shift the perception of the sound towards the dominantly reproduced side.
Unfortunately, the creators of stereo failed to understand basic human binaural sound localization "rules" and stereo fell far short of meeting the needs of the two eared system in providing artificial cuing to the listener's brain in an attempt to fool it into believing it is hearing three dimensional location of sounds. Stereo more often is denoted as producing "a wall of sound" spread laterally in front of the listener, rather than a three-dimensional sound display or reproduction.
A theoretical improvement on the stereo system is the quadraphonic sound system which places the listener in the center of four loudspeakers: two to the left and right in front, and two to the left and right in back. At best, "quad" provides an enhanced sensation over stereo technology by creating an illusion to the listener of being "surrounded by sound." Other practical disadvantages of "quad" over the present invention are the increased information transmission, storage and reproduction capabilities needed for a four channel system rather than the two required in stereo or the two channels required by the technologies of this invention.
Many attempts have been made at creating more meaningful illusions of sound positioning by increasing the number of loudspeakers and discrete locations of sound emanation--the theory being, the more points of sound emanation the more accurately the sound source can be "placed." Unfortunately, again this has no bearing on the needs of the listener's natural auditory system in disseminating correct localization information.
In order to reduce the transmission and storage costs of multiple loudspeaker reproduction, a number of technologies have been created in order to matrix or "fold in" a number of channels of sound into fewer channels. Among others, a very popular cinema sound system in current use utilizes this approach, again failing to provide true three-dimensional sound display for the reasons previously discussed.
Because of the practical considerations of cost and complexity of multiple loudspeaker displays, the number of discrete channels is usually limited. Therefore, compromise is further induced in such displays until the point is reached that for all practical purposes the gains in sound localization perception are not much beyond "quad." Most often, the net result is the creation of "surround sound" illusions such as are employed in the cinema industry.
Another form of sound enhancement technology available to the end user and claiming to provide "three-dimensionality and spatial enhancement," etc. is in delay line and artificial reverberation units. These units, as a norm, take a conventional stereo source and either delay or provide reverberation effects which are reproduced primarily from the rear of the listener over an additional pair (or pairs) of loudspeakers, the claimed advantage being that of placing the listener "within the concert hall."
Although sound enhancement technologies do construct some form of environmental ambience for the listener, they fall far short of the capability of three-dimensionally displaying the primary sounds so as to binaurally cue the listener's brain.
A good method of providing true, three-dimensional sound recordings and reproduction from within an acoustical environment is via binaural recording; a technique which has been known for over fifty years. Binaural recording utilizes a two channel microphone array that is contained within the shell of an anthropometric mannequin. The microphones are attached to artificial ears that mimic in every way the acoustic characteristics of the human external auditory system. Very often, the artificial ears are made from direct ear molds of natural human ears. If the anthropometric model is exactly analogous to the natural external auditory system in its function of generating binaural localization cues, then the "perception" and complex binaural image so generated can be reproduced to an listener from the output of the microphones mimicking the eardrums. The binaural image constructed by the anthropometric model, when reproduced to an listener by means of headphones and, to a lesser extent, over loudspeakers, will create the perception of three-dimensionality as heard not by the listener's own ears but by those of the anthropometric model.
There are three major shortcomings of binaural recording technology:
(a) The binaural recording technology requires that the audio signals be airborne acoustical sounds that impinge upon the anthropometric model at the exact angle, depth and acoustic environment that is to be perceived relative to the model. In other words, binaural recording technology documents the dimensionality of sound sources from within existing acoustical environments.
(b) Second, binaural recording technology is dependent upon the sound transform characteristics of the human ear model utilized. For example, often it is hard for an listener to readily localize a sound source as in front or behind--there is front-to-back localization confusion. On the binaural recording array, the size and protuberance of the ears' pinna flange have a lot to do with the cuing transfer of front-to-back perception. It is very difficult to enhance the pinna effects without causing physical changes to the anthropometric model. Even if such changes are made, the front-to-back cue would be enhanced at the expense of the rest of the cuing relations.
(c) Third, binaural recording arrays are incapable of mimicking the listener's head motion utilized in the binaural localization process. Head motion by the listener is known to increase the capabilities of the sound localization system in terms of ease of localization, as well as absolute accuracy. The advantages of head motion in the sound localization task are gained by the "servo feedback" provided to the auditory system in the controlled head motion. The listener's head motion creates changes in binaural perception that disseminate additional layers of information regarding sound source position and the observed acoustical environment.
In general, binaural recording is incapable of being adapted for practical display systems--a display in which the sound source position and environmental acoustics are artificially generated and under control.