Field of Invention
This invention relates generally to stereophonic reproduction, and more particularly to an acoustic projection system in which a reproducer array operating in conjunction with a signal processor coupled to the stereophonic signal channels acoustically radiates sounds in a manner exposing the listener, almost without regard to his position relative to the array, to a sound-image substantially recreating the three-dimensional ambience of the original sound source.
The principles underlying stereophonic sound are comparable to those involved in stereovision, for just as a scene is perceived by two spatially-separated eyes to produce a disparity in the resultant retinal images, sound from a source is sensed by spaced-apart ears, as a result of which the sounds entering the ears differ somewhat in their intensity, time of arrival and phase, thereby giving rise to disparate aural impressions.
While it is possible to create stereoscopic images faithfully, the term "stereophonic" as applied to modern sound reproduction is a misnomer, despite the high level of technological sophistication involved, for the reproduced sound is lacking in depth or presence and is not truly three-dimensional. In order, therefore, to understand the deficiencies of present-day stereophonic reproduction, one must first analyze the essential nature of stereovision.
The British scientist, Charles Wheatstone, in a paper published in 1838, not only set forth the general theory of binocular vision, but in the same paper suggested a method of devising a stereoscope. As Wheatstone pointed out, when a single eye is directed toward an object such as a sphere of a given size, the brain may then interpret the retinal image of this object either as one that is large but far away, or one that is near but small. When, however, the observer looks at this object with both eyes, it becomes fixed in direction, size, shape and distance, the object then being perceived in three dimensions. Depth is therefore an indispensable component of stereovision.
The axes of the two eyes of the observer form an angle known as the ocular parallax, the eyes seeing slightly different images. Yet the disparity in the two retinal images does not result in blurring or confusion, for the brain, in a manner not fully understood, compares and combines the two separate sets of sensations to gain a three-dimensional perception of the object. Stereoscopic depth judgments are remarkably precise, consistent with the finding that all but the smallest eye movements and accommodative adjustments are highly correlated between the two eyes, and the neural units in the visual brain are precisely connected by way of intermediate synapses to optically corresponding areas.
In a stereoscopic camera, use is made of two lenses separated by 6.5 cm, this being the mean interocular distance. In stereoscopic motion pictures, the separate pictures recorded with a two-lens camera are cast on the same screen through projection lenses covered by a pair of Polaroid discs whose axes are perpendicular to one another, so that the projected pictures are polarized differently. The viewer looks at this screen through glasses fitted with a second set of Polaroid discs, also oriented with their axes at right angles. Thus one eye of the viewer sees one "flat" picture and the other a second "flat" picture. Simultaneous perception of the two different flat pictures by the eyes yields the desired depth or three-dimensional visual effect.
For a stereoscopic system to function at all, it is vital that there exist mutually exclusive eye images of the two "flat" pictures. Whether such exclusion is effected by geometric optics, color optics or polarization, visual intermingling of the two pictures must scrupulously be avoided.
In listening with both ears to an orchestra whose instrumentalists are dispersed, the brain sorts out differences in the sounds picked up by the ears so that the listener not only senses the directions from which the various sounds come, but also the relative positions of the instrumentalists. Hence normal hearing as well as seeing involves the sensation of perspective or depth.
The ability to localize the direction of a sound source depends to a degree on the facility of the human auditory system to recognize differences in loudness or sound intensity, as well as differences in phase and in the quality of sounds when they are complex. Localization of sounds by the auditory system is also a function of sound frequency. Thus the ability of the human auditory system to localize the direction of sound produced by loudspeakers is relatively poor in the low end of the audio range (i.e., up to 400 Hz). Between 400 and 1000 Hz, sound sources are localized by detecting phase differences in the sounds reaching the ears; while above 1000 Hz, one depends mainly on differences in the loudness of the sounds reaching the ears (See Electronic Engineers' Handbook--D. G. Fink--McGraw-Hill, 1st Edition).
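The frequency dependence just described can be summarized in a short sketch. The function name and the treatment of the exact boundary values (400 Hz and 1000 Hz) are assumptions made for illustration only; the text gives the bands but does not say to which band each boundary belongs.

```python
def dominant_localization_cue(frequency_hz: float) -> str:
    """Return the localization cue the text associates with a frequency.

    Band edges follow the text: up to 400 Hz, 400 to 1000 Hz, above 1000 Hz.
    Assigning exactly 1000 Hz to the phase band is an assumption.
    """
    if frequency_hz <= 0:
        raise ValueError("frequency must be positive")
    if frequency_hz <= 400.0:
        # Low end of the audio range: localization is relatively poor.
        return "poor localization"
    if frequency_hz <= 1000.0:
        # Mid band: the ears detect phase differences.
        return "phase difference"
    # Upper band: the ears rely mainly on loudness differences.
    return "loudness difference"


if __name__ == "__main__":
    for f in (100.0, 700.0, 5000.0):
        print(f, "Hz:", dominant_localization_cue(f))
```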
If, therefore, one listens to a live radio broadcast or to a recording of an orchestra through a single loudspeaker, all sounds will then issue from the direction of the speaker regardless of the placement of the microphone or microphones picking up the original orchestra sounds. Consequently, the reproduced sounds will be altogether lacking in directionality, to say nothing of depth.
Stereophony is generally thought to be a recent technological innovation. But the fact is that stereophony can be traced back to Clement Ader, who in 1881 demonstrated at an International Exhibition of Electricity in Paris the first binaural telephone-linked system, in which a performance at the Paris opera was transmitted over two channels to listeners at the exhibition. (See J. Audio Eng. Soc.--May, 1981--"100 Years with Stereo--The Beginning").
In a typical modern stereophonic system for broadcasting or recording an orchestra, two microphones are placed at left and right sites relative to the sound source, and the microphone signals are conveyed through left and right channels to maintain a separation therebetween. By reproducing the channel outputs through two loudspeakers placed reasonably far apart at left and right positions, the listener, if properly situated relative to the speakers, will interpret the speaker sounds as coming from directions that depend on the position of the orchestral instruments relative to the two microphones.
Throughout this specification, an original source such as an orchestra will be divided into a middle or central zone flanked by left and right zones. Sounds emanating from these zones are three-dimensional, for the instrumentalists are not positioned along a common line but are dispersed, and the sounds heard include those reflected from the surfaces of the performing chamber as well as those arriving directly from the players. The left and right microphones are adjacent the left and right zones and therefore are more immediately responsive to the sounds therefrom than to sounds which both microphones later pick up from the central zone. By "more immediately responsive" is meant that the sounds from the left and right zones reach the microphones sooner than the sounds from the central zone.
In conventional two-channel reproduction, the sound images are localized between the left and right loudspeakers. One problem often encountered with this arrangement is the so-called "hole-in-the-middle" effect, by which is meant that while the listener hears sound which appears to come from the left and right zones of the orchestra, the body thereof or the central zone appears to be vacant. To overcome this effect, it is sometimes the practice to insert a mixing bridge between the two amplifier channels feeding the loudspeakers. And while this fills in the hole, it does so at the expense of directionality.
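The trade-off between filling the hole and preserving directionality can be illustrated with a simple numerical sketch. The text does not specify the circuit of the mixing bridge, so the symmetric cross-blend below, the blend fraction `k`, and both function names are assumptions chosen only to show the effect.

```python
import math


def mixing_bridge(left: float, right: float, k: float) -> tuple[float, float]:
    """Blend a fraction k of each channel into the other (hypothetical model).

    k = 0.0 leaves the channels untouched; k = 0.5 makes them identical.
    """
    if not 0.0 <= k <= 0.5:
        raise ValueError("k must lie between 0.0 and 0.5")
    out_left = (1.0 - k) * left + k * right
    out_right = (1.0 - k) * right + k * left
    return out_left, out_right


def separation_db(k: float) -> float:
    """Channel separation in dB for a signal present in one channel only."""
    out_left, out_right = mixing_bridge(1.0, 0.0, k)
    if out_right == 0.0:
        return math.inf  # no bridge: the channels remain fully separate
    return 20.0 * math.log10(out_left / out_right)


if __name__ == "__main__":
    for k in (0.0, 0.1, 0.25, 0.5):
        print(f"k={k}: separation = {separation_db(k):.1f} dB")
```

Under this model, raising `k` places more common signal in both loudspeakers, which fills the central zone, while the separation figure falls toward 0 dB, at which point the two channels carry the same signal and no directional information remains.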
The illusion of dimensionality obtained with conventional two-speaker stereophonic reproduction is a far cry from an authentic stereophonic experience, if by stereophonic one means the creation of a sound-image analogous to stereovision, in which the sounds are three-dimensional and do not give a "flat" aural impression. In its existing form, two-channel reproduction affords no genuine sense of depth, for sounds emanating from the right and left speakers are intermingled in the listening environment and are heard by both ears.
Ideally, the listening post with a two-speaker arrangement should be equidistant from the left and right speakers, in which event the left and right ears and the left and right speakers are in a symmetrical pattern. Though the ears are then sensitive to the direction of the sound and there is an increased clarity of inner melodic voices, because of environmental mixing there is no genuine sensation of depth. And when the listener moves away from this ideal listening post, the directional effect is impaired.
In the article "Controlling Sound-Image Localization in Stereophonic Reproduction" Sakamao, et al.--(J. Audio Eng. Soc.--November 1981) there is disclosed a system for controlling sound image localization in all directions around a listener by inserting in the signal channels to the loudspeakers, two networks, one of which produces a ratio of the signals between the channels and the other a signal common to both channels. But this arrangement does not impart depth to the sound image.
The section "Stereophony" in the Encyclopedia of Physics--R. M. Beancon--Reinhold Publishing, describes a conventional stereophonic system having left and right speakers. It is noted in this section that "the ability of the listener to determine the apparent positions of different musical instruments is a less important part of the stereo effect". Yet what this text treats as of lesser importance, actually is the crucial feature of a true stereo effect, for when the sound-image received by the listener gives no aural impression of depth and the apparent position of the different musical instruments cannot be sensed, this is no better than a "flat" picture.
Another significant factor that existing stereo systems fail to take into account is the distinction between what a listener hears at a live concert without the intervention of electronics, and what he hears in a stereo-reproduced situation. At the live concert, the listener, who we shall assume is seated at about the center of the auditorium with the orchestra playing on stage, will hear sounds from the middle zone before hearing sounds from the left and right zones, for the distance between the listener and the middle zone is shorter. But with a stereo system in which microphones are set up at left and right sites in the same auditorium, with correspondingly-positioned loudspeakers in the listening chamber, the sonic situation is reversed, for the reproduced sounds first heard by the listener are those originating closest to the microphones, these left and right zone sounds being succeeded by those originating from other points.
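The reversal of arrival order described above can be checked with a small worked example. The stage geometry, the coordinates, and the speed of sound (343 m/s, a typical room-temperature value) are assumptions introduced for illustration; none of these figures appears in the text.

```python
import math

SPEED_OF_SOUND_M_PER_S = 343.0  # assumed room-temperature value

# Hypothetical layout in metres: three zones along the stage front (y = 0).
ZONES = {"left": (-8.0, 0.0), "middle": (0.0, 0.0), "right": (8.0, 0.0)}
LISTENER = (0.0, 15.0)     # seated on the center line of the auditorium
LEFT_MIC = (-8.0, 2.0)     # placed adjacent the left zone
RIGHT_MIC = (8.0, 2.0)     # placed adjacent the right zone


def arrival_times(receiver, zones=ZONES):
    """Direct-path travel time in seconds from each zone to the receiver."""
    return {
        name: math.dist(receiver, position) / SPEED_OF_SOUND_M_PER_S
        for name, position in zones.items()
    }


def first_heard(times):
    """Name of the zone whose sound reaches the receiver first."""
    return min(times, key=times.get)


if __name__ == "__main__":
    for label, point in (("listener", LISTENER),
                         ("left mic", LEFT_MIC),
                         ("right mic", RIGHT_MIC)):
        times = arrival_times(point)
        print(label, "hears first:", first_heard(times))
```

With this layout the live listener receives the middle zone first, whereas each microphone receives its own adjacent zone first, which is the reversed ordering that the loudspeakers then reproduce.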
The problem of electronically recreating realistic sound is aggravated by the manner in which many recordings are currently made. A small orchestra or band is usually recorded in a relatively small studio lacking in ambience. A separate microphone is assigned to each instrument, each microphone signal being recorded on a separate magnetic track. Thus a ten-piece band is recorded in the studio on ten distinct tracks. In transferring this multitrack recording to a two-channel tape, equalizers and mixers are used to create a recording in which the instruments are balanced and properly placed with respect to left and right of center. The net result of the multitrack recording technique is entirely synthetic in sonic terms, for despite the complexity of the electronic paraphernalia involved, there is a total loss of the essential timing, amplitude and spatial clues present in a live listening experience.