In order to enable a better understanding of the invention and the differences between the invention and known systems, a brief description of various known sound reproduction techniques will first be given, as follows:
BINAURAL SOUND-In this reproduction technique, sound is recorded with two microphones positioned to simulate the positions of the ears of a human head, to thereby produce a plurality of signals. In order to preserve the binaural effect, during sound reproduction, the listener must wear a set of earphones that are spaced apart the same distance as the recording microphones. Both the amplitude and phase of the sound produced by the earphones are identical to those of the sound received by the recording microphones. This technique requires a closed circuit system and has the disadvantage that the listener must wear earphones.
MONAURAL SOUND-This sound reproduction technique is also a closed circuit technique, and is similar to the Binaural technique except that it uses only one recording channel. This technique is exemplified by conventional telephone systems.
MONOPHONIC SOUND-As in the case of Monaural sound, only one sound channel is provided in this technique. The system is not a closed system, however, and the reproduction device is in the form of one or more loudspeakers, each of which is energized to emit sound corresponding to the signals on the single channel.
STEREOPHONIC SOUND-This technique employs two (or more) channels, corresponding to sound received directly by microphones at two (or more) spaced apart locations. The optimal stereo recording arrays are known as "ORTF" miking, coincidence miking, near coincidence miking, spaced miking, "SASS" miking and "AMBIPHONIC" miking.
By recording with these techniques, we can "capture" the sounds being recorded in a fashion that better approximates how we hear sounds, while keeping enough differentiating and complementary information.
Another trend of the recording industry, with what is known as a multi (mono) miking and multi (mono) track recording process, is to artificially "locate" the sound of different instruments and sampled sounds by using "panoramic positioners" on the mixers used to feed a two track recording unit. The recording industry calls this a stereo technique, when in effect it should be distinguished as multi track directed mono recordings.
For optimum reproduction, the signals energize separate loudspeakers located at spaced geometrical positions ideally corresponding to the locations of the respective recording microphones' pickup arrays. In this technique, as well as in monophonic sound techniques, the acoustics of the recording location and the reproducing location both influence the sound that the user hears, with the result that the sound that is heard, although it should ideally be the same, is not the same as the sound originating from the recorded sound source. Typically about 90% of the sound that is heard in the environment in which the recording was made is reflected sound.
Due to these reflections, the direct musical waves (approximately 10%) give the precision of localization of the origins of the sound of the instruments (e.g. flutes, violins and percussion), while the reflected sounds (approximately 90%) give the ambience of the hall, the depth perception of the soundstage and the richness of the musical experience. The emotional musical experience of a listener who is in the environment in which the recording was made is due to a complex combination of these reflections of musical information. These are what allow the listener to perceive his environment.
Recall that the goal of high fidelity is to recreate the musical experience of being present at a concert (regardless of the type of music: jazz, classical, blues, etc.). The only way to accomplish this is to deliver all the possible sonic information to the brain, by way of our ears, via a sound reproduction system and, consequently, via the most capable recording processes and techniques.
Assuming that the recordings to be reproduced are capable of "capturing" all the information to be reproduced, the more the sound reproduction system (including the loudspeakers) is neutral, realistically dynamic and capable of rendering proper transients, the better it should be able to give the basic information that is essential to determine the spatial localization, i.e. the depth and width of the sound stage and even its height. (Our ears/brain combination is indeed capable of indicating whether a sound comes from above or below and also from what height it originates. However, it is unnecessary to pursue this subject further at this point, since it is not relevant to the ability of the present invention to allow listeners to perceive the information required to locate and hear sounds on the height plane as well.)
What exactly is the effect produced by the sum of this vital information? With his eyes closed, the listener who is relaxed and attentive should see himself "brought" to the location of the recording.
The problem is that both loudspeakers send (ideally with a great neutrality and quality) some complementary information in a less than complementary fashion.
The information sent by each loudspeaker has to be interdependent, at least from a listener's point of view, in order to recreate the spatial coherence and a realistic musical experience.
One type of conventional stereophonic system employs two identical loudspeakers that are positioned and energized to provide sound pressure and phase along the plane of symmetry between the two speakers that are the same as at the location of the microphones that were used to record the signal. The plane of symmetry is the central plane that is perpendicular to the line joining the two speakers. In such systems, when the user is not located at the plane of symmetry, the fundamental information is out of phase since the listener is not at a position that is equidistant from the two speakers, and the stereophonic effect is thereby absent.
The fundamental shortcoming of the current trends is the modification of the sounds and of the vital micro-information (which is in the harmonic domain) caused by their reflection off the walls, ceiling, furniture and other objects before their arrival at the ears of the listener. Also, these loudspeakers send the fundamental information out of phase relative to one another, because the listener is practically never equidistant from the speakers and also because of the reflections of sonic information off all the elements that are in his or her environment.
The result can be a beautiful sound, but it does not yet recreate in any way the musical experience that this same listener would perceive if he were situated at the recording location. The goal of high fidelity is therefore not yet achieved.
A good visual analogy is as follows: the colors of a picture are nicely distorted and oversaturated and/or undersaturated (depending on the observing point of view and the sampled part of a broad color spectrum), and the image is grossly out of focus and out of proportion.
Here, we must understand that the human ear locates the point of origin of the sounds that it receives thanks to the stereophonic perception of our two ears combined. A sound generated from point "x" (see FIG. 1A) will be perceived by both ears of listener "y". If the point "x" is located directly ahead of him, the brain will not register a difference in arrival time between the right ear and the left ear, because the sound arrives at both ears at the same time. Thanks to this, the brain knows that the sound comes from ahead. For sounds coming from the back, there is a difference of perception that the brain is capable of noticing. This difference is due mostly to the shape of the outer ears, which reflect the sounds in a complex manner before sending them in towards the eardrums. If, on the other hand, the point "x" is situated at the two o'clock position relative to "y" (see FIG. 1B), the sound that hits the left ear arrives with a slight delay compared with the sound that the right ear perceives. This is due to the fact that sound travels at a finite speed of about 345 meters per second. The delay is only a fraction of a millisecond. Yet this is enough for the brain to notice the difference, and after a fast, automatic, subconscious calculation it can determine from where the sound comes. All this is done thanks to the relative difference in arrival times of the sounds perceived by the right and left ears.
(There are other factors involved in our sonic spatial perceptual abilities. They are related to the "pitch" ("Doppler effect" domain) and "timbre" domain as well as the amplitude domain.)
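The arrival-time difference between the two ears described above can be estimated with a short calculation. The following is a minimal sketch only: it assumes a distant source, a simple path-difference model (ear spacing times the sine of the source angle, divided by the speed of sound), and an ear-to-ear spacing of 0.18 meters. The spacing value and the function name `interaural_delay_s` are illustrative assumptions, not figures or terms taken from this description; the two o'clock position is taken as 60 degrees to the right of straight ahead.

```python
import math

SPEED_OF_SOUND_M_S = 345.0   # speed of sound quoted in the text
EAR_SPACING_M = 0.18         # assumed ear-to-ear distance (not from the text)


def interaural_delay_s(azimuth_deg, ear_spacing_m=EAR_SPACING_M,
                       speed_m_s=SPEED_OF_SOUND_M_S):
    """Estimate the arrival-time difference between the two ears.

    azimuth_deg is measured from straight ahead (0) toward the right (+90).
    A positive result means the sound reaches the right ear first.
    Simple path-difference model: spacing * sin(azimuth) / speed.
    """
    path_difference_m = ear_spacing_m * math.sin(math.radians(azimuth_deg))
    return path_difference_m / speed_m_s


if __name__ == "__main__":
    # Source straight ahead versus source at the two o'clock position.
    for label, azimuth in [("straight ahead", 0.0), ("two o'clock", 60.0)]:
        delay_ms = interaural_delay_s(azimuth) * 1000.0
        print(f"{label}: {delay_ms:.2f} ms")
```

Under these assumptions the delay is zero for a source straight ahead and grows as the source moves to one side, which is the cue the brain is described as using.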
In conventional stereophonic systems, the left and right reproducing speakers are energized to produce sound waves of the same relative phase as the sound waves recorded by the left and right recording microphones, respectively, in order for the sound produced at the plane of symmetry to duplicate the recorded sound. Application of the signals to the reproducing speakers with phases that are off axis, relative to the phases of the original signals, will not fully simulate the original sound, and hence will not result in faithful reproduction of the recorded sound, even assuming the absence of reflections at the reproduction site.
An example of a conventional system of this type is illustrated in FIG. 1, wherein left and right speakers 10, 11 are spaced apart a distance that preferably represents the distance between the microphones employed to originally record the sound. The speakers 10, 11 are oriented with their major axes parallel to one another, and are energized by the left and right output signals of a conventional stereo amplifier 12. The line 13 in this figure is perpendicular to and centrally intersects a line extending directly between the tips of the cones of the speakers; the line 13 thereby simulates the plane of symmetry of the two speakers. Since every point on the line 13 is equidistant from the two speakers, the temporal relationship of the direct sounds from the two speakers at any such point simulates the temporal (as well as the amplitude) relationship of the sound as received by the microphones employed to originally record the sound. This temporal relationship is lost, however, at points displaced from the line 13, the divergence from the relationship increasing as the distance from the line 13 increases. It should also be reiterated that in terms of micro-information, there are practically no "sweet spots". This translates into a major shortcoming of currently accepted stereo reproduction/perception and other compromising notions. In systems of this type, the axes of the speakers may alternatively be directed at equal acute angles to the line 13, but such orientation does not generally affect the temporal relationships between sounds as discussed above.
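The loss of the temporal relationship away from the line 13 can be illustrated numerically. The sketch below is an illustration under stated assumptions, not part of the described system: it treats the two speakers as point sources 2 meters apart, places the listener 3 meters in front of them, ignores all reflections, and uses the speed of sound of about 345 meters per second. The spacing, the listening distance and the function name `speaker_arrival_difference_s` are hypothetical values chosen only for the example.

```python
import math

SPEED_OF_SOUND_M_S = 345.0  # approximate speed of sound in air


def speaker_arrival_difference_s(listener_x_m, listener_y_m,
                                 speaker_spacing_m=2.0,
                                 speed_m_s=SPEED_OF_SOUND_M_S):
    """Left-minus-right arrival time of the direct sound at the listener.

    The speakers are modeled as points at (-spacing/2, 0) and
    (+spacing/2, 0); the line of symmetry is x = 0. A positive result
    means the sound from the left speaker arrives later.
    The 2.0 m default spacing is an assumed value for illustration.
    """
    half_spacing_m = speaker_spacing_m / 2.0
    left_path_m = math.hypot(listener_x_m + half_spacing_m, listener_y_m)
    right_path_m = math.hypot(listener_x_m - half_spacing_m, listener_y_m)
    return (left_path_m - right_path_m) / speed_m_s


if __name__ == "__main__":
    # Move the listener sideways at an assumed distance of 3 m from the speakers.
    for offset_m in (0.0, 0.25, 0.5, 1.0):
        diff_ms = speaker_arrival_difference_s(offset_m, 3.0) * 1000.0
        print(f"offset {offset_m:.2f} m: arrival difference {diff_ms:.3f} ms")
```

In this model the difference is exactly zero on the line of symmetry and increases steadily as the listener moves sideways, consistent with the divergence described for points displaced from the line 13.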
In the past, speakers have been positioned at locations that did not simulate the geometry of the recording microphones. For example, U.S. Pat. No. 4,673,057 discloses a system having an assemblage of speakers arranged on each of the faces of a polyhedron, to emit sound in a direction perpendicular to the respective faces, with all of the speakers on one side of an equatorial plane of the polyhedron being energized with the right stereophonic signals and all of the speakers on the other side of the equatorial plane being energized with the left stereophonic signals. The sound pattern produced by such a large number of speakers is very complex, and due to the physical size of the polyhedron, the sound emitted from the opposite sides of the polyhedron simulates sound from a plurality of spaced apart sources. The phase and timing of the sound generated by the speakers are hence quite different from those of the sound received by the recording microphones.
In one embodiment of the present invention, a sound reproduction system is provided that employs a pair of identical speakers that are mounted "back-to-back". Such a physical arrangement of loudspeakers has been disclosed, for example in U.S. Pat. Nos. 4,268,719 and 4,585,090, but only for monophonic systems. U.S. Pat. No. 4,016,953 discloses a system employing a pair of speakers directed toward one another, and energized with identical signals of opposite polarity, in order to provide a push-pull effect for monophonic signals.