This invention pertains to a method and apparatus for reproducing sound from stereophonic source signals in which the reproduced sound has a realistic ambient field and acoustic image.
The present invention can best be understood and appreciated by setting forth a generalized discussion of the manner in which stereophonic signals originate, as well as a generalized discussion of the manner in which sound is conventionally reproduced from a stereophonic signal source.
When live music is performed, for example, the listener perceives not only the sonic qualities of the instruments and the performers but also the sonic qualities of the acoustic environment in which the music is performed. Normal stereophonic recording and reproducing techniques retain much of the former, but most of the latter is lost.
The human auditory system localizes the position of a sound through two mechanisms. Direction is perceived from an interaural time delay or phase shift. Distance is perceived from the time delay between an initial sound and a similar reflected sound. A third, poorly understood mechanism causes the ear to perceive only the first of two similar sounds when they are separated by a very short delay. This is called the precedence effect. Through these mechanisms the listener perceives both the direct sound and the sound reflected from the walls of the hall. From the direction and distance information contained in the reflected signals, the listener forms a subliminal impression of the size and shape of the hall in which the performance is taking place. Referring to FIG. 1, for example, there is illustrated a source S spaced from a listener P in an environment which includes a plurality of walls W1, W2, and W3. In such an environment the listener will of course perceive sounds from the source S along a direct path DP1. The listener will also perceive sounds reflected from the walls of the environment, illustrated in FIG. 1 by the path RP1 to a point P1 on the wall W1 and thence along path RP2 to the listener P. In stereophonic recording, microphones ML and MR are situated in front of the source S as shown in FIG. 1. If the source S is equidistant from the microphones, then both microphones will pick up sounds from the source S along direct paths DP2 and DP3. The left and right microphones ML and MR will also record the hall ambience information, as illustrated by the reflected paths RP3 and RP4 from the point P1 on wall W1.
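As a rough numerical illustration of the delay between a direct sound and a single reflected sound, the following Python sketch uses the mirror-image construction for one flat wall. The coordinates, the wall position, the function name `reflection_delay`, and the speed of sound of 343 m/s are assumptions made for illustration and are not taken from FIG. 1.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air at room temperature


def reflection_delay(source, listener, wall_x):
    """Return the extra arrival time (seconds) of one wall reflection
    relative to the direct sound.

    source, listener: (x, y) positions in metres (hypothetical layout).
    wall_x: x coordinate of a flat wall lying along the line x = wall_x.
    """
    direct = math.dist(source, listener)
    # Mirror the source across the wall; the reflected path has the same
    # length as the straight line from this image to the listener.
    image = (2.0 * wall_x - source[0], source[1])
    reflected = math.dist(image, listener)
    return (reflected - direct) / SPEED_OF_SOUND
```

For a source 4 m in front of the listener and a wall 3 m to one side, `reflection_delay((0.0, 0.0), (0.0, 4.0), -3.0)` gives a delay of roughly 9 ms under these assumptions.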
Turning now to FIG. 2, there is illustrated what happens when the sounds recorded by the microphones as in FIG. 1 are reproduced by loudspeakers LS and RS placed in the same positions relative to the listener P as the recording microphones. In FIG. 2 the listener P is shown as having a left ear Le and a right ear Re. If the source of the sound recorded as in FIG. 1 was equidistant from the two microphones, the sound will have reached each microphone at the same time. Accordingly, in reproducing the sound, a listener equidistant from the two speakers LS and RS will hear the reproduced direct sound from the left speaker in the left ear (path A) at the same time as the same sound from the right speaker is heard in the right ear (path B). The precedence effect will tend to reduce perception of interaural crosstalk paths a and b. The listener P, hearing the same sound in both ears at once, will localize the sound as being directly in front of and between the speakers, as shown in FIG. 3.
Referring again for a moment to FIG. 1, consider a sound reflected from the point P1 on the wall W1 of the hall. The reflected sound from this secondary source reaches the left microphone ML first via the path RP3. This sound is delayed relative to the direct sound along path DP2, partially preserving the distance information about the reflection from P1. The sound from P1 thereafter reaches the right microphone MR along path RP4 after a further delay and a further reduction in loudness. In this case, the further delay corresponds approximately to the distance MD between the microphones. Turning now to FIG. 4, there is illustrated what the listener P will hear with respect to both the direct and reflected sound illustrated in FIG. 1. When the recording is reproduced by the loudspeakers LS and RS, the listener will first hear the direct sound from the source at the same time in both ears, corresponding to the apparent source shown in FIG. 4. The listener will then hear the delayed sound corresponding to the reflection from P1, recorded by the left microphone and reproduced by the left speaker, first in the left ear Le and then in the right ear Re. The initial delay caused by the longer path taken by the reflection in reaching the left microphone ML gives the listener an impression of the distance between the original source, P1, and himself. However, the interaural delay t (corresponding to the time it takes sound to travel between a listener's ears) gives the impression that the reflected sound has come from a point behind and in the same direction as the left speaker, illustrated as the first apparent point P1 in FIG. 4. For reference, the location of the actual point P1 is also shown in FIG. 4. After a further delay, the listener will hear the reflected sound reproduced by the right speaker RS. Since the additional delay (corresponding to the distance MD in FIG. 1) is much greater than any possible interaural delay (except for the case of a very small microphone spacing), this sound will create a second apparent point P1 behind and in the same direction as the right speaker, as illustrated in FIG. 4. However, it has been observed in experiments that the listener mainly perceives the direction information of the first apparent point source P1, largely ignoring the second. Thus the listener perceives the sound as coming primarily from the direction of the left speaker, or slightly inside the left speaker if the loudness of the second apparent point source P1 is significant compared to the first. This analysis also describes the effect on any other sound sources recorded by the two microphones for which the difference in arrival times at the two microphones is greater than the maximum possible interaural time delay.
Referring to FIG. 5, for some reflected sounds the path lengths to the two microphones ML and MR will be such that the difference in arrival times of the reflected sound at the two microphones will be comparable to a possible value of interaural time delay. Thus, the travel time of the reflected sound from point P2 to the left microphone ML along path d' would be approximately equal to the travel time along path c' to the right microphone MR plus the interaural time delay Δt. That is, assume that, expressed as travel times, d' equals c'+Δt. When this occurs, the arrival of the reproduced sound from the two speakers at the corresponding ears at slightly different times will have the same effect as an interaural time delay, giving the listener a definite impression of the direction and distance of the reflected sound. As illustrated in FIG. 6, each possible value of interaural time delay corresponds to an angle of incidence for the perceived sound within a 180° arc. As the difference in arrival times at the microphones approaches the maximum possible value of the interaural delay, the apparent direction of the sound would swing rapidly to the right or left. In practice this is limited by the listening angle of the loudspeakers. When the time difference of the sounds arriving at the respective ears approaches the interaural delay corresponding to the listening angle of the speakers, the interaural crosstalk signal of the opposite speaker gradually takes precedence, effectively limiting the apparent sound sources to within the listening angle of the speakers.
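The correspondence between interaural time delay and angle of incidence illustrated in FIG. 6 can be approximated with a simple sine-law model. The following Python sketch is illustrative only: the sine law itself, the ear spacing of 0.18 m, the speed of sound of 343 m/s, and the names `MAX_ITD` and `angle_from_itd` are assumptions and are not values given in this disclosure.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed
EAR_SPACING = 0.18      # m, assumed effective distance between the ears

# Largest delay the assumed model allows: sound arriving from directly
# to one side of the listener.
MAX_ITD = EAR_SPACING / SPEED_OF_SOUND


def angle_from_itd(itd_seconds):
    """Map an interaural time delay to an angle of incidence in degrees.

    0 is straight ahead; +90 and -90 are fully to either side, so the
    result always lies within a 180 degree arc. Delays beyond the
    physical maximum are clamped to the edge of the arc.
    """
    ratio = max(-1.0, min(1.0, itd_seconds / MAX_ITD))
    return math.degrees(math.asin(ratio))
```

Under this model half of the maximum delay corresponds to an angle of about 30°, while the final 3% of delay below the maximum accounts for roughly the last 14° of the arc, which is consistent with the rapid swing of apparent direction near the maximum interaural delay.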
It should be apparent at this point that all sound sources, ambient or otherwise, whose signals arrive at the respective microphones with a time difference greater than the interaural time delay corresponding to the listening angle of the reproducing speakers will appear to the listener as apparent sources behind and in the same general direction as one of the speakers as shown in FIG. 4. The delayed signal appearing in the other channel, being lower in loudness, will have only slight effect in drawing the apparent source inside the speakers. This has been confirmed by experiments which show that, in fact, the apparent sound source remains substantially within the listening angle defined by the speakers.
The existence of interaural crosstalk has long been known and discussed at some length in the literature. Additionally, there are several recent patents which have disclosed methods and techniques for eliminating interaural crosstalk, without however making a complete analysis of the consequences of so doing.
One such prior art patent is U.S. Pat. No. 4,058,675 to Kobayashi et al. This patent discloses a means for cancelling interaural crosstalk using inverted and delayed versions of the left and right stereo signals fed to a second pair of speakers arranged to produce the correct geometry. As explained in U.S. Pat. No. 4,218,585 to Carver, the Kobayashi et al. device is only partially effective. Carver discloses in U.S. Pat. No. 4,218,585 an electronic device for cancelling interaural crosstalk. This device inverts one stereo signal, splits it into several components, delays each component by a different amount, and recombines the delayed components with a modified version of the other stereo signal. By performing this operation on both stereo signals, Carver claims to effect a cancellation of interaural crosstalk and to create a "dimensionalized effect."
U.S. Pat. No. 4,199,658 to Iwahara also discloses a technique for performing the interaural crosstalk cancellation. Iwahara uses a second pair of speakers to reproduce the cancellation signal, which is composed of a frequency and phase compensated version of the inverted main signal. This cancellation signal is fed to a speaker just outside the main speaker on the opposite side from which the cancellation signal was derived. The necessary delay is accomplished acoustically by the placement of the sub-speakers and detailed consideration is given to the phase and frequency compensation required to accomplish the cancellation. Additionally, a binaural signal input is specified. It will be seen later why a binaural input is essential to the correct function of an interaural crosstalk cancellation system.
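The element common to the prior art approaches discussed above, namely an inverted and delayed copy of the opposite channel, can be sketched in Python as follows. This is a minimal first-order illustration and does not reproduce the circuit of Carver, the speaker arrangement of Kobayashi et al., or the frequency and phase compensation of Iwahara. The function names and the `delay_samples` and `gain` parameters are hypothetical.

```python
def delay(signal, n):
    """Delay a list of samples by n samples, zero-padded to the same length."""
    if n <= 0:
        return list(signal)
    return ([0.0] * n + list(signal))[:len(signal)]


def first_order_cancel(left, right, delay_samples, gain):
    """Add an inverted, delayed, attenuated copy of the opposite channel
    to each channel.

    delay_samples: assumed stand-in for the extra travel time of the
        crosstalk path to the far ear, in samples.
    gain: assumed attenuation of the crosstalk path (between 0 and 1).
    """
    delayed_left = delay(left, delay_samples)
    delayed_right = delay(right, delay_samples)
    out_left = [l - gain * r for l, r in zip(left, delayed_right)]
    out_right = [r - gain * l for r, l in zip(right, delayed_left)]
    return out_left, out_right
```

With an impulse in the left channel only, the right output carries a single inverted, attenuated impulse `delay_samples` later, which is the signal intended to meet and cancel the left speaker's crosstalk at the right ear.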
Assuming that a method or technique is successful in cancelling the interaural crosstalk, it should be examined what effect this would have on the listener's perception of the reproduced sound. Referring to FIG. 2, if the interaural crosstalk cancellation were successful, paths a and b to the opposite ears would be eliminated. This would help the localization of sources equidistant from the recording microphones (FIGS. 1 and 3). As the sources move off-center, however, the difference in arrival times at the two microphones increases, corresponding to larger values of interaural time delay and hence greater angles of incidence as illustrated in FIG. 6. Since the crosstalk paths from the speakers have been cancelled out, the speakers give no directional information about themselves. The perceived direction of the apparent sound source will depend only on the difference in arrival times of the signal at the two recording microphones and, to a much lesser degree, on the relative loudness. FIG. 7, for example, shows an off-axis source whose signal arrives at the right microphone Δt later than at the left microphone. In this example Δt is equal to the maximum possible interaural time delay. When reproduced with crosstalk cancelled, the right channel signal will arrive at the right ear Δt later than the left signal at the left ear. FIG. 8 shows the apparent source displaced far to the left of the listener, which is where it would appear to the listener in such a circumstance.
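The dependence of the arrival-time difference at the two microphones on the off-center displacement of the source and on the microphone spacing can be explored with the following Python sketch. The geometry (microphones on a line, source at a given distance in front of that line), the ear spacing of 0.18 m, the speed of sound of 343 m/s, and all names are assumptions made for illustration and are not dimensions taken from FIG. 7.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed
EAR_SPACING = 0.18      # m, assumed
MAX_ITD = EAR_SPACING / SPEED_OF_SOUND


def arrival_difference(offset, distance, mic_spacing):
    """Arrival-time difference (s) between the far and near microphone.

    offset: sideways displacement of the source from the center axis (m).
    distance: distance of the source in front of the microphone line (m).
    mic_spacing: distance between the two microphones (m).
    """
    half = mic_spacing / 2.0
    near = math.hypot(offset - half, distance)
    far = math.hypot(offset + half, distance)
    return (far - near) / SPEED_OF_SOUND


def offset_for_max_itd(distance, mic_spacing, tol=1e-6):
    """Find by bisection the displacement at which the arrival-time
    difference equals MAX_ITD, or return None if it cannot be reached."""
    if mic_spacing <= EAR_SPACING:
        return None  # the path difference never exceeds the mic spacing
    lo, hi = 0.0, 1.0
    while arrival_difference(hi, distance, mic_spacing) < MAX_ITD:
        hi *= 2.0
        if hi > 1e6:  # give up on extreme geometries
            return None
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if arrival_difference(mid, distance, mic_spacing) < MAX_ITD:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0
```

Comparing, for example, `offset_for_max_itd(5.0, 3.0)` with `offset_for_max_itd(5.0, 0.5)` shows that, under these assumptions, the widely spaced pair reaches the maximum interaural delay at a much smaller displacement than the closely spaced pair.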
It should be clear that for microphones spaced far apart only a small displacement off the equidistant axis will be required to create an arrival time difference at the microphones equal to the maximum possible interaural time delay. This will result in a rather dramatic expansion of a small portion of the center of the stereo stage. For sound sources further displaced and corresponding to time delays greater than the maximum possible interaural time delay, which will include most of the ambience information, the listener will have difficulty localizing any apparent source. In effect, the listener will be forced to perceive sounds as if he had ears placed at the recording microphone spacing and may perceive apparent sound sources within his own head when the microphone spacing is large. An accurate prediction of the effects of this situation is beyond the current state of the art of psychoacoustics and beyond the scope of this discussion. It is precisely because of this potential difficulty that U.S. Pat. No. 4,199,658 to Iwahara specifies a binaural signal input, that is to say, a recording made with a microphone spacing equal to the ear spacing. However, recordings made in this manner are extremely rare. It is also possible that the problem outlined above accounts for the unspecified "dimensionalized effect" referred to by Carver in U.S. Pat. No. 4,218,585. Use of any of the above-mentioned crosstalk cancellation systems with commonly available recordings might well result in the effect described by Carver:
"The overall effect of this is a rather startling creation of the impression that the sound is 'totally dimensionalized', in that the hearer somehow appears to be 'within the sound' or in some manner surrounded by the various sources of the sound." (U.S. Pat. No. 4,218,585, column 9, lines 35-39).
Although this effect that Carver describes may be an interesting aural effect, it is not believed to give a realistic impression of the original performance, particularly in the reproduction of ambience information which constitutes the majority of far-off axis signals.