The present invention relates to a method and apparatus for establishing a substantially complete audio signal and especially to a method and apparatus for redeeming information from a discrete audio signal to reconstruct, or produce, a substantially whole, virtually omni-directional sound event.
Sound exists as pressure and velocity in a medium such as air. Sound begins with a mechanical disturbance, such as a voice, a slamming door, a bow drawn across a violin string, and the like. The vibration of the sound source causes the formation of a pattern of waves. The waves radiate in every direction, i.e., three dimensionally, omni-directionally, spherically. It is these moving waves that are heard as sound.
There are three commonly measured components of any sound pressure: frequency, amplitude, and, when a reference is available, phase.
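By way of illustration only, the measurement of these three components may be sketched as follows. The sketch assumes a single steady tone of known frequency and takes a cosine beginning at the first sample as the phase reference. The function names tone and measure, the sample rate, and the example values are hypothetical and are not taken from the present description.

```python
import math

def tone(freq_hz, amplitude, phase_rad, sample_rate, n):
    """Generate n samples of a cosine with the given frequency, amplitude and phase."""
    return [amplitude * math.cos(2.0 * math.pi * freq_hz * k / sample_rate + phase_rad)
            for k in range(n)]

def measure(samples, freq_hz, sample_rate):
    """Estimate (amplitude, phase_rad) of the component at freq_hz.

    Phase is reported relative to a cosine reference that starts at sample 0.
    The estimate is exact only when the window holds a whole number of cycles.
    """
    n = len(samples)
    # Correlate the signal with the in-phase and quadrature references.
    re = sum(s * math.cos(2.0 * math.pi * freq_hz * k / sample_rate)
             for k, s in enumerate(samples))
    im = -sum(s * math.sin(2.0 * math.pi * freq_hz * k / sample_rate)
              for k, s in enumerate(samples))
    amplitude = 2.0 * math.hypot(re, im) / n
    phase_rad = math.atan2(im, re)
    return amplitude, phase_rad

# Hypothetical example: a 1 kHz tone, 10 whole cycles at 48 kHz.
samples = tone(1000.0, 0.5, math.pi / 3.0, 48000, 480)
amplitude, phase_rad = measure(samples, 1000.0, 48000)
```

In this sketch the measured amplitude and phase match the values used to generate the tone. Without the reference, only the frequency and amplitude would be meaningful, which is the sense in which phase is measured when a reference is available.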
Since the birth of electronic audio signals the goal has been to capture, store, and reproduce an exact replica of the original sound event in such a way that the listener cannot tell the difference between the reproduction and the original.
An electronic audio signal is a fluctuating electric quantity whose variations represent all sound information as a code. We have learned how to unwrap much of the frequency and amplitude information from the signal code with a high degree of fidelity, enabling the wide bandwidth and broad dynamic range enjoyed today. Phase is a major component of sound that represents essentially all of the coupling of the spatial and temporal information elements of sound, and it has not been reproduced by conventional means with significant fidelity. As a consequence, conventionally reproduced audio signals to this point have been incomplete.
An ideal complete audio signal would be one in which all sound components are fully opened, transmitted, and reproduced with equal fidelity, including frequency, amplitude and phase. Such a signal would also be indistinguishable from the original sound event; e.g., radiate in all directions, three dimensionally, omni-directionally, spherically, rather than as existing incomplete signals do.
Because existing, incomplete audio signals can provide high fidelity duplication for only some components (frequency and amplitude) of sound, sound reproduction has heretofore been limited to a two dimensional perspective. Prior art methods, such as stereophonic, binaural, and various surround sound techniques, and beyond, offer signal processing enhancement methods and apparatus that are designed to compensate artificially for otherwise naturally occurring spatial and temporal information. These limitations leave the original sound event content elements locked away within the signal code: lost, hidden, buried, closed off, folded under, but nevertheless still contained inside the signal. The present invention is a method and apparatus for producing a substantially complete audio signal, not through the introduction of artificial elements, but by opening, or unfolding, the information that, until now, has been hidden within the audio signal.
There are multiple uses of the word “phase.” General use of the term phase in audio has been limited for the most part to either the idea of proper ‘phasing’ of speakers, or the term ‘absolute phase’ to describe a maker's product. Other aspects of phase that are important are monaural phase, where, typically, delayed sounds are applied to one or both ears simultaneously. Prior art shows extensive work in the area of binaural phase, which refers to a time delay due to the difference in the path length from one ear to another. But the idea of phase as a defining characteristic of sound is not generally discussed. Nor are measurements generally provided. Phase herein is concerned with the rules of hearing as a constructive process. That is, the brain takes data coming to it from the ear, and applies rules and functions to build a representation of the sound. These rules involve complicated mechanical, biological, and neurological processes that are unbelievably subtle and complex.
Phase, as it applies to the present invention, enables sound to be rendered through a signal to the ear in a way which is substantially indistinguishable from the original acoustic event, radiating sound in a way that is similar to that of the original captured, transmitted, or recorded sound. Transmission of the received sound waves from the ear to the brain completes the hearing process. It is believed that phase is the ‘missing link’ in the ability to recreate the listening experience with substantial accuracy. The present invention uses phase to provide a listener with a listening experience that is heard as being substantially indistinguishable from the original event.
Phase is also a relative measure of one signal against a reference signal. In acoustic events, relative phase is influenced by both time and space. This is important since in a normal listening experience, whether a single (or mono) signal is recorded, or multiple signals, such as stereo, are recorded, the recorded signals represent the relative phase information at the location of the microphone. When multiple microphones are used, the relative phase information for each recorded signal is unique to the position of the microphone relative to the source as well as the acoustics of the space in which the recording takes place. Thus, one can use multiple microphones to create a monophonic signal, by summing their outputs together, or one can record discrete signals for stereophonic or surround sound applications. In general, the path of a signal from the recording through the chosen electronics and ultimately the listening environment will be uniquely different for each signal. While a significant effort has been expended to enhance the recorded signals for listening environments, inclusive of head-related transfer functions and digital signal processing to create artificial reverberation for the illusion of a different space, it is virtually impossible to separate the listener from the acoustics of the space in which the sound is heard. However, since one can accomplish gradual cross-overs in physical space by placing multiple speakers in the room, in the same way that one can record signals with multiple microphones, it is also possible to use the original signal to extract the information contained in the recording process, introduce graduated cross-overs in the recorded signal, and layer these signals together, much the way that they would be layered in the physical space, to convey a more realistic, dynamic signal.
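As an illustrative sketch only, the dependence of relative phase on microphone position may be approximated for a single tone in a free field. The speed of sound of 343 m/s, the neglect of room reflections and of attenuation with distance, and the function names are assumptions made for the example and are not features of the present invention.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value for air at about 20 degrees C

def phase_offset_rad(freq_hz, d_ref_m, d_other_m, c=SPEED_OF_SOUND):
    """Phase lag of the second microphone relative to the first (radians, unwrapped)."""
    return -2.0 * math.pi * freq_hz * (d_other_m - d_ref_m) / c

def relative_phase_deg(freq_hz, d_ref_m, d_other_m, c=SPEED_OF_SOUND):
    """Relative phase in degrees, wrapped to the range [-180, 180]."""
    delta = phase_offset_rad(freq_hz, d_ref_m, d_other_m, c)
    return math.degrees(math.atan2(math.sin(delta), math.cos(delta)))

def mono_sum_gain(freq_hz, d_ref_m, d_other_m, c=SPEED_OF_SOUND):
    """Magnitude of the sum of two equal-level microphone signals for one tone.

    2.0 means the signals add fully; 0.0 means they cancel.
    """
    delta = phase_offset_rad(freq_hz, d_ref_m, d_other_m, c)
    return abs(1.0 + cmath.exp(1j * delta))

# Hypothetical example: a 1 kHz source, microphones at 1.0 m and 1.08575 m.
# The extra path is a quarter wavelength, so the second signal lags by about 90 degrees.
quarter_wave_lag = relative_phase_deg(1000.0, 1.0, 1.08575)
```

Under these assumptions, two microphones at slightly different distances from the same source record the same tone with different phase, and summing their outputs to a monophonic signal reinforces some frequencies while attenuating others.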
For the purpose of describing the present invention, various terms, including phase layering, phase layered circuit, or PLC, as well as terms such as graduated crossovers are employed herein.
If any sound component is distorted from its original form, all sound components may be affected. Therefore, what affects phase, amplitude or frequency, may affect all.
Stereophonic sound is an “effect” and does not exist in nature. The stereo effect produces a ‘phantom image’ that appears as if sound is coming from somewhere in the center between two stereo speakers, when in fact, nothing is there. It is an “illusion.” The basis for defining the quality in a stereo system is how well the phantom image is able to produce a realistic “soundstage.” The soundstage takes place in what is commonly called the “sweet spot.” That is where the soundstage generated by the stereo system produces such a convincing phantom image that the listener experiences a “you are there” virtual reality. The soundstage breaks apart when the listener moves outside of the sweet spot, either too far to the left, or right, away from where the phantom image is taking place. Once outside the sweet spot, the illusion is gone. Most consumer based audio equipment in use today is based on a stereophonic sound standard.
There are several kinds of signal processors used in audio electronics. One type is designed to solve problems associated with the environment, such as a graphic equalizer, and is designed to tune a room to a flat frequency response, so that when an audio system plays, the room is not adding or subtracting from the sound. Another kind of signal processor adjusts the signal, such as a reverb system, and is designed to make fabricated recordings made in a studio sound as if they were recorded live. Audio engineers use these and other tools in their profession.
Another type of signal processor utilizes psychoacoustic techniques, based upon the study of how the brain interprets information coming to it from the ear. Many of these types of psychoacoustic signal processors have been used to help solve certain problems relative to stereophonic sound primarily, and can sometimes also be used in monophonic and discrete signal applications as well, but often as a secondary advantage.
Stereophonic sound has limitations, such as the sweet spot area in which the phantom image is contained. Unlike live sound, which a large audience can share at one time, such as one might enjoy at a concert, stereophonic sound has a limited area between two speakers where the audience must gather in order to experience the phantom sound stage. This shortcoming in stereo sound has led to various developments designed to overcome the limitations of soundstage size and either find ways to expand the sweet spot, or, as in the case of motion picture theater sound, which is the basis for home theater and surround sound, eliminate the sweet spot altogether with a different technology. Hence, one of the motivators for developing certain kinds of signal processors has been to enhance the stereophonic experience. The present invention is not limited to the sweet spot, and can be experienced in any venue, at any time, and under any listening conditions. Moreover, it works with all audio signals and signal paths (monaural, stereo, synthesized multi-channel, and discrete multi-channel; recorded, reproduced, and transmitted sound), as all contain information which has remained hidden and buried until the present invention.
One of the rules of high fidelity is to stay faithful to the original sound event, which means “to hear the signal without alteration.” Hence, an aim of design for serious music listening is to maintain as much signal integrity along the audio path as the state of the art allows. Good audio is therefore good science, and there is no reason why good audio cannot and should not be applied to all audio signals. Every time an audio signal passes through any acoustical, mechanical, or electrical device, distortion is created. Audio designers work to limit the amount of distortion, to maintain faithful reproduction or fidelity, so that the least compromised signal becomes the highest fidelity. The substantially complete audio signal of the present invention is designed to convey significantly more of the information of the original sound event than the prior art without significantly adding anything that is not already in the signal or subtracting anything from it.
The following patents show techniques used to enhance audio sound fields primarily in stereophonic applications. There are three approaches commonly used in the past, including the application of head related transfer functions (HRTF), the use of digital signal processing to create reverberant or spatial effects to emulate a sound field other than that of the listening environment, and the use of stereophonic signals to add spatial effects. The present invention differs from the prior art in the method used, which can be applied to monophonic, stereophonic, or other multi-signal formats. It is not dependent upon the use of stereo signals and can improve speech intelligibility and many other aspects of all signal formats.
U.S. Pat. No. 7,203,320 to Coats, et al., teaches a sub-harmonic generator and stereo expansion processor. A method and apparatus may provide for one or more of: receiving an input signal containing frequencies from among a first range; filtering the input signal to produce a first intermediate signal containing frequencies from among a second range; producing a sub-harmonic signal from the first intermediate signal containing frequencies from among a third range, the third range of frequencies being about one octave below the second range of frequencies; canceling energy at at least some frequencies from among a fourth range of frequencies from a left channel signal of the input signal to produce at least a portion of a left channel output signal; and canceling energy at some frequencies from among a fifth range of frequencies from a right channel signal of the input signal to produce at least a portion of a right channel output signal.
U.S. Pat. No. 7,003,119 to Arthur is for a matrix surround decoder/virtualizer which uses several sub-systems to generate outputs from the stereo input signal. A first sub-system synthesizes the phantom center output, which places the monaural center image between the left and right speakers in front of the listener. A second sub-system synthesizes the virtual surround (or rear) output signals, which places the sound images to the sides of the listener. A third sub-system synthesizes the left and right stereo outputs, and expands the locations of the left and right sound images.
A stereophonic spatial expansion circuit with tonal compensation and active matrixing is shown in Hoover, U.S. Pat. No. 6,947,564. In a stereophonic expansion circuit, the (L+R) sum signal is spectrally modified by increasing the bass and treble frequencies relative to the midrange so as to compensate for a midrange frequency boost in the (L−R) difference signal. The stereophonic expansion effect and manipulation of the signal parameters are produced by active matrixing amplifiers.
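For illustration only, the general notion of sum and difference matrixing referred to in several of these references may be sketched as follows. This is a generic sum/difference example and not the circuit of any cited patent. The function name widen and the width parameter are hypothetical, and the frequency-dependent equalization described in the references is deliberately omitted.

```python
def widen(left, right, width=1.5):
    """Scale the (L-R) difference signal relative to the (L+R) sum signal.

    width = 1.0 returns the input unchanged, width = 0.0 collapses to mono,
    and width > 1.0 exaggerates the difference between the channels.
    """
    out_left, out_right = [], []
    for l, r in zip(left, right):
        sum_sig = 0.5 * (l + r)           # content common to both channels
        diff_sig = 0.5 * (l - r) * width  # content that differs, rescaled
        out_left.append(sum_sig + diff_sig)
        out_right.append(sum_sig - diff_sig)
    return out_left, out_right

# Hypothetical two-sample example: the first sample differs between channels,
# the second is identical in both (centered) and is therefore left untouched.
wide_left, wide_right = widen([0.25, 0.5], [-0.5, 0.5], width=2.0)
```

Such matrixing operates only on the difference between two channels, so material that is identical in both channels passes through unchanged regardless of the width setting.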
U.S. Pat. No. 6,711,265 to Morris is for centralizing a spatially expanded stereophonic audio image. A stereophonic system has sum and difference signals with expanded spatial imaging. Localization of center audio materials more towards the center is accomplished by equalization of the (L+R) sum signal. The equalization comprises decreasing the bass response while increasing the treble response of the sum signal, with the desired bass reduction being accomplished by the use of a gyrator to economically synthesize an inductance. Additionally, the equalizations in the (L+R) sum signal to reduce the signal at bass frequencies and to increase the signal at treble frequencies are switchable singly or in combination between “ON” and “OFF” modes.
In U.S. Pat. No. 6,587,565 to Chol, a system is provided for improving a spatial effect of stereo sound or encoded sound when producing three dimensional image sound signals from stereo channel signals. This includes a spatial effect enhancing portion where a signal for enhancing spatial effect and directivity of sound is produced, a band enhancing portion where a signal for enhancing a signal component of the stereo channel signal in a low frequency range and for maintaining the signal component in a middle frequency range is generated, and a matrix portion where the output signal of the spatial effect enhancing portion, the output signal of the band enhancing portion and the stereo channel signal are calculated in a matrix manner, so that the spatial effect of sound is improved using a differential component between left and right side channel signals. According to the patent, the spatial effect of sound can be improved without using a complicated circuit construction, the deterioration of signal-to-noise ratio is prevented, and the cost-performance ratio for realizing a spatial effect of sound is improved.
U.S. Pat. No. 6,448,846 to Schwartz is for a controlled phase-canceling circuit and system. The patent describes controlling the phase relationship between a processor's output or portions of a processor's output and the phase of the pre-processed signal in a particular frequency range or ranges, so that a controlled accentuation or enhancement of the processor's effect can be realized. In one embodiment this is achieved by providing a gain control circuit that receives and selectively amplifies the input signal prior to it being summed with the processor's output.
Australian Patent No. 708,727 to Klayman teaches a stereo enhancement system.
U.S. Pat. No. 5,761,313 to Schott is for a circuit for improving the stereo image separation of a stereo signal. By using special frequency response manipulation in the difference channel of a stereo signal, the stereo image will appear to extend beyond the actual placement of the loudspeakers. This is accomplished by shaping the difference channel response to simulate the response one would be subjected to if the sources were physically moved to the virtual positions. The circuit includes a summing and high frequency equalization circuit to which the left and right stereo signals are applied, and a difference forming and human ear equalization circuit to which the left and right stereo signals are also applied. The outputs from these circuits are cross-coupled to form left and right channel outputs.
U.S. Pat. No. 5,692,050 to Hawks is for a method and apparatus for spatially enhancing stereo and monophonic signals. A method and apparatus is disclosed that spatially enhances stereo signals without sacrificing compatibility with monophonic receivers. In accordance with one embodiment, a stereo enhancement system is implemented using only two op-amps and two capacitors and may be switched between a spatial enhancement mode and a bypass mode. In other embodiments, simplified stereo enhancement systems are realized by constructing one of the output channels as the sum of the other output channel and the input channels. In other embodiments, a pseudo-stereo signal is synthesized and spatially enhanced according to stereo speaker crosstalk cancellation principles. In yet other embodiments, the respective spatial enhancements of monophonic signals and stereo signals are integrally combined into a single system capable of blending, in a continuous manner, the enhancement effects of both.
U.S. Pat. No. 4,959,859 to Kennedy et al. is for an FM channel separation adjustment system.
The conventional definition of an anti-phase signal is one that has inverted phase (180 degrees) as summarized in U.S. Pat. No. 6,477,255.
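As a minimal sketch of this conventional definition, and assuming a single sinusoid, inverting the signal is equivalent to shifting its phase by 180 degrees, and adding the anti-phase signal to the original cancels it. The function name sine and the example frequency and sample rate are hypothetical.

```python
import math

def sine(freq_hz, sample_rate, n, phase_rad=0.0):
    """Generate n samples of a unit-amplitude sine with the given starting phase."""
    return [math.sin(2.0 * math.pi * freq_hz * k / sample_rate + phase_rad)
            for k in range(n)]

original = sine(1000.0, 48000, 480)
inverted = [-s for s in original]                        # polarity inversion
shifted = sine(1000.0, 48000, 480, phase_rad=math.pi)    # 180 degree phase shift

# Largest sample-by-sample difference between inversion and a 180 degree shift.
max_difference = max(abs(a - b) for a, b in zip(inverted, shifted))

# Summing the original with its anti-phase copy cancels the signal.
residual = max(abs(a + b) for a, b in zip(original, inverted))
```

For a single sinusoid the two operations agree to within floating-point rounding, and the residual of the sum is zero.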
U.S. Pat. No. 4,866,774 to Klayman is for a stereo enhancement and directivity servo. In a stereo system having sum and difference signals that are processed for stereo image enhancement, apparent directivity of the stereo sound is increased by the use of servo systems for the left and right processed difference signals (L−R)p, (R−L)p. Each of the left and right servos responds to the respective left or right stereo input signal (Lin, Rin) and amplifies increases in the respective left or right processed difference signals. The amount of amplification is controlled by feeding back the amplified or directivity enhanced difference signal (L−R)pe, (R−L)pe, first comparing it with the processed difference signal (L−R)p, (R−L)p before directivity enhancement, and then combining it with the input signal (Lin, Rin) in a preselected ratio so as to control the amount of amplification of the processed difference signal that is provided for directivity enhancement.
U.S. Pat. No. 4,815,133 to Hibino teaches a sound field producing apparatus connected to a stereophonic sound source to supply audio signals to a loudspeaker system that has an indirect sound extracting circuit for extracting indirect sound components by extracting a difference signal between right and left input signals. The difference signal is phase inverted to obtain an inverted difference signal. Each of two mixing circuits receives the right input signal, the left input signal, the left and right difference signal and the inverted difference signal to produce a left and right output.
U.S. Pat. No. 4,218,585 to Carver is for a dimensional sound producing apparatus and method for stereo systems. The right signal in addition to driving the right speaker, is inverted and delayed and transmitted to the left speaker. The left signal in addition to driving the left speaker is inverted and delayed and transmitted to the right speaker.
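Purely as an illustration of the inverted and delayed cross-feed just described, and not as a reproduction of the patented apparatus, the signal routing may be sketched as follows. The function name cross_feed, the delay in samples, and the cross-feed gain are hypothetical values chosen for the example; the reference is not relied upon here for any particular delay or level.

```python
def cross_feed(left, right, delay_samples=8, gain=0.5):
    """Add an inverted, delayed copy of each channel to the opposite channel.

    delay_samples and gain are illustrative assumptions only.
    """
    n = min(len(left), len(right))
    out_left, out_right = [], []
    for k in range(n):
        j = k - delay_samples
        # Samples before the start of the signal are treated as silence.
        delayed_left = left[j] if j >= 0 else 0.0
        delayed_right = right[j] if j >= 0 else 0.0
        out_left.append(left[k] - gain * delayed_right)    # inverted right into left
        out_right.append(right[k] - gain * delayed_left)   # inverted left into right
    return out_left, out_right

# Hypothetical example: a single impulse on the left channel only.
impulse_left = [1.0] + [0.0] * 15
silent_right = [0.0] * 16
fed_left, fed_right = cross_feed(impulse_left, silent_right, delay_samples=4, gain=0.5)
```

In this example the left output is the original impulse, while the right output carries an inverted, half-level copy of that impulse four samples later.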
U.S. Pat. No. 3,725,586 to Iida is for a multi-sound reproducing apparatus for deriving four sound signals from two sound sources. Left and right sound signals applied to two input circuits are each shifted in phase by phase shifters and then supplied to separate output circuits. The left sound signal is also fed through a low pass filter to be combined with the phase shifted right sound signal, and the combined signal is supplied to a separate output circuit. Likewise, the right sound signal is fed through a low pass filter to be combined with the phase shifted left sound signal, and this combined signal is supplied to a separate output circuit.
The method and apparatus of the present invention reproduces a substantially complete audio signal that utilizes a substantial amount of the sound information contained in an audio signal's code with improved fidelity and integrity to the original sound source.
In addition to providing a substantially complete audio signal at any link of the audio chain from capture, transmission or storage, to the reproduction of the signal, the present invention also provides a way to reconstruct a substantially complete audio signal from the code contained within an existing (incomplete) audio signal. The principles of the present invention may be applied to any known signal type, whether single, mono, or discrete, or multiple signals, such as stereo signals in known audio signal applications, from live transmitted sound, such as by telephone, radio broadcast, or live sound reinforcement, or from the reproduced sound of a recording, such as from a CD or MP3 player, phonograph, DVD or Blu-Ray player.
Additionally, the present invention also may provide improved intelligibility for speech and dialog, particularly advantageous in telecommunications, motion pictures, and other applications, such as military, law enforcement, medical, and other emergency sound applications. Also, improved clarity, higher resolution, better dynamics, truer tone, broader, bigger, wider space, more precise dynamics, more natural spectral balance, and greater detail, are some of the natural byproducts of presenting the whole, open, original sound components through a complete audio signal.