The acoustic field generated by a conventional loudspeaker is not directional, especially for low-frequency signals. Directional radiation at medium and low frequencies is possible only with an array of loudspeakers driven by complex control mechanisms, and the resulting system is costly.
However, it is well known that a highly directional ultrasonic beam can be generated relatively easily. It is further known to modulate an ultrasonic wave such that it contains two ultrasonic frequency components differing by an audio frequency, and transmit the modulated ultrasonic wave into air as a narrow beam. Nonlinear effects of the air cause the two component signals to interact and a new signal with a frequency corresponding to the difference of the two frequencies is generated. Thus, the nonlinear effects of air will automatically demodulate the ultrasonic signal and reproduce the audio signal in a narrow region of air [1]-[5]. This highly directional audio space is called an audio beam.
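The generation of a difference-frequency component can be illustrated numerically. The following is a minimal sketch that assumes a toy model in which the nonlinearity of the air is replaced by a simple squaring of the signal; this is an illustrative assumption, not the actual acoustic propagation model. The two frequencies (41 kHz and 40 kHz), the sample rate and the helper `tone_amplitude` are hypothetical choices made for the example.

```python
import math

SAMPLE_RATE_HZ = 400_000   # assumed sample rate, well above both primaries
F1_HZ = 41_000             # assumed first ultrasonic component
F2_HZ = 40_000             # assumed second ultrasonic component
NUM_SAMPLES = 4_000        # 10 ms, a whole number of periods of every tone involved


def tone_amplitude(samples, freq_hz):
    """Amplitude of the sinusoidal component of `samples` at freq_hz (single-bin DFT)."""
    re = im = 0.0
    for i, s in enumerate(samples):
        phase = 2.0 * math.pi * freq_hz * i / SAMPLE_RATE_HZ
        re += s * math.cos(phase)
        im += s * math.sin(phase)
    return 2.0 * math.hypot(re, im) / len(samples)


# Two primary tones, as radiated by the transducer.
primary = [
    math.sin(2.0 * math.pi * F1_HZ * i / SAMPLE_RATE_HZ)
    + math.sin(2.0 * math.pi * F2_HZ * i / SAMPLE_RATE_HZ)
    for i in range(NUM_SAMPLES)
]

# Toy quadratic nonlinearity standing in for the air (assumption).
squared = [p * p for p in primary]

difference_hz = F1_HZ - F2_HZ
linear_level = tone_amplitude(primary, difference_hz)
nonlinear_level = tone_amplitude(squared, difference_hz)

print(f"{difference_hz} Hz component before squaring: {linear_level:.6f}")
print(f"{difference_hz} Hz component after squaring:  {nonlinear_level:.6f}")
```

The linear sum of the two tones contains no component at the difference frequency. The squared signal does, because the cross term 2 sin(a) sin(b) equals cos(a − b) − cos(a + b).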
The audio beam is a very promising technology with a very wide range of possible applications. However, because the demodulating process is nonlinear, the reproduced audio signal is highly distorted unless appropriate pre-processing is applied. Several kinds of pre-processing have been suggested [4], [6], [8] and [9].
The overall structure of these systems is shown in FIG. 1. An audio signal is input from the left of the figure to a pre-processing unit 1. The output of the pre-processing unit 1 is transmitted to a modulation and power amplification unit 2, as is an ultrasonic wave generated by an oscillator 3. The modulation and power amplification unit 2 uses the output of the pre-processing unit 1 to modulate the ultrasonic wave, and the resultant ultrasonic wave is transmitted to an ultrasonic transducer 4, which generates a directional ultrasonic beam 5, which is demodulated by air to regenerate the audio sound.
Such a system typically suffers from two forms of distortion. Firstly, the frequency response is not uniform. In particular there is a −12 dB/octave decrease in sound pressure level (SPL) toward the low frequency end. Secondly, the demodulating process will generate many (distortion) frequency components that are not included in the original audio signal. For simplicity, we refer to these extra signals in this document as total harmonic distortion (THD) (although this is not the exact definition of THD used in acoustics). The pre-processing methods so far suggested attempt to overcome mainly the second problem. However, they are neither efficient nor easy to implement in practice.
To explain why this is so, we turn to a mathematical discussion of the situation. Based on the nonlinear theory of acoustics, it is shown in [5] that if two collimated primary waves with frequencies f1 and f2 respectively are transmitted from a piston radiator, the non-linearity of the air produces a difference frequency signal (secondary wave) given by:
$$q_-(r,z) = -\frac{j\,p_{0a}\,p_{0b}\,\beta\,k_-^2\,a^2}{4\,\rho_0\,c_0^2\,\alpha_T}\,\frac{e^{-\alpha_- z}}{z}\,D_W(\theta)\,D_A(\theta)\,\exp\!\left(-\tfrac{1}{2}\,j\,k_-\,z\,\tan^2\theta\right) \tag{1}$$

where $q_-(r, z)$ is the complex-valued amplitude of the difference frequency signal, $z$ is the coordinate along the axis of the beam, $r$ is the transverse coordinate, $p_{0a}$ and $p_{0b}$ are the initial SPLs of the two primary frequency waves of a piston radiator with radius $a$, $k_-$ is the wave number of the difference frequency $f_1 - f_2$ (assuming $f_1 > f_2$), $\beta$ is the coefficient of nonlinearity, $\rho_0$ is the ambient density of the medium, $c_0$ is the small-signal wave propagation speed, and

$$D_W(\theta) = \frac{1}{1 + j\,\dfrac{k_-}{2\,\alpha_T}\,\tan^2\theta}$$
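To give a feel for the directivity factor $D_W(\theta)$, the following minimal sketch evaluates its magnitude at a few angles and computes the angle at which $|D_W|^2$ falls to one half. The numerical values (speed of sound, difference frequency, and in particular the value of $\alpha_T$) are hypothetical and chosen only for illustration; the function names are likewise not taken from the text.

```python
import math


def westervelt_directivity(theta_rad, k_minus, alpha_t):
    """D_W(theta) = 1 / (1 + j * (k_- / (2 * alpha_T)) * tan^2(theta))."""
    return 1.0 / (1.0 + 1j * (k_minus / (2.0 * alpha_t)) * math.tan(theta_rad) ** 2)


def half_power_angle(k_minus, alpha_t):
    """Angle at which |D_W|^2 falls to 1/2, i.e. tan^2(theta) = 2 * alpha_T / k_-."""
    return math.atan(math.sqrt(2.0 * alpha_t / k_minus))


# Illustrative values only (assumptions, not taken from the text).
SOUND_SPEED_M_S = 343.0      # c0 for air at room temperature
DIFFERENCE_FREQ_HZ = 1000.0  # f1 - f2
ALPHA_T_NP_PER_M = 0.5       # hypothetical value of alpha_T

k_minus = 2.0 * math.pi * DIFFERENCE_FREQ_HZ / SOUND_SPEED_M_S
theta_3db = half_power_angle(k_minus, ALPHA_T_NP_PER_M)

for degrees in (0, 5, 10, 20, 40):
    magnitude = abs(
        westervelt_directivity(math.radians(degrees), k_minus, ALPHA_T_NP_PER_M)
    )
    print(f"theta = {degrees:2d} deg  |D_W| = {magnitude:.3f}")
print(f"half-power angle = {math.degrees(theta_3db):.1f} deg")
```

Since $|D_W|^2 = 1 / (1 + (k_-/2\alpha_T)^2 \tan^4\theta)$, a larger ratio $k_-/\alpha_T$ gives a narrower beam in this expression.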
It has been further shown by Berktay that, under certain simplifying assumptions [5], if a DSB AM primary wave p1(t)

$$p_1(t) = P_0\,E(t)\,\sin(\omega_c t) \tag{2}$$

is transmitted, then after demodulation by the air a secondary wave p2(t) is generated in the far field of the transducer, on the z axis of the beam:

$$p_2(t) = \frac{\beta\,P_0^2\,A}{16\,\pi\,\rho_0\,c_0^4\,z\,\alpha}\,\frac{\partial^2}{\partial\tau^2}\,E^2(\tau) \tag{3}$$

where $P_0$ is the SPL of the primary wave, $E(t)$ is the modulation envelope, $\omega_c$ is the angular frequency of the carrier wave, $A$ is the transducer's cross-sectional area, $\alpha$ is the absorption coefficient of the medium (at $\omega_c$), and $\tau = t - z/c_0$ is the lag time. The relationship between the modulation envelope $E(t)$ and the audio signal $a(t)$ is:

$$E(t) = 1 + m\,a(t) \tag{4}$$

where $m$ is the AM index. Eqn. (3) shows that the demodulated signal is not linearly proportional to the modulation envelope. To reproduce the audio signal with high fidelity, the audio signal $a(t)$ must be equalized to compensate for the squaring of $E(t)$. In other words, $a(t)$ is pre-processed before AM so that the secondary wave becomes directly proportional to $a(t)$.
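The consequences of Eqn. (3) for a single audio tone can be checked numerically. The sketch below assumes $a(t) = \sin(\omega t)$, omits the constant factor in front of the derivative in Eqn. (3), and approximates the second derivative by a central finite difference over one sampled period. The function name `demodulated_harmonics` and the modulation index of 0.5 are hypothetical.

```python
import math


def demodulated_harmonics(m, tone_hz, samples_per_period=2000):
    """Fundamental and second-harmonic amplitudes of d^2/dt^2 [E^2(t)]
    for E(t) = 1 + m*sin(2*pi*tone_hz*t). The constant factor of Eqn. (3) is omitted."""
    n = samples_per_period
    dt = 1.0 / (tone_hz * n)
    e_squared = [(1.0 + m * math.sin(2.0 * math.pi * i / n)) ** 2 for i in range(n)]

    # Central second difference; index wrap-around is valid because exactly one
    # period is sampled (e_squared[-1] is the sample before e_squared[0]).
    second_derivative = [
        (e_squared[(i + 1) % n] - 2.0 * e_squared[i] + e_squared[i - 1]) / dt ** 2
        for i in range(n)
    ]

    def amplitude(harmonic):
        re = sum(v * math.cos(2.0 * math.pi * harmonic * i / n)
                 for i, v in enumerate(second_derivative))
        im = sum(v * math.sin(2.0 * math.pi * harmonic * i / n)
                 for i, v in enumerate(second_derivative))
        return 2.0 * math.hypot(re, im) / n

    return amplitude(1), amplitude(2)


AM_INDEX = 0.5  # hypothetical modulation index

fund_1k, second_1k = demodulated_harmonics(AM_INDEX, 1000.0)
fund_2k, _ = demodulated_harmonics(AM_INDEX, 2000.0)

harmonic_ratio = second_1k / fund_1k
octave_gain_db = 20.0 * math.log10(fund_2k / fund_1k)

print(f"second harmonic / fundamental: {harmonic_ratio:.4f}")
print(f"level change per octave:       {octave_gain_db:.2f} dB")
```

Analytically, $\partial^2 E^2/\partial t^2 = -2m\omega^2 \sin(\omega t) + 2m^2\omega^2 \cos(2\omega t)$ for this envelope. The second harmonic therefore has $m$ times the amplitude of the fundamental, and the fundamental scales with $\omega^2$, which corresponds to about 12 dB per octave and is consistent with the low-frequency roll-off noted earlier.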
The required compensation can be achieved by generating a modified version $\tilde{E}(t)$ of $E(t)$ as [4], [6]:

$$\tilde{E}(t) = \left[\,1 + m \iint a(t)\,dt^2\,\right]^{1/2} \tag{5}$$
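As a sanity check of Eqn. (5) under the idealized model of Eqn. (3), the following sketch compares the plain envelope of Eqn. (4) with the square-rooted envelope for a single tone. It assumes normalized time (tone angular frequency of 1 rad per unit time), so that the double integral of $\sin t$ is $-\sin t$ with the integration constants set to zero, and it drops the constant factor of Eqn. (3). All names and the modulation index of 0.8 are hypothetical.

```python
import math

N = 2000                 # samples over one period of the tone
DT = 2.0 * math.pi / N   # normalized time step (tone angular frequency = 1)
AM_INDEX = 0.8           # hypothetical modulation index, must stay below 1 here


def harmonic_amplitude(samples, harmonic):
    """Amplitude of the given harmonic of a signal sampled over one period."""
    re = sum(v * math.cos(harmonic * i * DT) for i, v in enumerate(samples))
    im = sum(v * math.sin(harmonic * i * DT) for i, v in enumerate(samples))
    return 2.0 * math.hypot(re, im) / len(samples)


def demodulate(envelope):
    """Second time derivative of envelope^2 (Eqn. (3) without its constant factor)."""
    sq = [e * e for e in envelope]
    return [(sq[(i + 1) % N] - 2.0 * sq[i] + sq[i - 1]) / DT ** 2 for i in range(N)]


audio = [math.sin(i * DT) for i in range(N)]
double_integral = [-a for a in audio]   # since d^2/dt^2 (-sin t) = sin t

plain_envelope = [1.0 + AM_INDEX * a for a in audio]                      # Eqn. (4)
sqrt_envelope = [math.sqrt(1.0 + AM_INDEX * g) for g in double_integral]  # Eqn. (5)

plain_out = demodulate(plain_envelope)
sqrt_out = demodulate(sqrt_envelope)

for name, out in (("plain", plain_out), ("square-root", sqrt_out)):
    print(f"{name:12s} fundamental = {harmonic_amplitude(out, 1):.4f}  "
          f"second harmonic = {harmonic_amplitude(out, 2):.6f}")
```

In this idealized setting the square-rooted envelope leaves only the fundamental, with amplitude $m$, because $\tilde{E}^2$ is linear in the double integral of $a(t)$. The check says nothing about bandwidth limits or transducer response, which are outside the model.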
The seemingly simple pre-processing of Eqn. (5) is very difficult to implement in practice. The main difficulty arises from the square-root operation. Because it is a nonlinear operation, it greatly increases the signal bandwidth, which places very strict requirements on the bandwidth of the circuit and of the ultrasonic transducer. The transducer is the harder part: it is very difficult to make an ultrasonic transducer that is both wideband and highly power-efficient. The double integration is also difficult to implement because of its −12 dB/octave amplitude weighting and the large frequency span of the audio signal (20 Hz to 20,000 Hz, about 10 octaves). In addition, an analog integrator saturates easily and is difficult to debug in practice.
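The bandwidth expansion caused by the square root can be estimated with a short numerical sketch. It assumes a single-tone envelope of the form $\sqrt{1 + m \sin\theta}$ (for a single tone the double integration only rescales and phase-shifts the tone, so it is left out), a hypothetical modulation index of 0.9, a 5 kHz tone, and an arbitrary −60 dB threshold for deciding which envelope harmonics count as significant.

```python
import math

N = 4096  # samples over one period of the audio tone


def harmonic_amplitudes(samples, max_harmonic):
    """Amplitudes of harmonics 1..max_harmonic of a signal sampled over one period."""
    amps = []
    for k in range(1, max_harmonic + 1):
        re = sum(v * math.cos(2.0 * math.pi * k * i / N) for i, v in enumerate(samples))
        im = sum(v * math.sin(2.0 * math.pi * k * i / N) for i, v in enumerate(samples))
        amps.append(2.0 * math.hypot(re, im) / N)
    return amps


def highest_significant_harmonic(samples, floor_db=-60.0, max_harmonic=30):
    """Highest harmonic whose level is within floor_db of the fundamental."""
    amps = harmonic_amplitudes(samples, max_harmonic)
    threshold = amps[0] * 10.0 ** (floor_db / 20.0)
    return max(k for k, a in enumerate(amps, start=1) if a >= threshold)


AM_INDEX = 0.9     # hypothetical modulation index
TONE_HZ = 5000.0   # hypothetical audio tone

tone = [math.sin(2.0 * math.pi * i / N) for i in range(N)]
plain_envelope = [1.0 + AM_INDEX * s for s in tone]
sqrt_envelope = [math.sqrt(1.0 + AM_INDEX * s) for s in tone]

plain_top = highest_significant_harmonic(plain_envelope)
sqrt_top = highest_significant_harmonic(sqrt_envelope)

print(f"plain envelope:       harmonics up to {plain_top} "
      f"(sidebands out to {plain_top * TONE_HZ:.0f} Hz from the carrier)")
print(f"square-root envelope: harmonics up to {sqrt_top} "
      f"(sidebands out to {sqrt_top * TONE_HZ:.0f} Hz from the carrier)")
```

With DSB AM, each envelope harmonic appears as a pair of sidebands at that multiple of the tone frequency on either side of the carrier, so the highest significant harmonic sets how far from the carrier the circuit and transducer must respond.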
In summary, the simple square-root pre-processing used to compensate for the distortion will not work well in practice, for the following reasons: 1) a practical transducer has a limited bandwidth, which is usually not enough to transmit all the frequency components required by the square-root operation, especially at high audio frequencies (e.g. f>5 kHz); 2) the frequency response of a practical transducer is not uniform even within its pass band, so the harmonic components of a single-tone signal are generated with amplitudes and phases different from those required by the square-root operation; 3) a wideband transducer generally has lower efficiency than a narrowband one, since it does not work near its resonant frequency; 4) the Berktay formula (3) is only an approximation that is valid under far-field and on-axis conditions, while some of the working areas of practical interest are in the near field and off-axis; and 5) in practice, if the modulating part of the signal is small, the square-rooted waveform $\tilde{E}(t)$ is very similar to the waveform $E(t)$ without the square-root operation, so the effect of the square-root operation is not as significant as it might seem.
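Point 5) can be made concrete with a small calculation. For a small modulating term $x$, $\sqrt{1 + x} \approx 1 + x/2$, which is a scaled copy of the waveform without the square root. The sketch below measures the largest gap between the two over $|x| \le$ depth; the function name and the three depths are hypothetical choices for illustration.

```python
import math


def max_deviation_from_linear(depth, steps=1000):
    """Largest gap between sqrt(1 + x) and 1 + x/2 for x in [-depth, +depth]."""
    worst = 0.0
    for i in range(steps + 1):
        x = -depth + 2.0 * depth * i / steps
        worst = max(worst, abs(math.sqrt(1.0 + x) - (1.0 + 0.5 * x)))
    return worst


for depth in (0.1, 0.5, 0.9):
    print(f"depth = {depth:.1f}  max deviation = {max_deviation_from_linear(depth):.4f}")
```

The gap grows roughly with the square of the depth for small depths, so the square root changes the waveform noticeably only when the modulating part is large.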
To reduce the THD of multiple-frequency signals while at the same time avoiding the wideband requirement of the square-root pre-processing method, [8] and [9] proposed an iterative process that approximates the square-root envelope using SSB modulation. This is still based on the idea that a square-rooted envelope will generate lower THD. Whereas true square-root DSB AM requires a very large bandwidth, the SSB AM based approximation avoids this requirement. However, since real feedback of the demodulated signal is not available, a model is used to simulate the demodulating process in the air, and the suggested model is still based on Berktay's equation (3). As noted, (3) is valid only under certain conditions and cannot be used as a general description of the secondary wave field, so the real performance of the method is doubtful. Also, the iterative process is complex and requires a large amount of computation, which makes it unsuitable for real-time implementation.
Both of the above methods are somewhat similar to active noise cancellation in a large open space. Both add extra frequency components to the original signal in advance. If the phase and amplitude of these extra components can be accurately controlled, they will cancel the other extra components generated later during the demodulating process. This requires a good match in both amplitude and phase among these components. In practice, due to the non-uniform response of the circuit and transducer, such a match is very difficult to achieve over a wide frequency range.