The present invention relates to audio technology and, in particular, to the field of sound focusing for the purpose of generating sound focusing locations at a specified position within a sound reproduction zone, such as the position of a human head or of human ears.
Across the whole field of acoustics, the term “sound focusing” is used in the context of very different applications. Underwater acoustic communication, ultrasonic medical diagnostics, non-invasive lithotripsy and non-destructive material testing are only a handful of possible use cases.
From the point of view of audio reproduction, focusing is an attractive method for generating clearly perceivable effects. On the one hand, sound focusing provides possibilities for creating virtual acoustic reality, for example for holophonic audio reproduction methods. On the other hand, there is high potential for facilitating spatially selective audio reproduction, which opens the door to individual or personal audio, which is a focus of the present invention.
Personal sound zones can be used in many applications. One application is, for example, a user sitting in front of her or his television set, where sound zones in which sound energy is focused are generated at the position where the head of the user is expected to be when the user sits in front of the TV. This means that the sound energy is reduced at all other places, so that other persons in the room are not disturbed at all by the sound generated by the speaker setup, or are disturbed only to a lesser degree than with a straightforward setup in which no sound focusing to a specified sound focusing location is performed.
Other useful applications are public information facilities, in which a sound zone can be generated in front of a public announcement facility so that only persons standing in front of the facility, or at the specified position, can understand the announced information, while other persons who are not positioned in the sound focusing zones cannot.
Other applications are privacy applications without headphones. In a very good sound focusing application, a user can receive his or her personal information by straightforward loudspeakers, but only the user will understand the information and other persons in the room will not understand the information, since they are not in the sound focusing zones.
Further applications are in the field of entertainment. Specifically, users are interested in watching a movie on a small display, such as a laptop display or even the display of a mobile phone or mobile player, with the device placed in front of the user, for example on a table. Sound focusing concentrates the sound where the user is located, which means that satisfying volumes can be generated around the user's ears even with smaller speakers. Furthermore, even when the user is using a mobile phone in the usual way, sound focusing directed to the expected position of the user's ear allows the use of smaller speakers or of less power for driving the speakers. Altogether, battery power can be saved because the sound energy is not radiated into a large zone but is concentrated in a specific sound focusing location within a larger sound reproduction zone. Naturally, more loudspeakers consume more power, but concentrating the power at a focusing zone requires less battery power than non-focused radiation using the same number of speakers.
Sound focusing even allows different information to be placed at different locations within a sound reproduction zone. For example, the left channel of a stereo signal can be concentrated around the left ear of a person and the right channel of the stereo signal can be concentrated around the right ear of the person.
Furthermore, completely different information can be reproduced at spatially different locations within a sound reproduction zone using the same loudspeaker setup, with only a small amount of crosstalk, or even none, between these sounds.
Several approaches to sound focusing exist. One approach is the numerical calculation of an inverse filter using an ME-LMS optimization (ME-LMS = multiple error least mean square). The ME-LMS algorithm is used as a method for inverting a matrix occurring in the calculation. An arrangement consisting of N transmitters (loudspeakers) and M receivers (microphones) can be represented mathematically by a system of linear equations of size M×N. When the positions of the speakers and microphones are known, the unique relation between the input and the output can be found by calculating a solution of the wave equation in a suitable coordinate system, such as the Cartesian coordinate system. By specifying a desired solution, such as the sound pressure at (virtual) microphone positions, it is possible to calculate the input signals for the loudspeakers, which are derived from an original audio signal by respective filters for the loudspeakers.
The solution of such a multi-dimensional linear system of equations can be calculated using optimization methods. The multiple error least mean square method is a useful method which, however, converges poorly, and its convergence behavior depends heavily on the starting conditions or starting values for the filters.
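The M×N formulation above can be illustrated with a small numerical sketch. It is not the ME-LMS procedure itself: for brevity it swaps in a direct, regularized least-squares solve of the normal equations at a single frequency. The free-field point-source model, the speed of sound, the geometry, the regularization value and all function names are assumptions made only for this illustration.

```python
import cmath
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air


def transfer_matrix(speakers, mics, freq_hz):
    """M x N matrix of assumed free-field point-source transfer functions."""
    k = 2.0 * math.pi * freq_hz / SPEED_OF_SOUND
    rows = []
    for mx, my in mics:
        row = []
        for sx, sy in speakers:
            r = math.hypot(mx - sx, my - sy)
            row.append(cmath.exp(-1j * k * r) / (4.0 * math.pi * r))
        rows.append(row)
    return rows


def solve(a, b):
    """Solve the square system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    aug = [list(a[i]) + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0j] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def least_squares_weights(g, p_desired, reg=1e-8):
    """Minimise |g q - p|^2 + reg |q|^2 via (G^H G + reg I) q = G^H p."""
    m, n = len(g), len(g[0])
    gh_g = [[sum(g[i][a].conjugate() * g[i][b] for i in range(m))
             + (reg if a == b else 0.0)
             for b in range(n)] for a in range(n)]
    gh_p = [sum(g[i][a].conjugate() * p_desired[i] for i in range(m))
            for a in range(n)]
    return solve(gh_g, gh_p)


def pressures(g, q):
    """Sound pressure at each microphone for loudspeaker weights q."""
    return [sum(g_mn * q_n for g_mn, q_n in zip(row, q)) for row in g]


# Hypothetical setup: N = 4 loudspeakers on a line, M = 2 virtual microphones.
speakers = [(-0.75, 0.0), (-0.25, 0.0), (0.25, 0.0), (0.75, 0.0)]
control_points = [(0.0, 1.0), (0.5, 1.0)]
g = transfer_matrix(speakers, control_points, freq_hz=1000.0)
# Desired solution: unit pressure at the first point, silence at the second.
q = least_squares_weights(g, p_desired=[1.0, 0.0])
p = pressures(g, q)
print([round(abs(v), 3) for v in p])
```

In a broadband system the same solve would be repeated per frequency bin to obtain one filter per loudspeaker. An iterative method such as ME-LMS would replace the direct solve, at the cost of the convergence behavior discussed in this section.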
The time-reversal process is based on the time reciprocity of acoustic sound propagation in a certain medium. In such a situation, the sound propagation from a transmitter to a receiver is reversible. If sound is transmitted from a certain point and this sound is recorded at the border of a bounding volume, sound sources located on that border can reproduce the recorded signal in a time-reversed manner. This results in sound energy being focused at the original transmitter position.
The time-reversal mirror (TRM) generates sound focusing at a single point. The target is a focus point which is as small as possible and which, in a medical application, is located directly on, for example, a kidney stone, so that the kidney stone can be broken by applying a large amount of sound energy to it.
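The focusing effect of time reversal can be sketched in discrete time: when each source emits the time-reversed impulse response of its own path to the focus point, the contributions add up coherently at that point and only there. The impulse responses below are hypothetical values chosen for illustration, not measured data, and the function names are assumptions.

```python
def convolve(a, b):
    """Full linear convolution of two sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def time_reversal_filters(impulse_responses):
    """One filter per source: the time-reversed impulse response to the focus."""
    return [list(reversed(h)) for h in impulse_responses]


def received(filters, impulse_responses):
    """Signal at a point: sum over sources of filter convolved with path."""
    length = len(filters[0]) + len(impulse_responses[0]) - 1
    total = [0.0] * length
    for f, h in zip(filters, impulse_responses):
        for n, v in enumerate(convolve(f, h)):
            total[n] += v
    return total


# Hypothetical impulse responses (direct path plus one reflection) from three
# sources to the focus point and to a second, off-focus point.
h_focus = [
    [0.0, 1.0, 0.0, 0.5, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, 0.0, -0.4],
    [1.0, 0.0, 0.0, 0.3, 0.0, 0.0],
]
h_off = [
    [0.0, 0.0, 1.0, 0.0, 0.5, 0.0],
    [1.0, 0.0, 0.0, -0.4, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0, 0.3, 0.0],
]

filters = time_reversal_filters(h_focus)
at_focus = received(filters, h_focus)
off_focus = received(filters, h_off)

peak = max(at_focus)
peak_index = at_focus.index(peak)
off_peak = max(abs(v) for v in off_focus)
print(peak_index, peak, off_peak)
```

At the focus point each path contributes its autocorrelation, so all maxima fall on the same sample and sum to the total energy of the impulse responses. Away from the focus the contributions are cross-correlations whose maxima do not line up.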
Other approaches rely on model-based control of a loudspeaker array. One model-based approach is beam forming. In particular, beam forming means the intentional change of the directional characteristic of a transmitter or receiver group. The coefficients or filters for these groups can be calculated based on a model. Directed radiation from a loudspeaker array can be obtained by suitably manipulating the radiated signal individually for each loudspeaker. By using loudspeaker-specific digital coefficients, which may include a signal delay and/or a signal scaling, the directivity is controllable within certain limits. A focus zone can be created when the signal propagation delay between the loudspeakers and the intended focus zone is inverted and this inverted signal delay is used as the loudspeaker-specific delay of the audio signal for each loudspeaker channel. This distribution of delay coefficients and the choice of the loudspeaker-specific scaling values or, stated generally, the choice of the loudspeaker-specific transfer functions influences the focus zone.
Other model-based methods are wave field synthesis and binaural sky. Here, “model-based” refers to the way the filters or coefficients for wave field synthesis or binaural sky are generated. By performing a loudspeaker-specific signal manipulation, the radiated signal is manipulated in such a way that the superposition of the wave field contributions of all loudspeakers results in an approximated image of the sound field to be synthesized. This wave field allows a positionally correct detection of a synthesized sound source within certain limits. In the case of so-called focused sources, one will perceive a significant increase in signal level close to the position of a focused source compared to positions farther away from the focus location. Model-based wave field synthesis applications are based on an object-oriented, controlled synthesis of the wave field using digital filtering, including the calculation of delays and scalings for individual loudspeakers.
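The delay rule described above can be sketched in a few lines. The sketch assumes free-field propagation at a constant speed of sound and ignores scaling. The inverted propagation delay is offset by the longest propagation delay so that every loudspeaker delay is non-negative and therefore causal. The geometry and the function name are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed value for air


def focusing_delays(speakers, focus):
    """Per-loudspeaker delays (seconds) that align all arrivals at the focus.

    The propagation delay r / c of each loudspeaker is negated and shifted by
    the largest propagation delay, so the farthest loudspeaker gets zero delay.
    """
    distances = [math.hypot(sx - focus[0], sy - focus[1]) for sx, sy in speakers]
    r_max = max(distances)
    return [(r_max - r) / SPEED_OF_SOUND for r in distances]


# Hypothetical three-element line array and focus position (metres).
speakers = [(-0.5, 0.0), (0.0, 0.0), (0.5, 0.0)]
focus = (0.2, 1.0)

delays = focusing_delays(speakers, focus)
# Arrival time at the focus = loudspeaker delay + propagation delay.
arrivals = [
    d + math.hypot(sx - focus[0], sy - focus[1]) / SPEED_OF_SOUND
    for d, (sx, sy) in zip(delays, speakers)
]
print(delays, arrivals)
```

With these delays every wavefront reaches the focus position at the same instant, which is the condition for constructive superposition there. A per-loudspeaker scaling could be added on top to shape the focus zone.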
Binaural sky uses focused sources which are placed in front of the ears of the listener based on a system detecting the position of the listener. Beam forming methods and focused wave field synthesis sources can be performed using certain loudspeaker setups, whereby a plurality of focus zones can be generated so that signal or multi-channel rendering is obtainable. Model-based methods are advantageous with respect to calculation resources, and these methods are not necessarily based on measurements.
The publication “Time-reversal of ultrasonic fields, Part I: basic principles”, M. Fink, IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, Vol. 39, No. 5, Sep. 1992, discusses the time-reversal focusing technique in detail.
The technical publication “The binaural sky: A virtual headphone for binaural room synthesis”, D. Menzel et al., IRT Munich Report, 2005, available under http://www.tonmeister.de/symposium/2005/np pdf/RQ4.pdf, discloses a system for the reproduction of virtual acoustics in theory and practice. The system combines wave field synthesis, binaural techniques and transaural audio. A stable localization of virtual sources is achieved for listeners who are allowed to turn around and rotate their heads. A circular array is located above the head of the listener, and FIR filter coefficients for filters connected to the loudspeakers are calculated based on azimuth information delivered by a head-tracker.
WO 2007/110087 A1 discloses an arrangement for the reproduction of binaural signals (artificial-head signals) by a plurality of loudspeakers. The same crosstalk canceling filter for filtering crosstalk components in the reproduced binaural signals can be used for all head directions. The loudspeaker reproduction is effected by virtual transauralization sources using sound-field synthesis with the aid of a loudspeaker array. The position of the virtual transauralization sources can be altered dynamically, on the basis of the ascertained rotation of the listener's head, such that the relative position of the listener's ears and the transauralization source is constant for any head rotation.
It has been found that the TRM method provides useful results for filter coefficients, so that a significant sound focusing effect can be obtained at predetermined locations. However, it has also been found that the TRM method, while effectively applied in medical applications such as lithotripsy, has significant drawbacks in audio applications, where an audio signal comprising music or speech has to be focused. The quality of the signal perceived in the focusing zones and at locations outside the focusing zones is degraded by significant and annoying pre-echoes caused by the filter characteristics obtained by the TRM method, since, due to the time-reversal process, these filter characteristics have a long first portion of the impulse response followed by a “main portion” of the filter impulse response.
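The origin of the pre-echoes can be made concrete with a toy example. A decaying impulse response has its largest tap first. After time reversal the largest tap comes last, so a substantial share of the filter energy precedes the main portion. The exponentially decaying response and the helper names below are assumptions chosen only for illustration.

```python
def main_tap_index(h):
    """Index of the tap with the largest magnitude (the "main portion")."""
    return max(range(len(h)), key=lambda n: abs(h[n]))


def pre_energy_fraction(h):
    """Fraction of the filter energy located before the main tap."""
    main = main_tap_index(h)
    total = sum(v * v for v in h)
    return sum(v * v for v in h[:main]) / total


# Hypothetical decaying impulse response and its time-reversed counterpart.
room_response = [0.7 ** n for n in range(8)]
trm_filter = list(reversed(room_response))

print(main_tap_index(room_response), pre_energy_fraction(room_response))
print(main_tap_index(trm_filter), pre_energy_fraction(trm_filter))
```

For the original response no energy precedes the main tap, whereas for the time-reversed filter the whole decay tail is played back before it. In this sketch that leading energy stands in for what a listener perceives as a pre-echo.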