The invention relates to a method and apparatus for visualizing the directional sound activity of a multichannel audio signal.
Audio is an important medium for conveying information, especially sound direction information. Indeed, the human auditory system is more effective than the visual system for surveillance tasks. Thanks to the development of multichannel audio formats, spatialization has become a common feature in all domains of audio: movies, video games, virtual reality, music, etc. For instance, when playing a first-person shooter (FPS) game using a multichannel sound system (5.1 or 7.1 surround sound), it is possible to localize enemies thanks to their sounds.
Typically, such sounds are mixed onto multiple audio channels, wherein each channel is fed to a dedicated loudspeaker. Distribution of a sound to the different channels is adapted to the configuration of the dedicated playback system (positions of the loudspeakers), so as to reproduce the intended directionality of said sound.
Multichannel audio streams thus need to be played back over suitable loudspeaker layouts. For instance, each of the channels of a five-channel formatted audio signal is associated with its corresponding loudspeaker within a five-loudspeaker array. FIG. 1 shows an example of a five-channel loudspeaker layout recommended by the International Telecommunication Union (ITU), with a left loudspeaker L, right loudspeaker R, center loudspeaker C, surround left loudspeaker LS and surround right loudspeaker RS, arranged around a reference listening point O, which is the recommended position of the listener. With this reference listening point O as a center, the relative angular distances between the central directions of the loudspeakers are indicated.
A multichannel audio signal is thus encoded according to an audio file format dedicated to a prescribed spatial configuration where loudspeakers are arranged at prescribed positions relative to a reference listening point. Indeed, each time-dependent input audio signal of the multichannel audio signal is associated with a channel, each channel corresponding to a prescribed position of a loudspeaker.
If multichannel audio is played back over an appropriate sound system, i.e. with the required number of loudspeakers and correct angular distances between them, a listener with normal hearing is able to detect the location of the sound sources that compose the multichannel audio mix. However, should the sound system exhibit inappropriate features, such as too few loudspeakers or incorrect angular distances between them, the directional information of the audio content may not be delivered properly to the listener. This is especially the case when sound is played back over headphones.
As a consequence, there is in this case a loss of information since the multichannel audio signal conveys sound direction information through the respective sound levels of the channels, but such information cannot be delivered to the user. Accordingly, there is a need for conveying to the user the sound direction information encoded in the multichannel audio signal.
Some methods have been proposed for conveying directional information related to sound through the visual modality. However, these methods were often a mere juxtaposition of volume meters, each dedicated to a particular loudspeaker, and thus unable to render precisely the simultaneous predominant direction of the sounds that compose the multichannel audio mix, except in the case of one unique virtual sound source whose direction coincides with a loudspeaker direction. Other methods intended to display sound locations more precisely are so complicated that they prove inadequate, since sound directions cannot be readily derived by a user.
For example, U.S. patent application US 2009/0182564 describes a method wherein sound power level of each channel is displayed, or alternatively wherein position and power level of elementary sound components are displayed.
U.S. Pat. No. 9,232,337 B2 describes a method for visualizing a directional sound activity of a multichannel audio signal that displays a visualization of a directional sound activity of the multichannel audio signal through a graphical representation of directional sound activity level within a sub-division of space. For a channel and for a frequency sub-band, a sound activity vector is formed by associating the sound activity level corresponding to the frequency-domain signal of said channel and said sub-band to the unit vector corresponding to the spatial information associated with said channel. In an embodiment of this patent, the energy vector sum representative of the perceived directional energy is directly calculated using Gerzon's energy vectors, as a mere summation of the sound activity vectors related to the channels for said frequency sub-band. This directional sound activity vector represents the predominant sound direction that would be perceived by a listener according to the recommended loudspeaker layout for sounds within that particular frequency sub-band.
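The vector summation described above can be illustrated with a minimal Python sketch for a single frequency sub-band. It is not taken from the cited patent. The loudspeaker azimuths (0 degrees at the front, positive angles to the left, surrounds at 110 degrees), the dictionary LAYOUT_DEG and the function names are assumptions chosen for illustration, and the sound activity level is assumed to be a non-negative energy value per channel.

```python
import math

# Assumed azimuths in degrees for a five-channel layout (hypothetical values:
# 0 = front, positive = left of the listener). Not taken from the source.
LAYOUT_DEG = {"L": 30.0, "R": -30.0, "C": 0.0, "LS": 110.0, "RS": -110.0}


def directional_activity_vector(levels, layout=LAYOUT_DEG):
    """Sum the per-channel sound activity vectors of one frequency sub-band.

    levels maps a channel name to a non-negative sound activity level.
    Each channel contributes its level times the unit vector pointing
    toward its loudspeaker. Returns the (x, y) components of the sum.
    """
    x = 0.0
    y = 0.0
    for channel, level in levels.items():
        azimuth = math.radians(layout[channel])
        x += level * math.cos(azimuth)
        y += level * math.sin(azimuth)
    return x, y


def direction_deg(vector):
    """Return the azimuth of a 2-D vector in degrees (0 = front)."""
    return math.degrees(math.atan2(vector[1], vector[0]))


# Equal activity in L and C gives a direction halfway between the two.
example = directional_activity_vector({"L": 1.0, "C": 1.0})
print(round(direction_deg(example), 3))
```

In this sketch, a sub-band active only in one channel yields that loudspeaker's direction, while activity shared between channels yields an intermediate direction weighted by the levels.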
However, although this method visually renders the main sound direction, it may not always achieve optimal results for a user. Indeed, this method does not exploit diffuse sounds, but focuses on identifying and displaying the main sound directions, regardless of the nature of the sound (directional or diffuse). As a result, when the sound is very diffuse, the method may not be able to correctly extract a useful main sound direction from the noisy environment.