It has been proposed to employ layered coding of audio and video transmitted in video conferencing and telephone conferencing systems. For example, U.S. Pat. No. 7,593,032, issued Sep. 22, 2009 to Civanlar et al., discloses a video conferencing system in which transmitted audio and video are encoded using a layered coding scheme, and in which all, or only some, of the full set of layers of the encoded video or audio may be transmitted.
It has also been proposed to encode audio data so that the encoded audio includes a monophonic layer and directional metadata which can be used (e.g., in a tele-conferencing system) to render the monophonic audio content as an output sound field (e.g., for playback on multiple loudspeakers). See, for example, V. Pulkki, et al., “Directional Audio Coding. Perception-based Reproduction of Spatial Sound,” in International Workshop on the Principles and Applications of Spatial Hearing, Nov. 11-13, 2009, Zao, Miyagi, Japan.
However, until the present invention, it had not been known how to provide a spatially layered, encoded audio signal whose layers enable a variety of benefits (described hereinbelow), including provision of a perceptually continuous tele-conferencing listening experience at endpoints of a tele-conferencing system. Nor had it been known how to provide a spatially layered, encoded audio signal so as to deliver, to endpoints of a conferencing system during a tele-conference, a mix of sound field and monophonic layers which varies over time (e.g., in a continuous manner).
All the figures are schematic and generally only show parts which are necessary in order to elucidate the invention, whereas other parts may be omitted or merely suggested. Unless otherwise indicated, like reference numerals refer to like parts in different figures.