Audio encoding and decoding systems enabling personalized audio experiences typically need to carry all audio object channels and/or audio speaker channels that are potentially needed for a personalized audio experience. In particular, the audio data and metadata are typically structured such that parts which are not required for a given personalized audio program cannot easily be removed from the bitstream containing that personalized audio program.
Typically, all of the data (audio data and metadata) for an audio program is stored jointly within a bitstream. A receiver/decoder needs to parse at least the complete metadata to understand which parts (e.g. which speaker channels and/or which object channels) of the bitstream are required for a personalized audio program. In addition, stripping off parts of the bitstream which are not required for the personalized audio program is typically not possible without significant computational effort. In particular, parts of a bitstream which are not required for a given playback scenario or for a given personalized audio program may nevertheless need to be decoded. It may then be required to mute these parts of the bitstream during playback in order to generate the personalized audio program. Furthermore, it may not be possible to efficiently generate a sub-bitstream from a bitstream, wherein the sub-bitstream only comprises the data required for the personalized audio program.
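By way of illustration only, the decode-then-mute behaviour described above may be sketched as follows. The sketch is a simplified toy model and not an actual codec: the names Element, decode and render_personalized, the element labels and the integer sample values are hypothetical and are assumed here solely for the purpose of illustration.

```python
from dataclasses import dataclass


@dataclass
class Element:
    """Hypothetical stand-in for one speaker-channel bed or object channel."""
    name: str      # illustrative label, e.g. "bed" or "commentary_en"
    samples: list  # stand-in for the encoded audio payload


def decode(element):
    # Stand-in for a real audio decoder; the payload is simply copied.
    return list(element.samples)


def render_personalized(bitstream, wanted):
    """Decode every element, then mute the elements that were not selected."""
    decoded_count = 0
    mix = [0] * len(bitstream[0].samples)
    for element in bitstream:
        pcm = decode(element)  # cost is incurred even for unneeded elements
        decoded_count += 1
        gain = 1 if element.name in wanted else 0  # mute unneeded elements
        mix = [m + gain * s for m, s in zip(mix, pcm)]
    return mix, decoded_count


# Three elements are carried jointly; the personalized program needs two.
bitstream = [
    Element("bed", [1, 2]),
    Element("commentary_en", [3, 1]),
    Element("commentary_de", [5, 5]),
]
mix, decoded_count = render_personalized(bitstream, {"bed", "commentary_en"})
print(mix, decoded_count)
```

In this toy model all three elements are decoded although only two of them contribute to the personalized audio program, which illustrates the unnecessary computational effort referred to above.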
The present document addresses the technical problem of providing a bitstream for an audio program, which enables a decoder of the bitstream to derive a personalized audio program from the bitstream in a resource-efficient manner.