In the current definition of HOA layered coding, the side information for the HOA decoding tools Spatial Signal Prediction, Sub-band Directional Signal Synthesis and Parametric Ambience Replication (PAR) Decoder is created to enhance one specific HOA representation. Namely, the provided data properly extends only the HOA representation of the highest layer (e.g., the highest enhancement layer). For the lower layers, including the base layer, these tools do not properly enhance the partially reconstructed HOA representation.
The tools Sub-band Directional Signal Synthesis and Parametric Ambience Replication Decoder are specifically designed for low data rates, where only a few transport signals are available. However, in HOA layered coding, proper enhancement of (partially) reconstructed HOA representations is not possible, especially for the low-bitrate layers such as the base layer. This is clearly undesirable from the point of view of sound quality at low bitrates.
Additionally, it has been found that the conventional way of treating the encoded V-vector elements for the vector-based signals does not result in appropriate decoding if a CodedVVecLength equal to one is signaled in the HOADecoderConfig( ) (i.e., if the vector coding mode is active). In this vector coding mode, the V-vector elements are not transmitted for HOA coefficient indices that are included in the set ContAddHoaCoeff. This set includes all HOA coefficient indices AmbCoeffIdx[i] that have an AmbCoeffTransitionState equal to zero. Conventionally, there is no need to also add a weighted V-vector signal for these indices, because the original HOA coefficient sequences for these indices are explicitly sent (signaled). Therefore, the V-vector element is set to zero for these indices.
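The conventional treatment described above can be illustrated with a minimal sketch. The function name expand_v_vector, its parameters, and the 1-based index convention are assumptions made for illustration only; the sketch does not reproduce the normative decoder and ignores all other aspects of V-vector decoding.

```python
def expand_v_vector(transmitted, num_coeffs, cont_add_hoa_coeff):
    """Rebuild a full-length V-vector from the transmitted elements.

    Hypothetical helper. Indices are 1-based (an assumption of this sketch).
    Elements for indices in cont_add_hoa_coeff are not transmitted and are
    therefore set to zero.
    """
    skipped = set(cont_add_hoa_coeff)
    # Indices for which a V-vector element is actually present in the frame.
    coded_indices = [i for i in range(1, num_coeffs + 1) if i not in skipped]
    if len(transmitted) != len(coded_indices):
        raise ValueError("unexpected number of transmitted V-vector elements")
    v = [0.0] * num_coeffs
    for idx, value in zip(coded_indices, transmitted):
        v[idx - 1] = value
    return v


# Four HOA coefficients; indices 2 and 3 are explicitly sent, so only the
# elements for indices 1 and 4 are transmitted.
full_v = expand_v_vector([0.5, -0.25], num_coeffs=4, cont_add_hoa_coeff={2, 3})
```

In this example the elements at indices 2 and 3 remain zero, so the vector-based signal contributes nothing to those HOA coefficient sequences.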
However, in the layered coding mode the set of continuous HOA coefficient indices depends on the transport channels that are part of the currently active layer. Additional HOA coefficient indices that are sent in a higher layer may be missing in lower layers. Then the assumption that the vector signal should not contribute to the HOA coefficient sequence is wrong for the HOA coefficient indices that belong to HOA coefficient sequences included in higher layers.
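The layer dependence can be made concrete with a small sketch. The helper cont_indices_for_layer, the example channel assignment, and the assumption that each layer contains all transport channels of the layers below it are hypothetical and serve only to illustrate the mismatch.

```python
def cont_indices_for_layer(channel_coeff_idx, channels_in_layer):
    """Return the HOA coefficient indices whose sequences are explicitly
    sent within the first channels_in_layer transport channels.

    Hypothetical helper: channel_coeff_idx[k] is the coefficient index
    carried by transport channel k, or None if that channel does not carry
    an explicitly sent HOA coefficient sequence.
    """
    return {idx for idx in channel_coeff_idx[:channels_in_layer] if idx is not None}


# Assumed example frame with four transport channels. The second and fourth
# channels carry explicit coefficient sequences for indices 2 and 3.
channel_coeff_idx = [None, 2, None, 3]

# Assumed cumulative layering: base layer = first 2 channels,
# highest (enhancement) layer = all 4 channels.
highest = cont_indices_for_layer(channel_coeff_idx, 4)
base = cont_indices_for_layer(channel_coeff_idx, 2)

# Indices whose V-vector element was zeroed with respect to the highest
# layer, although the explicit sequence is absent from the base layer.
missing_in_base = highest - base
```

Here index 3 ends up in missing_in_base: when only the base layer is decoded, neither the explicitly sent sequence nor a V-vector contribution is available for that coefficient.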
As a consequence, the V-vector in layered HOA coding may not be suitable for decoding of any layers below the highest layer.
Thus, there is a need for coding schemes and bitstreams that are adapted to layered coding of compressed HOA representations of a sound or sound field.
The present document addresses the above issues. In particular, methods and encoders/decoders for layered coding of frames of compressed HOA sound or sound field representations as well as data structures for representing frames of compressed HOA sound or sound field representations are described.