Traditionally, audio content in a multi-channel format (e.g., stereo, 5.1, 7.1, and the like) or in a mono format with metadata is created by mixing different audio signals in a studio, or generated by recording acoustic signals simultaneously in a real environment. The mixed audio signal or content may include a number of different audio objects. Ideally, all of the objects are rendered in order to provide a vivid and immersive representation of the audio content over time. Information regarding an audio object can be in the form of metadata, which may include the position, size (which may include width, depth and height), divergence, etc. of that audio object. The more information that is provided, the more accurately the audio objects can be rendered.
Rendering an audio object consumes computational resources. When the audio content includes many audio objects, a considerable amount of computational resources is usually required to render all of them correctly, namely, to render each and every object with accurate position, size, divergence, and the like. The total computational resources available to render audio content vary from system to system, and the resources provided by less powerful systems are often insufficient to render all of the audio objects.
In order for systems with limited computational resources to render the audio content successfully, one existing approach is to preset a priority level for each of the audio objects. The priority level is usually preset by the mixer when the audio objects are created, or by the system when the audio objects are automatically separated. The priority level represents how important it is, compared to the other objects, to render the particular object in an ideal way that takes all of its metadata into consideration. When the total available computational resources are not sufficient to render all of the audio objects, the audio objects with lower priority levels may be discarded in order to save computational resources for those with higher priority levels. In this way, the more important audio objects are rendered while some less important objects are discarded, so that the audio content can still be rendered with a limited supply of computational resources.
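The priority-based approach described above can be illustrated with a minimal sketch. It is not taken from any particular renderer. The names `AudioObject`, `render_cost`, and `select_by_priority`, the numeric cost units, the convention that a larger `priority` value means a more important object, and the greedy fit test are all assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class AudioObject:
    """Hypothetical audio object; field names are illustrative assumptions."""
    name: str
    priority: int       # assumed convention: larger value = more important
    render_cost: float  # assumed cost of a full render, in arbitrary units


def select_by_priority(objects, budget):
    """Render objects in descending priority until the budget runs out.

    Objects that do not fit in the remaining budget are discarded outright,
    which is the behavior of the existing approach described above.
    """
    rendered, discarded = [], []
    remaining = budget
    # sorted() is stable, so objects with equal priority keep their input order.
    for obj in sorted(objects, key=lambda o: o.priority, reverse=True):
        if obj.render_cost <= remaining:
            rendered.append(obj)
            remaining -= obj.render_cost
        else:
            discarded.append(obj)
    return rendered, discarded


# One time frame with four simultaneous objects and a budget of 8 units.
frame = [
    AudioObject("dialogue", priority=3, render_cost=4.0),
    AudioObject("music", priority=2, render_cost=4.0),
    AudioObject("ambience", priority=1, render_cost=3.0),
    AudioObject("footsteps", priority=1, render_cost=2.0),
]
rendered, discarded = select_by_priority(frame, budget=8.0)
print([o.name for o in rendered])   # ['dialogue', 'music']
print([o.name for o in discarded])  # ['ambience', 'footsteps']
```

In this sketch, half of the objects in the frame are dropped entirely once the budget is exhausted, even though each discarded object is individually cheap to render.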
However, in time frames where many objects need to be rendered simultaneously, a large number of audio objects may be discarded, resulting in low fidelity of the audio reproduction.
In view of the foregoing, there is a need in the art for a solution for allocating the computational resources more reasonably and rendering the audio content more efficiently.