The introduction of digital cinema and the development of true three-dimensional (“3D”) or virtual 3D content have created new standards for sound, such as the incorporation of multiple channels of audio to allow for greater creativity for content creators and a more enveloping and realistic auditory experience for audiences. Expanding beyond traditional speaker feeds and channel-based audio as a means for distributing spatial audio is critical, and there has been considerable interest in a model-based audio description that allows the listener to select a desired playback configuration with the audio rendered specifically for their chosen configuration. The spatial presentation of sound utilizes audio objects, which are audio signals with associated parametric source descriptions of apparent source position (e.g., 3D coordinates), apparent source width, and other parameters. Further advancements include a next-generation spatial audio (also referred to as “adaptive audio”) format that comprises a mix of audio objects and traditional channel-based speaker feeds along with positional metadata for the audio objects. In a spatial audio decoder, the channels are sent directly to their associated speakers or down-mixed to an existing speaker set, and audio objects are rendered by the decoder in a flexible (adaptive) manner. The parametric source description associated with each object, such as a positional trajectory in 3D space, is taken as an input along with the number and position of speakers connected to the decoder. The renderer then utilizes certain algorithms, such as a panning law, to distribute the audio associated with each object across the attached set of speakers. The authored spatial intent of each object is thus optimally presented over the specific speaker configuration that is present in the listening room.
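By way of illustration only, the role of a panning law in distributing an audio object across speakers can be sketched for the simplest case of two speakers. The sketch below assumes a sine/cosine constant-power law and a one-dimensional object position between 0 (first speaker) and 1 (second speaker). The background does not specify which panning law a given renderer uses, and the function names `constant_power_pan` and `render_object` are hypothetical.

```python
import math


def constant_power_pan(position):
    """Return (gain_a, gain_b) for an object positioned between two speakers.

    Assumption: position is in [0, 1], where 0 places the object fully at
    speaker A and 1 places it fully at speaker B. The sine/cosine law keeps
    gain_a**2 + gain_b**2 equal to 1, so total power is constant.
    """
    if not 0.0 <= position <= 1.0:
        raise ValueError("position must be in [0, 1]")
    angle = position * math.pi / 2.0
    return math.cos(angle), math.sin(angle)


def render_object(samples, position):
    """Distribute one object's mono samples across the two speaker feeds."""
    gain_a, gain_b = constant_power_pan(position)
    feed_a = [s * gain_a for s in samples]
    feed_b = [s * gain_b for s in samples]
    return feed_a, feed_b
```

A renderer for a larger speaker set would generalize this idea to gains computed from the 3D object position and the positions of all connected speakers; the two-speaker case is shown only because it is the smallest instance of the technique.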
The advent of advanced object-based audio has significantly increased the complexity of the rendering process and the nature of the audio content transmitted to various different arrays of speakers. For example, cinema sound tracks may comprise many different sound elements corresponding to images on the screen, dialog, noises, and sound effects that emanate from different places on the screen and combine with background music and ambient effects to create the overall auditory experience. Accurate playback requires that sounds be reproduced in a way that corresponds as closely as possible to what is shown on screen with respect to sound source position, intensity, movement, and depth.
Although advanced 3D audio systems (such as the Dolby® Atmos™ system) have largely been designed and deployed for cinema applications, consumer-level systems are being developed to bring the cinematic adaptive audio experience to home and office environments. As compared to cinemas, these environments pose obvious constraints in terms of venue size, acoustic characteristics, system power, and speaker configurations. Present professional-level spatial audio systems thus need to be adapted to render the advanced object audio content to listening environments that feature different speaker configurations and playback capabilities. Toward this end, certain virtualization techniques have been developed to expand the capabilities of traditional stereo or surround sound speaker arrays to recreate spatial sound cues through the use of sophisticated rendering algorithms and techniques such as content-dependent rendering algorithms, reflected sound transmission, and the like. Such rendering techniques have led to the development of DSP-based renderers and circuits that are optimized to render different types of adaptive audio content, such as object audio metadata content (OAMD) beds and ISF (Intermediate Spatial Format) objects. Different DSP circuits have been developed to take advantage of the different characteristics of the adaptive audio with respect to rendering specific OAMD content. However, such multi-processor systems require optimization with respect to memory bandwidth and processing capability of the respective processors.
What is needed, therefore, is a system that provides a scalable processor load for two or more processors in a multi-processor rendering system for adaptive audio.
The increased adoption of surround-sound and cinema-based audio in homes has also led to the development of different types and configurations of speakers beyond the standard two-way or three-way standing or bookshelf speakers. Different speakers have been developed to play back specific content, such as soundbar speakers as part of a 5.1 or 7.1 system. Soundbars represent a class of speaker in which two or more drivers are collocated in a single enclosure (speaker box) and are typically arrayed along a single axis. For example, popular soundbars typically comprise four to six speakers that are lined up in a rectangular box that is designed to fit on top of, underneath, or directly in front of a television or computer monitor to transmit sound directly out of the screen. Because of the configuration of soundbars, certain virtualization techniques may be difficult to realize, as compared to speakers that provide height cues through physical placement (e.g., height drivers) or other techniques.
What is further needed, therefore, is a system that optimizes adaptive audio virtualization techniques for playback through soundbar speaker systems.
The subject matter discussed in the background section should not be assumed to be prior art merely as a result of its mention in the background section. Similarly, a problem mentioned in the background section or associated with the subject matter of the background section should not be assumed to have been previously recognized in the prior art. The subject matter in the background section merely represents different approaches, which in and of themselves may also be inventions. Dolby, Dolby TrueHD, and Atmos are trademarks of Dolby Laboratories Licensing Corporation.