The subject matter discussed in the background section should not be assumed to be prior art merely as a result of its mention in the background section. Similarly, a problem mentioned in the background section or associated with the subject matter of the background section should not be assumed to have been previously recognized in the prior art. The subject matter in the background section merely represents different approaches, which in and of themselves may also be inventions.
Cinema sound tracks usually comprise many different sound elements corresponding to images on the screen, dialog, noises, and sound effects that emanate from different places on the screen and combine with background music and ambient effects to create the overall audience experience. Accurate playback requires that sounds be reproduced in a way that corresponds as closely as possible to what is shown on screen with respect to sound source position, intensity, movement, and depth. Traditional channel-based audio systems send audio content in the form of speaker feeds to individual speakers in a playback environment.
The introduction of digital cinema has created new standards for cinema sound, such as the incorporation of multiple channels of audio to allow for greater creativity for content creators, and a more enveloping and realistic auditory experience for audiences. Expanding beyond traditional speaker feeds and channel-based audio as a means for distributing spatial audio is critical, and there has been considerable interest in a model-based audio description that allows the listener to select a desired playback configuration, with the audio rendered specifically for the chosen configuration. To further improve the listener experience, playback of sound in true three-dimensional (“3D”) or virtual 3D environments has become an area of increased research and development. The spatial presentation of sound utilizes audio objects, which are audio signals with associated parametric source descriptions of apparent source position (e.g., 3D coordinates), apparent source width, and other parameters. Object-based audio may be used for many multimedia applications, such as digital movies, video games, and simulators, and is of particular importance in a home environment where the number of speakers and their placement is generally limited or constrained by the confines of a relatively small listening environment.
Various technologies have been developed to improve sound systems in cinema environments and to more accurately capture and reproduce the creator's artistic intent for a motion picture sound track. For example, a next generation spatial audio (also referred to as “adaptive audio”) format has been developed that comprises a mix of audio objects and traditional channel-based speaker feeds along with positional metadata for the audio objects. In a spatial audio decoder, the channels are sent directly to their associated speakers (if the appropriate speakers exist) or down-mixed to an existing speaker set, and audio objects are rendered by the decoder in a flexible manner. The parametric source description associated with each object, such as a positional trajectory in 3D space, is taken as an input along with the number and position of speakers connected to the decoder. The renderer then utilizes certain algorithms, such as a panning law, to distribute the audio associated with each object across the attached set of speakers. This way, the authored spatial intent of each object is optimally presented over the specific speaker configuration that is present in the listening room.
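The rendering step described above can be illustrated with a minimal sketch. It assumes a simple distance-based amplitude panning law with constant-power normalization, which is only one of many possible panning laws and is not stated to be the one used by any particular decoder. The names `AudioObject`, `panning_gains`, and `render`, the `rolloff` parameter, and the four-speaker layout are all hypothetical and chosen for illustration only.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple

Position = Tuple[float, float, float]


@dataclass
class AudioObject:
    """Hypothetical audio object: an audio signal plus positional metadata."""
    samples: List[float]   # mono audio signal
    position: Position     # apparent source position (x, y, z)


def panning_gains(position: Position, speakers: List[Position],
                  rolloff: float = 2.0, eps: float = 1e-6) -> List[float]:
    """Assumed panning law: weight each speaker by inverse distance to the
    object, then normalize so the squared gains sum to 1 (constant power)."""
    weights = [1.0 / (math.dist(position, spk) + eps) ** rolloff
               for spk in speakers]
    norm = math.sqrt(sum(w * w for w in weights))
    return [w / norm for w in weights]


def render(obj: AudioObject, speakers: List[Position]) -> List[List[float]]:
    """Distribute the object's audio across the attached speakers,
    producing one speaker feed (list of samples) per speaker."""
    gains = panning_gains(obj.position, speakers)
    return [[g * s for s in obj.samples] for g in gains]


# Hypothetical layout: four speakers at the corners of a square, ear level.
SPEAKERS: List[Position] = [(-1.0, 1.0, 0.0), (1.0, 1.0, 0.0),
                            (-1.0, -1.0, 0.0), (1.0, -1.0, 0.0)]

if __name__ == "__main__":
    obj = AudioObject(samples=[1.0, 0.5, -0.5], position=(0.5, 0.5, 0.0))
    for spk, feed in zip(SPEAKERS, render(obj, SPEAKERS)):
        print(spk, [round(x, 3) for x in feed])
```

Because the gains are computed from whatever speaker positions are supplied, the same object metadata can be rendered to a different layout simply by passing a different speaker list, which is the flexibility the object-based approach is intended to provide.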
Current spatial audio systems have generally been developed for cinema use, and thus involve deployment in large rooms and the use of relatively expensive equipment, including arrays of multiple speakers distributed around the room. An increasing amount of cinema content is now being made available for playback in the home environment through streaming technology and advanced media technology, such as Blu-ray. In addition, emerging technologies such as 3D television and advanced computer games and simulators are encouraging the use of relatively sophisticated equipment, such as large-screen monitors, surround-sound receivers, and speaker arrays in home and other consumer (non-cinema/theater) environments. However, equipment cost, installation complexity, and room size are realistic constraints that prevent the full exploitation of spatial audio in most home environments. For example, advanced object-based audio systems typically employ overhead or height speakers to play back sound that is intended to originate above a listener's head. In many cases, and especially in the home environment, such height speakers may not be available. In this case, the height information is lost if such sound objects are played only through floor-mounted or wall-mounted speakers.
What is needed, therefore, is a system that allows the full spatial information of an adaptive audio system to be reproduced in a variety of listening environments, such as collocated speaker systems, headphones, and other listening environments that may include only a portion of the full speaker array intended for playback, such as limited or no overhead speakers.