Spatial audio applications have become numerous and widespread and increasingly form at least part of many audiovisual experiences. Indeed, new and improved spatial experiences and applications are continuously being developed, which results in increased demands on the audio processing and rendering.
For example, in recent years, Virtual Reality (VR) and Augmented Reality (AR) have received increasing interest, and a number of implementations and applications are reaching the consumer market. Indeed, equipment is being developed both for rendering the experience and for capturing or recording suitable data for such applications. For example, relatively low-cost equipment is being developed for allowing gaming consoles to provide a full VR experience. It is expected that this trend will continue and indeed will increase in speed, with the market for VR and AR reaching a substantial size within a short time scale.
The concept of Virtual Reality or Augmented Reality covers a very wide field. It may include fully immersive scenarios where the user navigates in a 3D virtual world as they would in real life (e.g. looking around by physically moving their head, or even physically walking around), or may include simpler scenarios where navigation in the virtual world is done by means of explicit controls.
However, most of the effort so far has concentrated on the visual side of the provided experience, i.e. it has concentrated on developing approaches for capturing and rendering three-dimensional adaptive visual experiences.
For example, various systems for 360-degree (2D and 3D) video capturing have recently been developed. A particularly interesting VR video capturing technology is the so-called “light field camera” (also known as “plenoptic” camera). Such cameras do not simply capture the light intensity of a scene in an image, but also capture the direction from which light reaches the camera. This allows various types of post-processing of the recorded image. In particular, it allows the focal plane of the image to be changed after the image has been recorded. In practical terms, this means that it is possible to change the in-focus distance (relative to the camera standpoint) at the time of rendering the image.
It has been proposed to provide a spherical camera system for VR applications, consisting of multiple light field cameras in a spherical arrangement. Such a camera system enables capturing of 360-degree 3D panorama recordings while making it possible to change the focal distance and/or zoom in post-processing.
Such developments on the video side open up a range of possibilities for generating immersive and interactive visual content and experiences. However, in general, less interest has been focused on providing improved and more suitable spatial audio experiences. Indeed, the audio solutions are typically less adaptive and tend to mainly use a conventional spatial audio experience where the only adaptability may be that the position of some audio sources can be changed.
Hence, an improved spatial audio system would be advantageous and in particular an audio processing approach allowing increased flexibility, improved adaptability, an improved virtual reality experience, improved performance, increased user control or adaptation, user side manipulation, and/or an improved spatial audio experience would be advantageous.