Virtual reality (VR) is a well-known technical area and generally refers to various capture and rendering technologies for generating realistic images and sounds that replicate a real or imaginary environment and simulate a user's physical presence in that environment. Virtual reality may be considered a content consumption space. In a three-dimensional (3D) space there are six degrees of freedom defining the way the user may move within that space. Researchers often divide this movement into two categories, rotational movement and translational movement, each of which has three degrees of freedom. Rotational movement is often sufficient for a simple VR experience in which the user may turn her head (pitch, yaw, and roll) to experience the space from a static or automatically moving point. Translational movement means that the user may also change the position of the rendering, that is, the user moves along the x, y, and z axes of a Cartesian system.
Augmented reality (AR) shares many similarities with VR. Typically it refers to a direct or indirect view of a physical, real-world environment to which computer-generated sensory input such as sound and graphics is added.
Embodiments of these teachings relate to rendering of six-degrees-of-freedom (6 DoF) audio, which is also known as free-viewpoint or free-listening-point audio. Certain use cases for AR/VR audio allow for user movement that is at least substantially free, and at least the audio is rendered to the user according to the user's head rotation as well as the user's location in the audio content space. This spatial audio may for example consist of a channel-based bed and audio objects, audio objects only, or a similar spatial audio representation. The audio objects may be static, or they may be dynamic in the sense that their default location in the 3D space may be time-variant.
Phase cancellation is a known problem in various audio applications ranging from capture and recording, through mixing, to audio presentation. For example, in stereo loudspeaker presentation, phase cancellation typically manifests itself in low-frequency sounds, which appear thin, with little or no bass. As another example, a bass guitar in a musical piece may lack localization, and when rendered as an audio object it may appear to be moving rather than emitting sound from a single virtual point in Cartesian space. For loudspeaker presentation, audio content from music to movies is typically mixed such that the sound is optimized for some “sweet spot”. This sweet spot is typically widened somewhat so that the optimization is not limited to a single listening (seat) position; rather, a reasonably good sound quality is achieved for a target listening area that may for example span several seat positions.
FIGS. 1B through 1G illustrate various effects of phase addition/cancellation on a simple reference wave shown at FIG. 1A, as known in the art. FIG. 1A illustrates the reference signal; FIGS. 1B, 1D and 1F each present a non-shifted (1B) or shifted (1D, 1F) version of that reference signal, where the specific shifts are indicated in the figures. The combined signals of FIGS. 1C, 1E and 1G are obtained by summing the reference signal of FIG. 1A with the second signals of FIGS. 1B, 1D and 1F, respectively. FIG. 1C illustrates the reinforcing effect of summing in-phase signals, while FIG. 1E illustrates that a considerable shift may have a fairly small effect. FIG. 1G illustrates the case of exact phase cancellation due to summing the reference signal with a 180° shifted version of itself.
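The summation described above can be sketched numerically. The following Python fragment is an illustrative sketch only and is not part of the referenced figures: it assumes a unit-amplitude sine wave as the reference signal, and the 45° shift is a hypothetical value chosen for illustration, since the specific shifts of FIGS. 1D and 1F are indicated only in the figures themselves. The function names are likewise hypothetical. The numerically found peak of the summed signal is compared against the trigonometric identity sin(t) + sin(t + φ) = 2 cos(φ/2) sin(t + φ/2), whose peak amplitude is |2 cos(φ/2)|.

```python
import math


def summed_peak_amplitude(phase_shift_deg, samples=3600):
    """Numerically find the peak of sin(t) + sin(t + shift) over one period.

    Assumption: the reference signal is a unit-amplitude sine wave.
    """
    shift = math.radians(phase_shift_deg)
    peak = 0.0
    for n in range(samples):
        t = 2.0 * math.pi * n / samples
        combined = math.sin(t) + math.sin(t + shift)
        peak = max(peak, abs(combined))
    return peak


def closed_form_amplitude(phase_shift_deg):
    """Peak amplitude predicted by the identity: |2 * cos(shift / 2)|."""
    return abs(2.0 * math.cos(math.radians(phase_shift_deg) / 2.0))


# 0 degrees: in-phase reinforcement; 180 degrees: exact cancellation.
# 45 degrees is a hypothetical intermediate shift, not taken from the figures.
for shift_deg in (0, 45, 180):
    numeric = summed_peak_amplitude(shift_deg)
    predicted = closed_form_amplitude(shift_deg)
    print(f"shift={shift_deg:3d} deg  numeric={numeric:.3f}  predicted={predicted:.3f}")
```

Under these assumptions, an in-phase sum doubles the peak amplitude, a 180° shift cancels the signal, and the hypothetical 45° shift still yields a peak of about 1.85, which is consistent with the observation that a considerable shift may have a fairly small effect.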
Six-degrees-of-freedom audio such as AR/VR audio may often suffer from unwanted phase cancellation effects, due both to the nature of the audio that is used for the productions and to the nature of the use case itself. Greater freedom for the user means less control for the content creator over unwanted phase additions/cancellations. Embodiments of these teachings provide a method and apparatus to regain some of the control that is lost in the 6 DoF environment.
Some prior art teachings that have some background relevance can be seen at U.S. Pat. Nos. 9,271,080; 9,332,370; 9,111,522; 9,396,731; 9,154,897; 6,307,941; US patent publication no. 2016/0330548; and a paper entitled INVERSE FILTER DESIGN FOR IMMERSIVE AUDIO RENDERING OVER LOUDSPEAKERS by A. Mouchtaris, et al. (IEEE Transactions on Multimedia; June 2000).