Many smartphones and other mobile computing devices include one or more cameras, which are operable by a user to capture images and/or video of a desired scene. In some devices, the camera(s) may be embodied as a three-dimensional camera capable of capturing three-dimensional images and videos, which include depth data associated with the captured image or video. The depth data included in three-dimensional images and videos allow users to perform certain post-capture enhancements of the captured images and/or video. For example, based on the depth data, a user may select a particular area of the image to enhance (e.g., zoom into) or adjust the focal point of the original image to a desired region.
Although a three-dimensional image or video may be modified as discussed above, any audio associated with the three-dimensional image or video is not modified in a similar way. As such, the resulting enhanced image or video may include audio that is the same as the audio of the original image or video. In some situations, the original audio may not correlate correctly or in the desired manner to the enhanced three-dimensional video, which can cause confusion or otherwise lessen the playback experience of the enhanced three-dimensional video.