Modern video capture devices use a variety of technologies for capturing, processing, and transmitting or otherwise transferring captured video data. A video capture device may be designed to capture primarily visible light video, depth-based video, infrared-based video, or a combination thereof. However, in some cases, the technologies used by video capture devices to process and/or transfer the captured video data limit the quality of the resulting video stream and/or negatively affect the user experience of watching the resulting video stream. Device clocks of video capture devices may drift, causing temporal distortion or “jitter” in the video stream; video processing operations may cause the video stream to skew out of sync; and/or a communication bus or other medium may provide less reliable data transfer than is necessary to provide a high-quality video stream to users on the other end of the transfer. For example, use of universal serial bus (USB) technology to transfer frames of a video stream may present a variety of challenges associated with accurate timing of the frames in the video stream after transfer.
These challenges are compounded when multiple video streams are combined to form three-dimensional video streams, video streams with multiple points of view, or the like. Synchronizing the video streams requires extremely accurate timing of each frame, and each video stream may be captured by a different video capture device with its own set of technologies that must be accounted for in order for the video streams to be correlated to a shared timeframe. Providing combined video streams for viewing without significant temporal distortion, jitter, or other issues that cause a negative user experience is a challenging task.