Video streaming systems deliver video data over a communication network for applications such as video conferencing and on-demand viewing of media and sporting events. A multi-camera video streaming system uses an array of cameras to capture multiple video streams; viewers use client software to receive the streams and selectively change the viewing angle to watch the content from different viewpoints. During a view change, it may be desirable for the viewer to see a view-sweeping effect (e.g., a “freeze” time effect or a “Dolly” effect) so that the view change appears smooth.
However, there can be issues associated with view switching in a camera array streaming system. One issue is the smoothness of the view change. Typically, when the viewer chooses to switch from one view to another, the images captured by the cameras between the two specified camera views also need to be sequentially delivered to the client application so that the viewer can see the view-sweeping effect and therefore experience a smooth view change. However, in a Video-on-Demand streaming system, the captured videos are compressed (typically by temporal Group-of-Pictures (GOP) based compression schemes, such as H.264) and saved as compressed files. To produce the view-sweeping effect, the client needs to download the corresponding video segments from each of the different views, extract the corresponding frames, and concatenate them for playback. Because whole segments must be fetched even though only a few frames from each are displayed, this is feasible only if the network is very fast and has low delay. Another issue is the initial delay of the view change, defined as the duration between the time when the user chooses to change the view and the time when the user actually sees the view change or the start of the view-sweeping effect. The initial delay can significantly degrade the user experience because it can cause frames to freeze on the screen.
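The cost of the client-side approach described above can be illustrated with a minimal sketch. It is not taken from any particular system: the function names are hypothetical, the cameras are assumed to be indexed consecutively along a linear array, and each GOP is assumed to be closed and decoded in order from its first frame (no frame reordering).

```python
from typing import List, Tuple


def sweep_path(src_view: int, dst_view: int) -> List[int]:
    """Camera indices visited when sweeping from src_view to dst_view, inclusive.

    Assumption: cameras are numbered consecutively along the array.
    """
    step = 1 if dst_view >= src_view else -1
    return list(range(src_view, dst_view + step, step))


def freeze_time_frames(src_view: int, dst_view: int,
                       frame_index: int) -> List[Tuple[int, int]]:
    """(camera, frame) pairs for a "freeze" time sweep.

    Every camera on the path contributes the frame at the same time instant.
    """
    return [(cam, frame_index) for cam in sweep_path(src_view, dst_view)]


def frames_to_decode(frame_index: int, gop_size: int) -> int:
    """Frames that must be decoded to display frame_index.

    Assumption: decoding starts at the first frame of the enclosing GOP and
    proceeds in order up to and including the target frame.
    """
    return frame_index % gop_size + 1


def sweep_decode_cost(src_view: int, dst_view: int,
                      frame_index: int, gop_size: int) -> int:
    """Total frames decoded across all views to show one sweep."""
    views = len(sweep_path(src_view, dst_view))
    return views * frames_to_decode(frame_index, gop_size)
```

Under these assumptions, sweeping from camera 2 to camera 5 at frame 100 with a GOP size of 30 displays only 4 frames but requires decoding 44, and the client must first download the enclosing segment from each of the 4 views. This is why the approach depends on a fast, low-delay network.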