A video camera has a limited temporal resolution determined by a frame rate and an exposure time. Temporal events of a scene occurring at a rate faster than the frame rate of the camera can cause aliasing and blur in acquired frames of the video due to the finite integration time of a sensor of the camera. The blur can be caused, e.g., by motion of objects and/or by a temporal change of intensities in the scene, e.g., a flickering light or a display screen. The goal of temporal super-resolution (SR) is to generate a high temporal resolution video without aliasing and blur.
A frame time Tf of the camera, i.e., the inverse of the frame rate, determines how fast the sensor samples the temporal variations at each pixel, while the integration time T determines how long the sensor of the camera integrates for each sample. Let Tfnq be the Nyquist frame time required to avoid aliasing, i.e., the sensor must acquire a sample at least every Tfnq seconds to avoid aliasing.
Depending on the relationship between the integration time T, the frame time Tf and the Nyquist frame time Tfnq, the acquired frames of the video can have blur, aliasing, or a combination of both. The well-known “wagon wheel” effect happens when Tf>Tfnq, i.e., the wheel appears to be rotating in a reverse direction due to temporal aliasing. The temporal aliasing can occur concurrently with the blur when the integration time T increases.
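The aliasing condition Tf>Tfnq can be illustrated with a minimal sketch. The sketch assumes a single temporal frequency in the scene, so that the Nyquist frame time is 1/(2·frequency). The function names and the example rates (a spoke pattern repeating 28 times per second, cameras at 30 and 60 frames/sec) are hypothetical and chosen only for illustration.

```python
def nyquist_frame_time(max_frequency_hz):
    """Longest frame time (seconds) that samples max_frequency_hz without aliasing."""
    return 1.0 / (2.0 * max_frequency_hz)


def apparent_frequency(signal_hz, frame_rate_hz):
    """Frequency observed after sampling, folded into [-frame_rate/2, frame_rate/2)."""
    half = frame_rate_hz / 2.0
    return (signal_hz + half) % frame_rate_hz - half


# Hypothetical example: a spoke pattern that repeats 28 times per second.
spoke_hz = 28.0
for frame_rate in (30.0, 60.0):
    frame_time = 1.0 / frame_rate                      # Tf
    aliased = frame_time > nyquist_frame_time(spoke_hz)  # Tf > Tfnq
    print(frame_rate, aliased, apparent_frequency(spoke_hz, frame_rate))
```

Under these assumptions, the 30 frames/sec camera violates the Nyquist condition and the pattern appears at -2 Hz, i.e., as a slow rotation in the reverse direction, whereas the 60 frames/sec camera observes the true 28 Hz.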
A high speed camera avoids both the blur and the aliasing by sampling faster than the Nyquist frame rate, while keeping the integration time T sufficiently small. However, the high speed camera has a fundamental light capture limit. If the frame rate is f frames/sec, then the exposure duration cannot be greater than 1/f sec. In addition, commercial high speed cameras are expensive, and require a large bandwidth and local memory.
Multiple Cameras
Point Sampling
FIG. 1A shows a conventional point sampling method to generate an output video 110 having a high temporal resolution from multiple input videos having low temporal resolutions. Using N cameras 120 each with a frame rate f, the output video with an effective frame rate of Nf can be recovered by staggering the start of the exposure window of each camera by 1/(Nf) and interleaving 130 the acquired frames in a chronological order.
Each of the N cameras has a frame time Tfin and integration time Tin=Tfin/N. The output video has the frame time Tfout=Tfin/N. To avoid blur, Tout=Tin and Tin is small.
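The staggering and interleaving of the point sampling method can be sketched as follows. This is a minimal illustration under stated assumptions, not an implementation of any particular system: the function name, the tuple layout, and the example values (4 cameras at 30 frames/sec) are hypothetical. Exact fractions are used so that the frame spacing can be compared without rounding error.

```python
from fractions import Fraction


def point_sampling_schedule(num_cameras, frame_rate, frames_per_camera):
    """Return (start_time, camera, frame) tuples in chronological order.

    Camera c starts its exposure c/(N*f) seconds after camera 0, and every
    camera acquires one frame each 1/f seconds.
    """
    stagger = Fraction(1, num_cameras * frame_rate)   # 1/(Nf)
    frame_time = Fraction(1, frame_rate)              # input frame time, 1/f
    schedule = [
        (cam * stagger + k * frame_time, cam, k)
        for cam in range(num_cameras)
        for k in range(frames_per_camera)
    ]
    schedule.sort()                                   # interleave by start time
    return schedule


# Hypothetical example: N = 4 cameras, f = 30 frames/sec, 2 frames per camera.
for start, cam, k in point_sampling_schedule(4, 30, 2):
    print(f"t = {start} s  camera {cam}  frame {k}")
```

In this example the interleaved frames are spaced 1/120 sec apart and the cameras contribute in the repeating order 0, 1, 2, 3, which corresponds to an output frame time of Tfin/N.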
The advantage of the point sampling method is that the reconstruction process simply involves interleaving the acquired frames, thus avoiding reconstruction artifacts. However, the exposure time of each camera is 1/(Nf), i.e., similar to an equivalent high speed camera, and thus the point sampling method is light-inefficient.
Box Sampling
FIG. 1B shows a conventional box sampling method that combines several low temporal resolution videos 120 to generate the high temporal resolution video 110 using an optimization framework. The box sampling method allows a finite integration time to collect more light, which leads to motion blur in the videos 120. The finite integration time acts as a low pass box filter and suppresses high temporal frequencies. The box sampling uses regularization to solve the resulting ill-posed linear system 140 and to suppress ringing artifacts. However, recovering the lost high frequency information is inherently an ill-posed problem. Moreover, using N cameras, it is difficult to achieve the temporal SR by a factor of N. In addition, the reconstruction requires solving a huge sparse linear system (on the order of a million variables) for a video of a modest size.
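The structure of the box sampling problem can be sketched for a single pixel. The sketch below is a toy illustration under stated assumptions, not the conventional method itself: each of N staggered cameras is assumed to integrate over its whole frame time, so that every measurement is the average of N consecutive high-rate samples, and a first-difference smoothness penalty with weight lam is used as the regularizer. The function names, the regularizer, and the sizes are hypothetical.

```python
import numpy as np


def box_sampling_matrix(num_samples, num_cameras):
    """Each row averages num_cameras consecutive high-rate samples (box filter)."""
    rows = []
    for cam in range(num_cameras):           # camera cam starts cam samples late
        start = cam
        while start + num_cameras <= num_samples:
            row = np.zeros(num_samples)
            row[start:start + num_cameras] = 1.0 / num_cameras
            rows.append(row)
            start += num_cameras              # next frame of the same camera
    return np.array(rows)


def reconstruct(A, y, lam):
    """Minimize ||A x - y||^2 + lam * ||D x||^2, with D the first difference."""
    n = A.shape[1]
    D = np.diff(np.eye(n), axis=0)            # (n-1, n) first-difference operator
    lhs = A.T @ A + lam * (D.T @ D)           # regularized normal equations
    return np.linalg.solve(lhs, A.T @ y)


# Hypothetical example: 32 high-rate samples of one pixel, N = 4 cameras.
A = box_sampling_matrix(32, 4)
x_true = np.sin(2 * np.pi * np.arange(32) / 16.0)
x_hat = reconstruct(A, A @ x_true, lam=1e-2)
print(A.shape, np.linalg.matrix_rank(A), x_hat.shape)
```

In this toy setting the system has fewer measurements than unknowns, so the regularizer is needed to select a solution. A dense matrix is used only for brevity; for a full video the system would be sparse and far larger.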
To achieve the temporal SR, it is important to consider both the increase in the frame rate, i.e., the decrease in the frame time Tf, and the decrease in the integration time T. For example, decreasing the integration time T of a single camera, as in the point sampling, reduces the motion blur, but results in the aliasing, because the frame rate is not increased. Similarly, interleaving frames from N cameras, as in the box sampling, increases the frame rate of the output video, but temporal blur remains due to the relatively large integration time.
Accordingly, the goal of temporal SR is to reduce both the aliasing and the blur in the reconstructed output video.