Cameras that capture videos and images in a low-light environment typically sacrifice resolution for improved sensitivity by using sensors with a large pixel size and thus, for the same sensor size, a low pixel count. Because few photons are available in a low-light environment, a large pixel size is needed to improve light sensitivity. However, for a camera with a given form factor, increasing the pixel size can degrade image quality because the pixel count is reduced. Increasing the exposure time of a camera allows more photons to be collected and reduces noise, but long exposure times can increase motion blur because objects may move within the exposure period. Several solutions have been proposed to improve image resolution in a low-light environment. For example, conventional up-sampling and interpolation methods create “fake pixels” to increase pixel count, but these interpolated pixels are not true pixels; each is only an approximation of a true pixel, based on the neighboring pixels.
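The limitation of interpolation noted above can be illustrated with a minimal sketch. It assumes a one-dimensional row of pixel values and simple linear interpolation; the function name `upsample_linear_2x` and the sample values are hypothetical and chosen only for illustration, and the coarse sensor is modeled, as a simplification, by keeping every other sample of a finer scene.

```python
def upsample_linear_2x(row):
    """Insert one linearly interpolated value between each pair of measured pixels.

    For n measured pixels the result has 2n - 1 values.
    """
    if not row:
        return []
    out = []
    for left, right in zip(row, row[1:]):
        out.append(left)                # measured pixel, copied unchanged
        out.append((left + right) / 2)  # interpolated pixel: mean of its two neighbors
    out.append(row[-1])                 # last measured pixel has no right neighbor
    return out


# Hypothetical scene: a thin bright line, as a finer sensor would record it.
true_scene = [10, 10, 200, 10, 10]

# Simplified coarse sensor: only every other position is measured.
captured = true_scene[::2]              # [10, 200, 10]

# Up-sampling restores the pixel count but not the missing measurements.
upsampled = upsample_linear_2x(captured)  # [10, 105.0, 200, 105.0, 10]
```

In this sketch the interpolated values (105.0) differ from the values a finer sensor would have recorded at those positions (10), which is the sense in which such pixels are approximations rather than true pixels.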
Other methods may use multiple or moving cameras to capture multiple images from different angles and combine them to enhance the resolution. However, the multiple-camera approach requires multiple lenses and multiple sensors, thereby increasing the cost and size of the camera system. Moving a camera or a sensor requires precise shifting of the whole camera or of the sensor, which substantially increases the cost and degrades the reliability of the camera system.
Super-resolution methods such as “dictionary learning” predict a high-resolution image from a single low-resolution image, but they are computationally intensive and require a large amount of memory.
Accordingly, there is a need for an enhanced video image processing system and technique that produces increased image resolution based on the image(s) captured by the sensor, without requiring a complex architecture, a large memory, and/or high processing power.