Digital video cameras are becoming increasingly widespread in the marketplace. The latest mobile phones are equipped with video cameras that let users shoot video clips and send them over wireless networks.
Digital video sequences are very large in file size. Even a short video sequence is composed of tens of images. As a result, video is always saved and/or transferred in compressed form. Several video-coding techniques can be used for that purpose. H.263 and MPEG-4 are the most widely used standard compression formats suitable for wireless cellular environments.
To allow users to generate quality video at their terminals, it is imperative that devices having a video camera, such as mobile phones, provide video editing capabilities. Video editing is the process of transforming and/or organizing available video sequences into a new video sequence.
Existing cameras on mobile phones are not comparable in performance to the most sophisticated digital cameras available in the market. As a result, video captured from such cameras usually suffers from calibration problems that degrade brightness and contrast levels and leave the color balance deficient. Consequently, among the most widely needed operations in video editing is the enhancement of the visual perceptual quality of video. This includes adjusting the brightness and contrast levels of the video clip.
Adjusting the brightness and contrast of a still image requires changing the image coefficients, which is usually done in the spatial domain. For constrained mobile devices, adjusting the brightness or contrast of a video (which comprises hundreds of frames) is very costly and taxing on system resources. This becomes an even bigger concern when we consider that the user may experiment with the adjustment level many times before achieving the desired result.
Most video editing tools enable users to apply image enhancement effects to the image. An example is to increase or decrease the brightness when the original video is too dark or too bright. In video editing tools, similar operations are required to produce a better representation of the video clips.
Several commercial products support such features, but they are mainly targeted at the PC platform. These products adopt a straightforward approach by applying the video enhancement effects in the spatial domain. More specifically, they first decompress the video clips to their raw format, then manipulate the pixel values of the raw image sequences, and finally compress the enhanced raw image sequences into the compressed bitstream. This process is called spatial domain video editing.
Spatial domain video editing, however, consumes a large amount of resources, including memory, storage, and computational power. Although this is not a major issue for today's desktop PCs, it is definitely a problem for mobile devices, which are equipped with low-power processors and limited memory and storage. The decoding and encoding process takes a long time and consumes a lot of battery power in these devices. The spatial domain scheme, therefore, is not a viable solution for mobile devices.
In the prior art, to perform brightness or contrast adjustment on video clips, the video clips are first decoded to the raw format. Then, the raw image sequences are adjusted to the designated brightness or contrast level. Finally, the enhanced raw image sequences are encoded again. This approach is computationally intensive, especially the encoding part.
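The three prior-art steps described above (decode, adjust, re-encode) can be sketched as follows. This is only an illustrative outline: the function name `edit_in_spatial_domain`, the `Frame` type, and the toy codec in which each byte stands for a one-pixel frame are all hypothetical stand-ins, not an H.263 or MPEG-4 implementation. A real decoder and encoder would dominate the cost of the pipeline.

```python
from typing import Callable, Iterable, Iterator, List

Frame = List[List[int]]  # hypothetical raw frame: rows of pixel intensities


def edit_in_spatial_domain(
    bitstream: bytes,
    decode: Callable[[bytes], Iterable[Frame]],
    adjust: Callable[[Frame], Frame],
    encode: Callable[[Iterable[Frame]], bytes],
) -> bytes:
    """Decode every frame, adjust it in the pixel domain, then re-encode."""
    raw_frames = decode(bitstream)                # step 1: decode to raw format
    enhanced = (adjust(f) for f in raw_frames)    # step 2: modify pixel values
    return encode(enhanced)                       # step 3: encode again


# Toy stand-in codec (an assumption for illustration only):
# each byte of the "bitstream" is treated as one 1x1 frame.
def toy_decode(data: bytes) -> Iterator[Frame]:
    for b in data:
        yield [[b]]


def toy_encode(frames: Iterable[Frame]) -> bytes:
    return bytes(f[0][0] for f in frames)


def brighten_by_5(frame: Frame) -> Frame:
    return [[min(255, p + 5) for p in row] for row in frame]


out = edit_in_spatial_domain(bytes([10, 20, 30]), toy_decode, brighten_by_5, toy_encode)
```

Every frame passes through all three stages each time the user changes the adjustment level, which is why repeated experimentation is expensive in this scheme.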
An example of spatial domain video editing of brightness and contrast adjustment is given below, with reference to FIG. 2.
Brightness adjustment refers to increasing or decreasing the luminance intensity of the video clip by a constant value. Contrast adjustment refers to stretching the difference between luminance intensities within a frame.
To achieve these brightness and contrast adjustments in the spatial domain, once the video is fully decoded, the following operations are performed:

$$\tilde{V}(x,y,t) = V(x,y,t) + K \qquad (1)$$

$$\tilde{V}(x,y,t) = \lambda \cdot [V(x,y,t) - \eta] + \eta \qquad (2)$$

where (1) represents the brightness adjustment and (2) represents the contrast adjustment, $V(x,y,t)$ is the decoded video sequence, $\tilde{V}(x,y,t)$ is the edited video, x, y are the spatial coordinates of the pixels in the frames and t is the temporal axis. K is the brightness adjusting value, which is constant for all pixels in the frame. A positive value of K will make the video brighter, while a negative value of K will make the video darker. λ>0 is the stretching factor for contrast adjustment, which is constant for all pixels in the frame. If λ is larger than 1, the resulting video has a higher contrast level, while if λ is between 0 and 1, the resulting video has a lower contrast level. A value of λ=1 does not result in any change in the image. η represents the mean of pixel intensities in a particular frame. Equation (2) shows that for the contrast adjustment the pixel intensities are uniformly stretched; the stretch centre is the mean of the pixel intensities.
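As a minimal sketch of equations (1) and (2) applied to a single decoded luminance frame, the following may help. It assumes 8-bit pixel intensities and clamps results to the range 0 to 255; the clamping, the rounding, and the names `clip8`, `adjust_brightness`, and `adjust_contrast` are illustrative assumptions that the text above does not specify.

```python
from typing import List

Frame = List[List[int]]  # one decoded luminance frame V(x, y) at a fixed time t


def clip8(value: float) -> int:
    """Round and clamp to the 8-bit range [0, 255] (an assumption, see lead-in)."""
    return max(0, min(255, int(round(value))))


def adjust_brightness(frame: Frame, k: int) -> Frame:
    """Equation (1): add the constant K to every pixel of the frame."""
    return [[clip8(p + k) for p in row] for row in frame]


def adjust_contrast(frame: Frame, lam: float) -> Frame:
    """Equation (2): stretch intensities about the frame mean eta by lam > 0."""
    if lam <= 0:
        raise ValueError("the stretching factor must be positive")
    pixels = [p for row in frame for p in row]
    eta = sum(pixels) / len(pixels)  # mean pixel intensity of this frame
    return [[clip8(lam * (p - eta) + eta) for p in row] for row in frame]


frame = [[100, 120], [140, 160]]          # mean intensity eta = 130
brighter = adjust_brightness(frame, 20)   # [[120, 140], [160, 180]]
stretched = adjust_contrast(frame, 1.5)   # [[85, 115], [145, 175]]
unchanged = adjust_contrast(frame, 1.0)   # identical to the input frame
```

With λ = 1.5 the pixels move away from the mean of 130 while the mean itself is preserved, which matches the description of the mean as the stretch centre.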
After modifying the pixel values of a video frame, the resulting frame is fed to the encoder for re-encoding, which is a time-consuming process.