Aspects of the present invention relate generally to the field of image processing, and more specifically to using camera metadata to improve video encoding and decoding processes.
In image processing systems, video or image data may be captured by a camera having a sensor. Conventional cameras convert the captured analog information from the sensor to digital data that is passed to an image signal processor (ISP) for signal processing. The processed signal is then passed to a central processing unit (CPU) or graphics processing unit (GPU) for additional processing including filtering, encoding, image recognition, pattern or shape recognition, color enhancement, sharpening, or other image enhancing processes.
An encoder may code a source video sequence into a coded representation that has a smaller bit rate than the source video and thereby achieve data compression. Using predictive coding techniques, some portions of a video stream may be coded independently (intra-coded I-frames) and some other portions may be coded with reference to other portions (inter-coded frames, e.g., P-frames or B-frames). Such coding often involves exploiting redundancy in the video data via temporal or spatial prediction, quantization of residuals, and entropy coding. When a new transmission sequence is initiated, the first frame of the sequence is an I-frame. Subsequent frames may then be coded with reference to other frames in the sequence by temporal prediction, thereby achieving a higher level of compression and fewer bits per frame as compared to I-frames. Thus, the transmission of an I-frame requires a relatively large amount of data, and consequently requires more bandwidth than the transmission of an inter-coded frame.
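The difference in cost between intra-coded and inter-coded frames can be illustrated with a minimal sketch. It assumes tiny grayscale frames stored as nested lists and uses the sum of absolute sample values as a crude stand-in for coded size. The function names are hypothetical, and quantization, entropy coding, and motion compensation are deliberately omitted.

```python
def residual(frame, reference):
    """Sample-wise difference between a frame and its reference (temporal prediction)."""
    return [[c - r for c, r in zip(frame_row, ref_row)]
            for frame_row, ref_row in zip(frame, reference)]


def reconstruct(reference, res):
    """Decoder side: add the residual back onto the reference frame."""
    return [[r + d for r, d in zip(ref_row, res_row)]
            for ref_row, res_row in zip(reference, res)]


def sum_abs(block):
    """Crude proxy for coded size: larger magnitudes cost more bits (an assumption)."""
    return sum(abs(v) for row in block for v in row)


# Two consecutive frames of a nearly static scene (illustrative values).
previous = [[100, 102, 104], [101, 103, 105]]
current = [[101, 102, 105], [101, 104, 105]]

inter_residual = residual(current, previous)
intra_cost = sum_abs(current)         # frame coded on its own
inter_cost = sum_abs(inter_residual)  # frame coded relative to the previous frame
decoded = reconstruct(previous, inter_residual)
```

Under these assumptions the inter-coded residual is far smaller than the raw frame, and the decoder recovers the frame exactly from the reference and the residual.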
The resulting compressed data (bitstream) may then be transmitted to a decoding system via a channel. To recover the video data, the bitstream may be decompressed at a decoder by inverting the coding processes performed by the encoder, yielding a recovered decoded video sequence.
Previously coded frames, also known as reference frames, may be temporarily stored for future use in inter-frame coding. A reference frame cache stores frame data that may represent sources of prediction for later-processed frames. Both the encoder and decoder may keep reference frames in a cache or buffer. However, due to constraints in buffer sizes, a limited number of reference frames can be stored in the reference frame cache at a time. Frames that are referenced by other frames may be encoded before the referencing frames to avoid processing delays. Therefore, the coding order of a sequence of frames may be different than the display order of the same sequence.
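A minimal sketch of the two ideas in this paragraph follows: a size-limited reference frame cache, and a coding order that differs from display order. The oldest-first eviction policy, the rule that each B-frame waits for the next I- or P-frame, and all names are assumptions made for illustration only.

```python
from collections import deque


class ReferenceFrameCache:
    """Bounded cache of reference frame ids; the oldest entry is evicted first (assumed policy)."""

    def __init__(self, capacity):
        self._frames = deque(maxlen=capacity)

    def store(self, frame_id):
        self._frames.append(frame_id)  # deque drops the oldest entry when full

    def contents(self):
        return list(self._frames)


def coding_order(display_types):
    """Reorder display indices so each B-frame is coded after the next anchor (I or P) it may reference."""
    order, pending_b = [], []
    for index, frame_type in enumerate(display_types):
        if frame_type == "B":
            pending_b.append(index)   # hold until the following anchor is coded
        else:
            order.append(index)       # the anchor is coded first
            order.extend(pending_b)   # then the B-frames that precede it in display order
            pending_b = []
    order.extend(pending_b)           # trailing B-frames with no later anchor
    return order


display = list("IBBPBBP")
order = coding_order(display)

cache = ReferenceFrameCache(capacity=2)
for index in order:
    if display[index] != "B":         # only anchors are kept as references in this sketch
        cache.store(index)
```

With the display sequence I B B P B B P, the sketch codes each P-frame ahead of the two B-frames that precede it in display order, and the two-entry cache ends up holding only the two most recent anchors.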
Brightness or color differences between captured frames may be created by an exposure adjustment or other change in camera capture settings. Conventional video compression systems often do not account for such global changes between captured images, yet the resulting differences conventionally require the frame in which the change takes effect to be encoded as an I-frame. Thus, repeated exposure adjustments may require excessive intra-frame coding, thereby limiting the benefit gained by predictively coding transmitted frames.
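The effect of a global change on temporal prediction can be illustrated with a short sketch. It assumes the exposure adjustment behaves as a uniform gain on 8-bit samples, which is a simplification, and it compares the residual against the unmodified previous frame with the residual against a gain-scaled copy of that frame. The names and the gain value are hypothetical, and the scaled reference is shown only to make the source of the mismatch visible.

```python
def apply_gain(frame, gain):
    """Model an exposure change as a uniform gain, rounded and clipped to 8 bits (an assumption)."""
    return [[min(255, int(v * gain + 0.5)) for v in row] for row in frame]


def sum_abs_residual(frame, reference):
    """Sum of absolute differences between a frame and a candidate reference."""
    return sum(abs(c - r)
               for frame_row, ref_row in zip(frame, reference)
               for c, r in zip(frame_row, ref_row))


previous = [[100, 102, 104], [101, 103, 105]]  # frame before the exposure change
scene = [[100, 102, 104], [101, 103, 106]]     # nearly identical scene content
gain = 1.5                                     # hypothetical exposure gain
current = apply_gain(scene, gain)              # frame captured after the change

plain_cost = sum_abs_residual(current, previous)
scaled_cost = sum_abs_residual(current, apply_gain(previous, gain))
```

Although the scene content barely changes, the residual against the unmodified previous frame is large because every sample shifts with the gain, whereas the residual against the gain-scaled copy stays small.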
Therefore, conventional methods for accommodating camera setting changes that affect the overall appearance of the captured video data are expensive in terms of time, processing resources, and transmission bandwidth. Accordingly, there is a need in the art to adapt to changing camera settings by recognizing and accommodating setting changes that alter the global appearance of captured video data between frames.