In video encoding according to, for example, the MPEG (Moving Picture Experts Group) standards, the evaluation function used in motion estimation is the sum of absolute difference of brightness values between so-called “macroblocks”. One macroblock refers to a rectangular area of a specific size. In the case where the size of a macroblock is l×m, the sum of absolute difference SAD is given by Equation 1 below.
\mathrm{SAD}(n', x', y') = \sum_{i=0}^{l} \sum_{j=0}^{m} \left| f_n(x+i,\ y+j) - f_{n-n'}(x+x'+i,\ y+y'+j) \right| \qquad [\text{Equation } 1]
Note that the notation | | represents an absolute value. Further, fn(x, y) represents the brightness value at the coordinates (x, y) on the image frame having the frame number n (i.e., the image frame to be encoded), whereas fn-n′ (x, y) represents the brightness value at the coordinates (x, y) on the image frame having the frame number “n−n′” (i.e., the reference frame).
In the motion estimation, a combination of n′, x′, and y′ which results in the smallest sum of absolute difference SAD is searched for, in order to determine the reference frame and to estimate the motion vector. In a method such as MPEG-4 AVC (H.264), the value of n′ may be designated arbitrarily to some extent. In a method such as MPEG-4 Simple Profile, the value of n′ is fixed to “1”.
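The SAD evaluation and the exhaustive search over (x′, y′) described above can be sketched as follows. This is an illustrative sketch, not the method of any cited literature; the function names, the search range, and the restriction to a single reference frame (a fixed n′) are assumptions made for brevity.

```python
import numpy as np

def sad(frame_n, frame_ref, x, y, dx, dy, l, m):
    """Sum of absolute differences (Equation 1): compares the block whose
    top-left corner is (x, y) in the frame to be encoded against the block
    displaced by (dx, dy) in the reference frame. Indices i = 0..l and
    j = 0..m follow the summation limits of Equation 1."""
    block = frame_n[y:y + m + 1, x:x + l + 1].astype(np.int32)
    cand = frame_ref[y + dy:y + dy + m + 1, x + dx:x + dx + l + 1].astype(np.int32)
    return int(np.abs(block - cand).sum())

def motion_search(frame_n, frame_ref, x, y, l, m, search_range):
    """Exhaustively search the (x', y') displacement minimizing SAD for one
    reference frame (i.e., with n' fixed, as in MPEG-4 Simple Profile)."""
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            # Skip displacements whose candidate block falls outside the frame.
            if (0 <= x + dx and x + dx + l < frame_ref.shape[1]
                    and 0 <= y + dy and y + dy + m < frame_ref.shape[0]):
                s = sad(frame_n, frame_ref, x, y, dx, dy, l, m)
                if best is None or s < best[0]:
                    best = (s, dx, dy)
    return best  # (minimum SAD, x', y')
```

Real encoders replace the exhaustive loop with fast search strategies (e.g., diamond or hexagonal search), but the evaluation function itself is the SAD of Equation 1.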
The brightness value of the light source at the time of image capturing may differ between the image frame having the frame number n and the image frame having the frame number “n−n′”. In such a case, the sum of absolute difference given by Equation 1 may become minimum between image frames that happen to be close to each other in brightness value, irrespective of the movement of objects appearing in the image frames, which leads to a decrease in the compression ratio.
In one technology suggested for removing fluorescent flicker, quadratic or higher order spectrum analysis is conducted on an image captured by an imaging unit and then the image is corrected (see Patent Literature 1, for example). The overview of the suggested technology is described with reference to FIG. 10.
An imaging device 100 operates as follows. First, an imaging unit 101 constructed of a CMOS (Complementary Metal Oxide Semiconductor) image sensor captures images. A brightness detecting unit 102 measures the brightness of each captured image on a line-by-line basis. A flicker spectrum detecting unit 103 analyzes the frequency components of the brightness values using the Fourier transform to extract fluorescent flicker components, thereby estimating the waveform representing the fluorescent flicker. Based on the estimated waveform representing the fluorescent flicker, a correction coefficient generating unit 104 generates such a brightness correction coefficient that would eliminate the fluorescent flicker components. A gain adjustment unit 105 adjusts the gain of the captured image based on the brightness correction coefficient to correct the brightness. Although not specifically designed for high-speed image capturing, the imaging device 100 is enabled to perform brightness correction more accurately, by addressing the deviation in image capturing timing among pixels, which is a problem inherent to CMOS image sensors, and estimating the flicker spectrum covering high frequency components.
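The chain from the brightness detecting unit 102 through the gain adjustment unit 105 can be illustrated with a simplified sketch. This is not the implementation of Patent Literature 1; the single-bin spectral model, the function name, and the `flicker_bin` parameter (the assumed frequency bin of the flicker component in units of cycles per frame height) are simplifying assumptions.

```python
import numpy as np

def flicker_correction_gains(frame, flicker_bin):
    """Estimate a per-line flicker waveform by Fourier analysis of the
    line-average brightness, then derive per-line gains that cancel it.
    Simplified sketch: a single known flicker frequency bin is assumed."""
    # Brightness per line, as measured by the brightness detecting unit.
    line_mean = frame.astype(np.float64).mean(axis=1)
    spectrum = np.fft.rfft(line_mean)
    # Isolate the assumed flicker component (flicker spectrum detection).
    flicker_only = np.zeros_like(spectrum)
    flicker_only[flicker_bin] = spectrum[flicker_bin]
    waveform = np.fft.irfft(flicker_only, n=line_mean.size)
    dc = line_mean.mean()
    # Gain that maps (dc + flicker) on each line back toward dc
    # (correction coefficient generation); applied by the gain adjuster.
    return dc / (dc + waveform)
```

Applying the returned gains line by line (e.g., `frame * gains[:, None]`) flattens the sinusoidal brightness ripple that a rolling-shutter CMOS sensor picks up from a fluorescent light source.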
Further, in a technology suggested for video encoding in consideration of the change of the light source brightness between image frames, the brightness of the reference frame is adjusted to match the brightness value of the image frame to be encoded, prior to the motion estimation (see Patent Literature 2, for example). The overview of the suggested technology is described with reference to FIG. 11.
An imaging device 200 operates as follows. First, an imaging unit 201 captures an image and records the captured image into an input frame recording unit 202. A gain detection unit 208 detects the difference in brightness between the image frame held in the input frame recording unit 202 and a reference frame held in a reference frame recording unit 207. A gain adjustment unit 209 adjusts the brightness of the reference frame based on the detected brightness difference. A motion estimation unit 203 performs motion estimation between the image frame to be encoded and the adjusted reference frame. An encoding unit 204 encodes a signal acquired as a result of the motion estimation to store encoded data in a recording unit 205. Further, a decoding unit 206 decodes the encoded data to obtain a new reference frame and stores the reference frame in the reference frame recording unit 207. In the above described manner, the imaging device 200 is enabled to maintain high compression ratio, even if the brightness value among image frames varies under the influence of fluorescent flicker, for example.
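The gain detection and gain adjustment steps (units 208 and 209) amount to scaling the reference frame so that its brightness matches the frame to be encoded. The sketch below reduces that to a global mean-brightness ratio; this is an assumption for illustration, as the cited literature does not necessarily use a single global gain.

```python
import numpy as np

def adjust_reference_gain(frame_to_encode, reference):
    """Scale the reference frame so that its mean brightness matches the
    frame to be encoded, prior to motion estimation (simplified sketch of
    the gain detection unit 208 and gain adjustment unit 209)."""
    src = frame_to_encode.astype(np.float64)
    ref = reference.astype(np.float64)
    gain = src.mean() / max(ref.mean(), 1e-9)  # guard against an all-black reference
    return np.clip(ref * gain, 0, 255).astype(np.uint8)
```

With the brightness difference removed, the SAD of Equation 1 again reflects object motion rather than light-source fluctuation, so the motion estimation unit 203 can find well-matching blocks and the compression ratio is maintained.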
In one technology belonging to the state of the art relating to parallelizing video encoding, a plurality of encoding units are employed to operate in parallel (see Patent Literature 3, for example). The overview of this technology is described with reference to FIG. 12. An imaging device shown in FIG. 12 operates as follows. First, an imaging unit 301 captures a video and a distribution unit 302 distributes the video frame by frame among encoding units 303, 304, and 305. The encoding units 303, 304, and 305 encode the received frames and output encoded data to a concatenating unit 306, where the respective pieces of encoded data are combined and stored in a recording unit 307. This technology is effective to improve the encoding speed and thus useful in such applications as high-speed image capturing, which requires high encoding capability.
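The distribute-encode-concatenate pipeline of FIG. 12 can be sketched as follows. The stand-in `encode_frame` function is a placeholder assumption (a real system would run an H.264 encoder or similar per unit), and a thread pool stands in for the parallel hardware encoding units.

```python
from concurrent.futures import ThreadPoolExecutor

def encode_frame(frame):
    """Placeholder for one encoding unit; a real implementation would
    produce an H.264 (or similar) bitstream for the frame."""
    return bytes(frame)  # trivially "encode" a list of byte values

def parallel_encode(frames, num_encoders=3):
    """Distribute frames among encoders working in parallel, then
    concatenate the encoded pieces in the original frame order, as the
    concatenating unit 306 does."""
    with ThreadPoolExecutor(max_workers=num_encoders) as pool:
        # map() preserves the input order even though the workers
        # run concurrently, so concatenation stays in frame order.
        encoded = list(pool.map(encode_frame, frames))
    return b"".join(encoded)
```

The key property is that the concatenation order is the original frame order, regardless of which encoding unit finishes first.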