1. Field of the Invention
The present invention relates to a technique of suppressing degradation of image quality caused by a foreign substance adhering to the surface of an optical low-pass filter or the like arranged in front of an image sensor such as a CCD or CMOS sensor in an image capturing apparatus and, more particularly, to a technique of suppressing degradation of image quality caused by a foreign substance when recording a moving image.
2. Description of the Related Art
When the lens is detached from the camera body of a lens-interchangeable digital camera, motes floating in the air may enter the camera body. The camera also incorporates various mechanically operating units, such as a shutter mechanism. When these mechanical units operate, dust such as metal powder may be generated inside the camera body.
When a foreign substance such as dust or a mote adheres to the surface of an optical low-pass filter, which is an optical element that is arranged in front of an image sensor and forms the image capturing unit of the digital camera, the shadow of the foreign substance appears in the captured image, degrading the image quality.
A camera using silver halide film advances the film for every shot. Hence, the same foreign substance never appears at the same position in consecutive images. A digital camera, however, has no operation of advancing a film frame for every shot, and therefore the same foreign substance appears at the same position in successive captured images.
To solve this problem, a method has been proposed of correcting a pixel that captures the image of a foreign substance by using the signals of neighboring pixels or the like. As a technique of correcting such a pixel, for example, Japanese Patent Laid-Open No. 6-105241 proposes a pixel defect correction method of correcting a pixel defect of an image sensor. Japanese Patent Laid-Open No. 2004-242158 proposes a method of giving an image file recorded in a dust acquisition mode an extension or the like different from that of a normal image, in order to simplify the setting of position information of a pixel defect. According to this method, a PC (Personal Computer) automatically identifies a dust information image, and corrects a target image using the information.
Recently, techniques of handling moving image information as digital data and encoding it at a high compression ratio with high image quality, in order to store and transmit the encoded data, have been proposed and have become widespread.
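The idea of correcting a dust pixel from the signals of neighboring pixels can be illustrated with a minimal sketch. It is not the method of either cited publication; the function name `correct_dust_pixels`, the boolean mask representation, and the choice of a plain mean over the eight surrounding pixels are all assumptions made for illustration.

```python
def correct_dust_pixels(image, dust_mask):
    """Replace each flagged pixel with the mean of its unflagged 8-neighbours.

    `image` and `dust_mask` are equally sized lists of rows. Both names and
    the averaging rule are hypothetical, chosen only for this sketch.
    """
    height, width = len(image), len(image[0])
    corrected = [row[:] for row in image]  # leave the input untouched
    for y in range(height):
        for x in range(width):
            if not dust_mask[y][x]:
                continue
            # Collect neighbours that are inside the image and not dust.
            neighbours = [
                image[ny][nx]
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x) and not dust_mask[ny][nx]
            ]
            if neighbours:  # keep the original value if nothing usable is nearby
                corrected[y][x] = sum(neighbours) / len(neighbours)
    return corrected


# Usage: a uniform 3x3 patch whose centre pixel is shadowed by dust.
patch = [[10, 10, 10], [10, 0, 10], [10, 10, 10]]
mask = [[False, False, False], [False, True, False], [False, False, False]]
repaired = correct_dust_pixels(patch, mask)
```

A dust shadow usually covers many adjacent pixels, so a practical correction would need a larger neighbourhood than this sketch uses.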
Motion JPEG (Joint Photographic Experts Group) encodes a moving image by applying still image encoding (e.g., JPEG encoding) to each frame. Although JPEG encoding basically targets still images, products which apply JPEG encoding to even moving images by high-speed processing have come into practical use.
An outline of JPEG encoding will be explained briefly. Image data is divided into blocks of a predetermined size (e.g., blocks each having 8×8 pixels). Each block undergoes a 2D discrete cosine transform, and the transform coefficients are quantized linearly or non-linearly. The quantized transform coefficients undergo Huffman coding (variable length coding). More specifically, the difference between the DC component of a block and that of a neighboring block is Huffman-coded. The AC components are serialized from low-frequency components to high-frequency components by zig-zag scanning, and each set of a run of invalid (“0”) components and the subsequent valid component is Huffman-coded.
An encoding method aiming at a higher compression ratio and higher image quality is H.264 (MPEG4-Part10 AVC). It is known that H.264 requires a larger amount of calculation for encoding and decoding than conventional encoding methods such as MPEG2 and MPEG4, but can achieve higher coding efficiency (see ISO/IEC 14496-10, “Advanced Video Coding”).
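The steps that prepare a quantized block for Huffman coding can be sketched as follows. This is a simplified illustration rather than a conforming JPEG encoder: the function names are hypothetical, the Huffman tables themselves are omitted, and the special handling of zero runs longer than 15 is left out.

```python
def zigzag_order(n=8):
    """Return (row, col) positions of an n x n block in zig-zag scan order."""
    return sorted(
        ((r, c) for r in range(n) for c in range(n)),
        # Walk anti-diagonals; alternate the direction on each diagonal.
        key=lambda rc: (rc[0] + rc[1], rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )


def symbols_for_block(quantized_block, previous_dc):
    """Return the DC difference and the (zero_run, value) pairs of a block."""
    scan = [quantized_block[r][c] for r, c in zigzag_order(len(quantized_block))]
    dc_difference = scan[0] - previous_dc  # DC is coded relative to the neighbour
    pairs, zero_run = [], 0
    for coefficient in scan[1:]:           # AC components, low to high frequency
        if coefficient == 0:
            zero_run += 1
        else:
            pairs.append((zero_run, coefficient))
            zero_run = 0
    # Trailing zeros are not emitted; an end-of-block symbol would cover them.
    return dc_difference, pairs


# Usage: a mostly empty 8x8 block with three nonzero quantized coefficients.
block = [[0] * 8 for _ in range(8)]
block[0][0], block[0][1], block[2][0] = 50, 3, -2
dc_diff, ac_pairs = symbols_for_block(block, previous_dc=45)
# dc_diff is 5 and ac_pairs is [(0, 3), (1, -2)]
```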
FIG. 16 is a diagram showing the arrangement of an image processing apparatus which compresses image data by H.264. In FIG. 16, input image data is divided into macroblocks, which are sent to a subtracter 401. The subtracter 401 calculates the difference between image data and a predicted value, and outputs it to an integer DCT (Discrete Cosine Transform) transform unit 402. The integer DCT transform unit 402 executes integer DCT transform for the input data, and outputs the transformed data to a quantization unit 403. The quantization unit 403 quantizes the input data. The quantized data is sent as difference image data to an entropy encoder 415, while it is inversely quantized by an inverse quantization unit 404, and undergoes inverse integer DCT transform by an inverse integer DCT transform unit 405. An adder 406 adds a predicted value to the inversely transformed data, reconstructing an image.
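The signal path through the subtracter 401, the quantization unit 403, the inverse quantization unit 404, and the adder 406 can be sketched in a few lines. The sketch makes strong simplifying assumptions: the integer DCT and its inverse (units 402 and 405) are omitted, the quantizer is a plain uniform one with a hypothetical step size `qstep`, and a block is a flat list of samples. It shows only why the encoder reconstructs the image itself, namely so that later predictions use the same data a decoder will have.

```python
def encode_block(samples, predicted, qstep):
    """Return (quantized levels, reconstructed samples) for one block.

    Hypothetical helper: the transform stage is left out and a uniform
    quantizer stands in for the H.264 quantization rules.
    """
    residual = [s - p for s, p in zip(samples, predicted)]    # subtracter 401
    levels = [round(r / qstep) for r in residual]             # quantization 403
    dequantized = [level * qstep for level in levels]         # inverse quantization 404
    reconstructed = [p + d for p, d in zip(predicted, dequantized)]  # adder 406
    return levels, reconstructed


# Usage: four samples predicted by a flat value of 98, step size 4.
levels, reconstructed = encode_block([101, 105, 90, 97], [98, 98, 98, 98], qstep=4)
# levels is [1, 2, -2, 0]; reconstructed is [102, 106, 90, 98]
```

The `levels` correspond to the difference image data passed to the entropy encoder, while `reconstructed` is what would be stored in the frame memories.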
The reconstructed image is sent to a frame memory 407 for intra (intra-frame) prediction, while it undergoes deblocking filter processing by a deblocking filter 409, and then is sent to a frame memory 410 for inter (inter-frame) prediction. The image in the intra prediction frame memory 407 is used for intra prediction by an intra prediction unit 408. The intra prediction uses the value of a pixel adjacent to an encoded block as a predicted value.
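As an illustration of using adjacent pixel values as predicted values, the following sketch shows two simple predictors: one that copies the row of pixels above the block downward, and one that fills the block with the rounded mean of the pixels above and to the left. The function names are hypothetical, and the sketch does not reproduce the mode selection or the edge-availability rules of H.264.

```python
def predict_vertical(above_row):
    """Predict a square block by repeating the row of pixels above it."""
    size = len(above_row)
    return [list(above_row) for _ in range(size)]


def predict_dc(above_row, left_column):
    """Predict a square block as the rounded mean of the adjacent pixels."""
    neighbours = list(above_row) + list(left_column)
    # Integer mean with rounding to nearest.
    dc = (sum(neighbours) + len(neighbours) // 2) // len(neighbours)
    size = len(above_row)
    return [[dc] * size for _ in range(size)]


# Usage: a 4x4 block with known reconstructed neighbours.
above = [10, 20, 30, 40]
left = [10, 10, 10, 10]
vertical_prediction = predict_vertical(above)  # every row is [10, 20, 30, 40]
dc_prediction = predict_dc(above, left)        # every sample is 18
```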
The image in the inter prediction frame memory 410 is formed from a plurality of pictures, as will be described later. A plurality of pictures are classified into two lists “List0” and “List1”. A plurality of pictures classified into the two lists are used for inter prediction by an inter prediction unit 411. After the inter prediction, a memory controller 413 updates internal images. In the inter prediction by the inter prediction unit 411, a predicted image is determined using an optimal motion vector based on the result of motion detection between image data of different frames by a motion detection unit 412.
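Motion detection between frames is commonly illustrated with exhaustive block matching, sketched below. This is an assumption for illustration: the source does not state which search method the motion detection unit 412 uses. The function names, the sum of absolute differences (SAD) cost, and the `search_range` parameter are all hypothetical.

```python
def block_sad(current, reference, cx, cy, rx, ry, size):
    """Sum of absolute differences between two size x size blocks."""
    return sum(
        abs(current[cy + j][cx + i] - reference[ry + j][rx + i])
        for j in range(size)
        for i in range(size)
    )


def find_motion_vector(current, reference, cx, cy, size, search_range):
    """Return (dx, dy, cost) of the best match within +/- search_range."""
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            rx, ry = cx + dx, cy + dy
            # Skip candidates that fall outside the reference picture.
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue
            cost = block_sad(current, reference, cx, cy, rx, ry, size)
            if best is None or cost < best[2]:
                best = (dx, dy, cost)
    return best


# Usage: the 2x2 block at (3, 3) is a copy of the reference content
# located 2 pixels to the left and 1 pixel below.
reference = [[y * 8 + x for x in range(8)] for y in range(8)]
current = [[0] * 8 for _ in range(8)]
for j in range(2):
    for i in range(2):
        current[3 + j][3 + i] = reference[3 + j + 1][3 + i - 2]
vector = find_motion_vector(current, reference, cx=3, cy=3, size=2, search_range=2)
# vector is (-2, 1, 0)
```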
As a result of intra prediction and inter prediction, a selector 414 selects an optimal prediction result. The motion vector is sent to the entropy encoder 415, and encoded together with the difference image data, forming an output bit stream.
H.264 inter prediction will be explained in detail with reference to FIGS. 17 to 20.
The H.264 inter prediction can use a plurality of pictures for prediction. Hence, two lists (“List0” and “List1”) are prepared to specify a reference picture. A maximum of five reference pictures can be assigned to each list.
P-pictures use only “List0” to mainly perform forward prediction. B-pictures use “List0” and “List1” to perform bidirectional prediction (or only forward or backward prediction). That is, “List0” holds pictures mainly for forward prediction, and “List1” holds pictures mainly for backward prediction.
FIG. 17 shows an example of a reference list used in encoding. This example assumes that the ratio of I-, P-, and B-pictures is a standard one, that is, I-pictures are arranged at an interval of 15 frames, P-pictures are arranged at an interval of three frames, and B-pictures between I- and P-pictures are arranged at an interval of two frames. In FIG. 17, image data 1001 is obtained by arranging pictures in the display order. Each square in the image data 1001 describes the type of picture and a number representing the display order. For example, a picture I15 is an I-picture whose display order is 15, and is used for only intra prediction. A picture P18 is a P-picture whose display order is 18, and is used for only forward prediction. A picture B16 is a B-picture whose display order is 16, and is used for bidirectional prediction.
The encoding order is different from the display order, and data are encoded in the prediction order. In FIG. 17, data are encoded in the order of “I15, P18, B16, B17, P21, B19, B20, . . . ”
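The reordering from display order to encoding order can be sketched as follows, assuming the simple arrangement of FIG. 17 in which each B-picture depends on the next I- or P-picture in display order. The function name and the string labels are hypothetical.

```python
def encoding_order(display_order):
    """Reorder pictures so each I/P-picture precedes the B-pictures before it."""
    coded, waiting_b = [], []
    for picture in display_order:
        if picture.startswith("B"):
            waiting_b.append(picture)   # must wait for its backward reference
        else:
            coded.append(picture)       # I- or P-picture is coded first
            coded.extend(waiting_b)     # then the B-pictures it unblocks
            waiting_b = []
    # Assumption: B-pictures with no following I/P-picture are flushed last.
    coded.extend(waiting_b)
    return coded


# Usage with the pictures named in the text.
order = encoding_order(["I15", "B16", "B17", "P18", "B19", "B20", "P21"])
# order is ["I15", "P18", "B16", "B17", "P21", "B19", "B20"]
```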
In FIG. 17, a reference list (List0) 1002 temporarily holds pictures that have been encoded and decoded. For example, inter prediction for a picture P21 (a P-picture whose display order is 21) refers to pictures in the reference list (List0) 1002 which have already been encoded and decoded. In the example shown in FIG. 17, the reference list 1002 holds pictures P06, P09, P12, I15, and P18.
In inter prediction, a motion vector having an optimal predicted value is obtained for each macroblock from the reference pictures in the reference list (List0) 1002, and encoded. The pictures in the reference list (List0) 1002 are sequentially assigned reference picture numbers, which identify them and are separate from the numbers shown in FIG. 17.
After the end of encoding the picture P21, the picture P21 is newly decoded and added to the reference list (List0) 1002. The oldest reference picture (in this case, the picture P06) is deleted from the reference list (List0) 1002. Encoding proceeds in the order of pictures B19, B20, and P24. FIG. 18 shows the state of the reference list (List0) 1002 at this time.
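The update described above behaves like a sliding window over decoded pictures. A minimal sketch, assuming the five-picture limit stated earlier and a hypothetical function name, is:

```python
def update_reference_list(reference_list, decoded_picture, max_pictures=5):
    """Append a newly decoded picture and drop the oldest ones beyond the limit."""
    updated = list(reference_list) + [decoded_picture]
    return updated[-max_pictures:]  # keep only the most recent pictures


# Usage: the state of List0 before and after encoding picture P21.
list0 = ["P06", "P09", "P12", "I15", "P18"]
list0 = update_reference_list(list0, "P21")
# list0 is ["P09", "P12", "I15", "P18", "P21"]
```

This covers only the default first-in, first-out behavior; it does not model long-term reference pictures or explicit reordering.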
FIG. 19 shows a change of the reference list for each picture.
In FIG. 19, pictures are encoded sequentially from the top. FIG. 19 shows the picture being encoded and the contents of the reference lists (List0 and List1) for it. When a P-picture (or I-picture) is encoded as shown in FIG. 19, the reference lists (List0 and List1) are updated, and the oldest pictures are deleted from them. In this example, the reference list (List1) holds only one picture. This is because, as the number of pictures referred to for backward prediction increases, the amount of data that must be buffered before decoding also increases. In other words, backward pictures excessively distant from the picture being encoded are not referred to.
In this example, I- and P-pictures are referred to, and all I- and P-pictures are sequentially added to the reference lists (List0 and List1). Only I-pictures are used in the reference list (List1) for backward prediction because this picture arrangement is considered to be the most popular one. However, the picture arrangement in the reference list is merely an example of the most popular one, and H.264 itself has a high degree of freedom for the configuration of the reference list.
For example, not all I- and P-pictures need be added to the reference list, and B-pictures can also be added to the reference list. Also, H.264 defines a long-term reference list of pictures which stay in the reference list until an explicit instruction is received. FIG. 20 shows a change of the reference list when adding B-pictures to the reference list. When adding B-pictures to the reference list, encoded pictures may be added to the reference list every time all B-pictures are coded.
Compact digital cameras capable of recording a moving image using these encoding methods have also been developed and commercialized. The user can easily view an image using such a digital camera, a personal computer, a DVD player, or the like.
In this situation, there is a growing demand to record higher-resolution moving images with a larger number of pixels not only with a compact digital camera but also with a lens-interchangeable digital camera. However, as described above, dust adheres to the surface of an image sensor owing to a variety of factors in the lens-interchangeable digital camera. If a moving image is recorded while dust adheres to the surface of the image sensor, the dust image may always appear at the same position during playback of the moving image.
According to a conventional dust removal method in a lens-interchangeable digital camera, image data is recorded together with the information necessary for dust removal (e.g., information on the position and size of dust). The image is later loaded into a personal computer or the like, and the dust image is removed by image processing. That is, the dust image remains in the recorded image data. For a still image, dust removal is executed for each still image. For a moving image, in contrast, dust removal must be performed over the entire recording time, which is time-consuming.
In still image shooting, even when the camera internally executes dust removal, the dust removal processing takes a long time, and when the buffer memory in the camera becomes full, shooting can be temporarily interrupted. However, in moving image recording, image data must be successively captured, so shooting cannot be temporarily interrupted.