In recent years, owing to, for example, improvements in the processing capacity of arithmetic circuits and increases in the number of pixels in display regions, display devices capable of allowing the observer to perceive three-dimensional visual effects, that is, so-called three-dimensional image display devices (to be referred to as 3D televisions hereinafter), have become available.
A 3D television often employs, for example, a method of selectively displaying images for the left and right eyes at a predetermined refresh rate, or a method of simultaneously displaying the images for the left and right eyes and using an optical member such as a lenticular film so that the observer sees a different image with each eye. The 3D television displays images or videos represented by images for the left and right eyes having a given difference (disparity) between them, which allows the observer to perceive depth information.
When a disparity occurs such that the angle formed by the straight lines connecting an object to the two eyeballs is larger than the convergence angle formed by the lines of sight connecting the eyeballs to the gaze point (which has no disparity), a human observer perceives the object as being in the foreground with respect to the gaze point. Conversely, when a disparity occurs such that this angle is smaller than the convergence angle, the observer perceives the object as being in the background with respect to the gaze point.
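The angle comparison described above can be illustrated with a minimal sketch. It assumes, for simplicity, that the gaze point and the object both lie straight ahead on the midline between the eyes; the function names and the eye separation of 0.065 m used in the usage note are hypothetical values chosen only for illustration.

```python
import math


def vergence_angle(eye_separation, distance):
    """Angle (radians) between the two lines joining a midline point to each eye."""
    return 2.0 * math.atan(eye_separation / (2.0 * distance))


def perceived_side(eye_separation, gaze_distance, object_distance):
    """Classify an object relative to the gaze point by comparing the two angles."""
    gaze = vergence_angle(eye_separation, gaze_distance)
    obj = vergence_angle(eye_separation, object_distance)
    if math.isclose(obj, gaze, rel_tol=1e-12):
        return "gaze plane"  # same angle: no disparity
    # A larger angle than the convergence angle means the object is nearer.
    return "foreground" if obj > gaze else "background"
```

With a gaze point 2 m away, `perceived_side(0.065, 2.0, 1.0)` classifies an object at 1 m as foreground, and `perceived_side(0.065, 2.0, 4.0)` classifies an object at 4 m as background.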
Nowadays, to keep pace with the spread of home-use 3D televisions, home-use image capturing apparatuses (3D cameras) such as digital cameras and digital video cameras have already been put on the market. As a result, users can now display and view, on a home-use 3D television, images or videos that they have captured themselves.
3D cameras capable of capturing images for binocular stereopsis include not only cameras with two imaging optical systems, one for the left eye and one for the right eye, but also cameras that use a single imaging optical system. More specifically, light beams having passed through different regions in the exit pupil of the single imaging optical system are captured independently, so as to obtain an image equivalent to the image for binocular stereopsis obtained by an image capturing apparatus that includes two imaging optical systems and has, as its base-line length, the distance between the centers of gravity of the regions through which the light beams have passed. This can be achieved by using an image sensor having a composite pixel structure (see FIG. 2), as described in Japanese Patent No. 4027113. Such an image sensor, which is used for focus detection of the phase difference detection scheme, includes a plurality of light-receiving elements in each pixel and uses a microlens to form images of different light beams on the respective light-receiving elements.
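As a rough illustration of the equivalent base-line length, the following sketch assumes a uniformly weighted circular exit pupil of radius R that is divided exactly into left and right halves; the actual pupil regions depend on the sensor and optics and are not specified here. Under that assumption, the center of gravity of each half disc lies 4R/(3π) from the center, so the two centers are 8R/(3π) apart. The function names are hypothetical.

```python
import math


def half_pupil_baseline(radius):
    """Distance between the centroids of the two halves of a uniform circular pupil."""
    # Each half-disc centroid is 4R/(3*pi) from the centre, on opposite sides.
    return 8.0 * radius / (3.0 * math.pi)


def sampled_baseline(radius, n=400):
    """Numerical cross-check: mean x of grid samples falling in each half of the disc."""
    step = 2.0 * radius / n
    left_sum = right_sum = 0.0
    left_count = right_count = 0
    for i in range(n):
        x = -radius + (i + 0.5) * step
        for j in range(n):
            y = -radius + (j + 0.5) * step
            if x * x + y * y > radius * radius:
                continue  # sample lies outside the pupil
            if x < 0:
                left_sum += x
                left_count += 1
            else:
                right_sum += x
                right_count += 1
    return right_sum / right_count - left_sum / left_count
```

For a pupil of unit radius, both functions give a base-line length of roughly 0.85, that is, somewhat less than the pupil radius.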
In focus detection of the phase difference detection scheme, light beams having passed through different regions in the exit pupil form, on different pixels, an image of a subject at a focal position and those of subjects in the background and foreground, respectively, with respect to the focal position, as shown in FIGS. 12A to 12C.
When the composite pixel structure of the image sensor includes two horizontally arranged light-receiving elements, the shift in horizontal position of a subject image, which is generated between images A and B output from the light-receiving elements of all pixels, varies for subject images at respective distances, as shown in FIGS. 12D to 12F. In an actual focus detection operation, the outputs from light-receiving elements a and b are combined in the column direction (or row direction) as outputs from pixel cell groups each having the same color to create images A and B, which are then converted into data, and the shift between corresponding points in images A and B is obtained by correlation calculation.
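The correlation calculation is not detailed here. One common way to find the shift between two line signals is to search for the offset that minimizes the mean absolute difference over the overlapping samples, and the following minimal sketch uses that approach as an assumption rather than as the method of the cited scheme. The function name `estimate_shift` and the integer-only search are hypothetical simplifications.

```python
def estimate_shift(image_a, image_b, max_shift):
    """Return the integer shift s that minimises mean |A[i] - B[i + s]| over the overlap."""
    best_shift, best_cost = 0, float("inf")
    for s in range(-max_shift, max_shift + 1):
        lo = max(0, -s)                              # first index valid in both signals
        hi = min(len(image_a), len(image_b) - s)     # one past the last valid index
        if hi <= lo:
            continue  # no overlap at this shift
        cost = sum(abs(image_a[i] - image_b[i + s]) for i in range(lo, hi)) / (hi - lo)
        if cost < best_cost:
            best_shift, best_cost = s, cost
    return best_shift
```

For example, if image B is image A displaced two samples to the right, the search returns 2, while two identical signals give 0, which corresponds to a subject at the focal position.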
In this manner, when an image sensor having a composite pixel structure is used to create an image for binocular stereopsis from light beams having passed through different regions in the exit pupil of an imaging optical system, a plane in which a gaze point is present in an image capturing apparatus including two imaging optical systems corresponds to a focal position having almost no disparity. That is, when the image capturing apparatus is focused on a main subject to shoot an image using one imaging optical system, the main subject is imaged on almost the same pixels in an image for left eye (image A) and an image for right eye (image B), and naturally has no disparity.
When such an image for binocular stereopsis, shot by an image capturing apparatus (monocular image capturing apparatus) including one imaging optical system and an image sensor having a composite pixel structure, is displayed, it is hard for the observer to perceive a stereoscopic effect of the main subject because the main subject never has any disparity.
Also, when moving images shot while focusing on a main subject, or continuously shot still images, are sequentially read out and reproduced on a display device, the image of the main subject has no disparity even after the main subject moves, so it is instead the disparity of any subject not at the focal position that changes.
For example, in scene 1, when an image capturing apparatus, a main subject, a near subject, and a far subject have the positional relationships shown in FIG. 13A, an image for left eye 1301 and an image for right eye 1302 are shot as shown in FIG. 13B. As can be seen from the images for left and right eye 1301 and 1302, the near subject, which is in the foreground with respect to the main subject at the focal position, and the far subject, which is in the background with respect to the main subject, have horizontal differences (Zn and Zf1, respectively) between these images. When such an image for binocular stereopsis is displayed on a display device, the main subject at the focal position has no difference between the captured images for left and right eye, so the observer perceives it as being at the position of the display surface, as shown in FIG. 13C. Also, the observer perceives the near subject and far subject as being in the foreground and background, respectively, with respect to the display surface.
Then, in scene 2, when the main subject moves to the same depth position as that of the near subject while the near subject and far subject stand still, as shown in FIG. 13D, an image for left eye 1303 and an image for right eye 1304 are shot as shown in FIG. 13E. As described above, upon focusing on the main subject, a subject on a plane in which a focal position is present has no disparity, so the near subject and main subject have no differences between the captured images for left and right eye. In contrast to this, the far subject moves away from the focal position in the depth direction, and therefore has a difference Zf2 between the captured images for left and right eye, which is larger than the difference Zf1 between these images in scene 1.
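The growth of the difference from Zf1 to Zf2 can be illustrated with a minimal sketch under an idealized thin-lens model, which is an assumption made here for illustration only. In that model, two rays passing through pupil points separated by a base-line length b land on the sensor with a separation of b·s·(1/z_f − 1/z), where z_f is the focused distance, z is the subject distance, and s is the lens-to-sensor distance. The function name and all numerical values (focal length, base-line length, subject distances) are hypothetical.

```python
def pupil_split_disparity(subject_distance, focus_distance, focal_length, baseline):
    """Signed sensor-plane shift between images A and B (thin-lens assumption).

    Zero at the focused distance, positive behind it, negative in front of it.
    """
    # Lens-to-sensor distance that brings focus_distance into focus.
    sensor_distance = focal_length * focus_distance / (focus_distance - focal_length)
    return baseline * sensor_distance * (1.0 / focus_distance - 1.0 / subject_distance)


# Hypothetical layout in metres: near subject at 2, far subject at 8.
F, B = 0.05, 0.01
# Scene 1: main subject (and focus) at 4 m.
zn = pupil_split_disparity(2.0, 4.0, F, B)
zf1 = pupil_split_disparity(8.0, 4.0, F, B)
# Scene 2: main subject has moved to 2 m, so the focus follows it.
zn_scene2 = pupil_split_disparity(2.0, 2.0, F, B)
zf2 = pupil_split_disparity(8.0, 2.0, F, B)
```

With these values the near subject's difference falls to zero in scene 2, while the far subject's difference grows (zf2 is larger than zf1) even though neither subject has moved.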
At this time, when the images for left and right eye 1303 and 1304 are displayed on the display device, the observer perceives the main subject and near subject as standing still at positions on the display surface, as shown in FIG. 13F, so it is hard for the observer to perceive the movement of the main subject. That is, although the near subject is standing still, the observer perceives it as having moved to the position of the display surface where the main subject is present. Likewise, although the far subject is standing still, the observer perceives it as having moved farther into the background with respect to the display surface where the main subject is present.