The present invention relates to a viewpoint position detection apparatus and method for detecting the viewpoint position of a person to be measured and, more particularly, to a viewpoint position detection apparatus and method which can achieve both high-speed processing and high detection precision.
The present invention also relates to a stereoscopic image display apparatus and, more particularly, to an apparatus suitably used when image information is stereoscopically displayed on a display device (display) such as a television, video, computer monitor, game machine, or the like, so that the image can be observed stereoscopically and satisfactorily without using special spectacles.
As conventional stereoscopic image observation methods, a method of observing disparity images based on different polarized light states by the right and left eyes using polarized light spectacles, a method of guiding predetermined ones of a plurality of disparity images to the eyeballs of the observer using a lenticular lens, and the like have been proposed.
For example, Japanese Patent Laid-Open No. 09-311294 discloses an apparatus using a rear cross lenticular scheme. FIG. 11 is a perspective view showing the principal part of an example of a stereoscopic image display apparatus using the rear cross lenticular scheme. Referring to FIG. 11, reference numeral 6 denotes a display device for displaying an image. The display device 6 comprises, e.g., a liquid crystal element (LCD). In FIG. 11, a polarization plate, color filter, electrodes, black matrix, anti-reflection film, and the like are not shown.
Reference numeral 10 denotes a backlight (surface illuminant) which serves as an illumination light source. A mask substrate (mask) 7, on which a mask pattern having checkered apertures 8 is formed, is placed between the display device 6 and the backlight 10. The mask pattern is prepared by patterning a metal deposition film such as chromium, a light absorbing material, or the like on the mask substrate 7 formed of glass or a resin. The backlight 10, mask substrate 7, and the like are building components of the light source.
First and second lenticular lenses 3 and 4 made of a transparent resin or glass are interposed between the mask substrate 7 and display device 6. The first lenticular lens 3 is a vertical cylindrical lens array constructed by lining up vertical cylindrical lenses, which are elongated in the vertical direction, in the right-and-left direction, and the second lenticular lens 4 is a horizontal cylindrical lens array constructed by lining up horizontal cylindrical lenses, which are elongated in the horizontal direction, in the up-and-down direction.
An image to be displayed on the display device 6 is a horizontal stripe image, which is formed by segmenting right and left disparity images R and L into a large number of horizontal stripe pixels R and L in the up-and-down direction, and alternately arranging these pixels from the top of the screen in the order of, e.g., L, R, L, R, L, R, . . . , as shown in FIG. 11.
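The stripe synthesis described above can be sketched in Python as follows. This is only an illustrative sketch: it assumes that each horizontal stripe is exactly one pixel row, that each disparity image is a list of rows, and that the top row is taken from the left image. The function name is hypothetical and is not part of the disclosed apparatus.

```python
def make_horizontal_stripe_image(left, right):
    """Interleave rows of two disparity images in the order L, R, L, R, ...

    Assumption: one stripe equals one pixel row; row y of the result is
    taken from row y of the left image when y is even (counting from 0),
    and from row y of the right image when y is odd.
    """
    if len(left) != len(right):
        raise ValueError("disparity images must have the same number of rows")
    return [left[y] if y % 2 == 0 else right[y] for y in range(len(left))]


# Toy disparity images: every pixel is labeled with the eye it belongs to.
left = [["L"] * 4 for _ in range(6)]
right = [["R"] * 4 for _ in range(6)]

stripe = make_horizontal_stripe_image(left, right)
row_labels = [row[0] for row in stripe]
```

In this toy case `row_labels` lists the source image of each row from the top of the screen downward.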
Light coming from the backlight 10 is transmitted through the apertures 8 of the mask substrate 7 and illuminates the display device 6, and right and left stripe pixels R and L are separately observed by the right and left eyes of the observer.
More specifically, the mask substrate 7 is illuminated with light coming from the backlight 10, and light components emerge from the apertures 8. The first lenticular lens 3 is placed on the observer side of the mask substrate 7, and the lens curvature is designed to locate the mask substrate 7 at nearly the focal point positions of the respective cylindrical lenses. In this section, since the second lenticular lens 4 has no optical effect, a light beam emerging from one point on the aperture 8 is converted into nearly collimated light.
Each pair consisting of an aperture and a light-shielding portion of the mask pattern is set to correspond approximately to one pitch of the first lenticular lens 3.
By determining the pitch of the first lenticular lens and that of the pair of aperture and light-shielding portion of the mask pattern on the basis of the relationship between the optical distance from a predetermined position of the observer to the first lenticular lens 3 and that from the first lenticular lens 3 to the mask pattern, light leaving the apertures 8 can be uniformly focused on the right or left eye across the total width of the screen. In this manner, the right and left stripe pixels on the display device 6 are separately observed by the right and left eye regions in the horizontal direction.
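The text above states only that the two pitches are determined from the two optical distances; it does not give a formula. One plausible reading, offered here purely as an assumption, is the similar-triangle condition under which rays from the observer position through the center of each cylindrical lens land at the same place within each aperture and light-shielding pair. The Python sketch below evaluates that condition; the function and parameter names are hypothetical.

```python
def mask_pair_pitch(lens_pitch, observer_to_lens, lens_to_mask):
    """Pitch of one aperture/light-shielding pair under a similar-triangle model.

    Assumption (not stated in the text): rays from a single observer point
    through successive lens centers, separated by lens_pitch, reach the mask
    plane separated by lens_pitch * (L1 + L2) / L1, where L1 is the optical
    distance from the observer to the lens and L2 the optical distance from
    the lens to the mask pattern.
    """
    if observer_to_lens <= 0 or lens_to_mask < 0:
        raise ValueError("distances must be positive")
    return lens_pitch * (observer_to_lens + lens_to_mask) / observer_to_lens


# Hypothetical values in millimeters, chosen only for illustration.
pair_pitch = mask_pair_pitch(lens_pitch=1.0, observer_to_lens=600.0, lens_to_mask=3.0)
```

Under this model the pair pitch is slightly larger than the lens pitch, which is consistent with the statement that a pair corresponds nearly, rather than exactly, to one lens pitch.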
The second lenticular lens 4 focuses all light beams emerging from the respective points on the apertures 8 of the mask 7 onto the right- or left-eye stripe pixels on the display device 6. The light beams which illuminate and are transmitted through the display device 6 diverge only in the vertical direction in correspondence with the numerical aperture (NA) upon focusing so as to provide an observation region where right and left stripe pixels can be uniformly separately observed from a predetermined eye level of the observer over the total height of the screen.
However, as the field angle of such a stereoscopic image display apparatus is narrow, when the viewpoint of the observer falls outside the field angle, stereoscopic display cannot be recognized. For this reason, a technique for broadening the stereoscopic view region by detecting the viewpoint position of the observer and controlling image display in response to movement of the viewpoint position has been proposed. For example, Japanese Patent Laid-Open No. 10-232367 discloses a technique for broadening the stereoscopic view region by moving a mask pattern or lenticular lens parallel to the display surface.
FIG. 12 shows a stereoscopic image display apparatus disclosed in Japanese Patent Laid-Open No. 10-232367. The same reference numerals in FIG. 12 denote the same building components as those in FIG. 11, and a detailed description thereof will be omitted. Since the stereoscopic image display apparatus shown in FIG. 12 uses a single lenticular lens, it does not have the second lenticular lens 4 shown in FIG. 11.
In the stereoscopic image display apparatus with this arrangement, control according to the movement of an observer 54 is done as follows. A position sensor 51 detects any horizontal deviation of the observer 54 from a predetermined reference position, and sends that information to a control unit 52. The control unit 52 outputs an image control signal to a display drive circuit 50 in accordance with this deviation information. The display drive circuit 50 displays a first or second horizontal stripe image on the display 6. At the same time, the control unit 52 generates an actuator drive signal based on the deviation information to drive an actuator 53, which moves the mask pattern 7 in the horizontal direction, thereby moving the mask pattern 7 to the best position where the observer 54 can separate right and left stripe images. As a result, even when the viewpoint position of the observer 54 has changed, a broad stereovision range can be assured.
When display is controlled in accordance with the viewpoint position of the observer, low detection precision and a long detection processing time hinder image display suitable for the viewpoint position of the observer. For this reason, it is very important for the performance of the display apparatus to detect the viewpoint position of the observer with higher precision within a shorter period of time.
As methods for detecting the viewpoint position of the observer (person to be measured), the following methods are available:
1) Method of irradiating observer with infrared light, and detecting light reflected by retina
(Reference 1-a) Banno, "Design Method of Pupil Photographing Optical System for Viewpoint Detection", Journal of The Institute of Electronics, Information and Communication Engineers D-II, Vol. J74-D-II, No. 6, pp. 736-747, June, 1991
(Reference 1-b) U.S. Pat. No. 5,016,282
2) Method of detecting eye of observer by image processing of visible image (e.g., Sakaguchi et al., "Real-time Face Expression Recognition Using Two-dimensional Discrete Cosine Transform of Image", Journal of The Institute of Electronics, Information and Communication Engineers D-II, Vol. J80-D-II, No. 6, pp. 1547-1554, June, 1997)
3) Method of detecting eye of observer by image processing using infrared image and visible image (e.g., Japanese Patent Laid-Open No. 8-287216)
Method 1) exploits the fact that the pupil of a human being retroreflects near infrared light (returns light in a direction agreeing with the incoming direction). Light reflected by the pupil is obtained as a sharp reflection peak, and normally exhibits higher reflectance than, e.g., a face. Hence, by sensing an image of the observer using an infrared image sensing apparatus in which a light source is coaxial with the optical axis, the pupil portion alone is sensed with higher luminance. When the sensed image is binarized by an appropriate threshold value, an accurate viewpoint position can be detected from the extracted pupil position.
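The binarization step of method 1) can be illustrated with the following Python sketch. It is an assumption-laden illustration rather than the cited method itself: the image is a list of rows of gray levels, the threshold is supplied by the caller, bright pixels are grouped by 4-connectivity, and each group's centroid is taken as a pupil candidate. The function name is hypothetical.

```python
def find_bright_blobs(image, threshold):
    """Binarize an infrared image and return centroids of bright regions.

    Pixels with a value >= threshold are treated as pupil candidates.
    Candidates are grouped into 4-connected regions, and the (x, y)
    centroid of each region is returned in raster-scan order.
    """
    height, width = len(image), len(image[0])
    seen = [[False] * width for _ in range(height)]
    centroids = []
    for y0 in range(height):
        for x0 in range(width):
            if seen[y0][x0] or image[y0][x0] < threshold:
                continue
            # Flood-fill one connected bright region starting at (x0, y0).
            stack, pixels = [(x0, y0)], []
            seen[y0][x0] = True
            while stack:
                x, y = stack.pop()
                pixels.append((x, y))
                for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
                    inside = 0 <= nx < width and 0 <= ny < height
                    if inside and not seen[ny][nx] and image[ny][nx] >= threshold:
                        seen[ny][nx] = True
                        stack.append((nx, ny))
            cx = sum(p[0] for p in pixels) / len(pixels)
            cy = sum(p[1] for p in pixels) / len(pixels)
            centroids.append((cx, cy))
    return centroids


# Toy infrared frame: two bright spots on a dark background.
ir = [
    [10, 10, 10, 10, 10, 10, 10, 10],
    [10, 200, 210, 10, 10, 10, 220, 10],
    [10, 205, 215, 10, 10, 10, 10, 10],
    [10, 10, 10, 10, 10, 10, 10, 10],
]
pupils = find_bright_blobs(ir, threshold=128)
```

A practical detector would additionally have to reject bright regions caused by spectacle reflections, which is one of the drawbacks of method 1) noted in this description.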
In method 2), the observer position within the image sensing range is limited in advance, and the observer is made to blink in that state. The eye region is extracted from inter-frame difference images of the visible image, and the eyes are then detected by pattern matching with templates generated from the extracted eye region.
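A blink-based extraction of this kind might be sketched as follows in Python. The sketch assumes two grayscale frames of equal size, one with the eyes open and one with the eyes closed, and a caller-supplied change threshold; it returns the bounding box of the pixels that changed. The function name and the threshold value are hypothetical.

```python
def changed_region(prev_frame, next_frame, min_change):
    """Return the bounding box (x_min, y_min, x_max, y_max) of changed pixels.

    A pixel counts as changed when the absolute gray-level difference
    between the two frames is at least min_change. Returns None when no
    pixel changed, e.g. when the observer did not blink.
    """
    changed = [
        (x, y)
        for y, (row_a, row_b) in enumerate(zip(prev_frame, next_frame))
        for x, (a, b) in enumerate(zip(row_a, row_b))
        if abs(a - b) >= min_change
    ]
    if not changed:
        return None
    xs = [p[0] for p in changed]
    ys = [p[1] for p in changed]
    return (min(xs), min(ys), max(xs), max(ys))


# Toy frames: the dark pupil pixels vanish while the eyelid is closed.
eyes_open = [
    [120, 120, 120, 120, 120, 120],
    [120, 120, 20, 20, 120, 120],
    [120, 120, 120, 120, 120, 120],
]
eyes_closed = [[120] * 6 for _ in range(3)]

eye_box = changed_region(eyes_open, eyes_closed, min_change=50)
```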
In method 3), an infrared image and visible color image are sensed at the same time, and after face regions are extracted from these images, a feature region such as an eye is detected using, e.g., pattern matching. The infrared image is used to extract a person candidate region and to determine a temperature threshold value, which is used upon extracting a flesh tone region from the color image.
However, in method 1), since the observer must be continuously irradiated with relatively intense infrared light, there is a fear of adverse influences of infrared light on the observer. Also, since light reflected by the retina is used, detection is disabled if the observer blinks. Furthermore, when the observer wears spectacles, operation errors readily occur due to light reflected by the spectacles.
In addition, in the method of irradiating the object with infrared light, the intensity of infrared light must be adjusted in accordance with the observation distance, resulting in a complicated mechanism.
Furthermore, owing to dilation/constriction of the pupil depending on the ambient illuminance, and the direction of the line of sight of the person to be measured, the pupil reflected image is hard to track.
In method 2), since the observer is required to adjust his or her observation position and to blink, the method is cumbersome for the observer. Also, preparing templates requires time for adjusting the observation position and making the observer blink, which makes the method impractical.
Furthermore, in method 3), the irradiation intensity of infrared light can be lower than that in method 1). However, after the intermediate processing result of an infrared image is obtained, a visible image is processed using that processing result, the face region is detected using the processing results of the infrared and visible images and, finally, pattern matching must be done, thus requiring very complicated processes. Also, it is not easy to prepare templates used in pattern matching.
Since the positions of facial parts required for preparing pattern matching templates are detected from the visible image alone, positional precision is not very high.
In addition, as described in, e.g., Japanese Patent Laid-Open No. 2-50145, a method of estimating the viewpoint position of the observer by detecting infrared light reflected by the observer or the temperature of the observer using a plurality of infrared receivers, a method of detecting the position of the observer by placing a light source behind the observer, and measuring the lightness distribution using a light receiver placed in front of the observer, a method of sensing an image of the observer using a TV camera, and detecting the viewpoint by processing the sensed image by an image processing technique, and the like have been proposed. However, none of these methods are satisfactory in terms of processing speed and detection precision.
It is, therefore, an object of the present invention to provide a viewpoint position detection apparatus and method, which can assure high-speed processing, high precision, and high tracking performance by a simple arrangement while suppressing the fear of adverse influences on the human body.
It is another object of the present invention to provide a stereoscopic display system having a stereoscopic image display apparatus which controls display using viewpoint position information obtained using the viewpoint position detection apparatus or method of the present invention.
It is still another object of the present invention to provide a stereoscopic image display apparatus which always allows the observer to enjoy normal stereoscopic observation over a broad observation range using a detection mechanism for detecting the viewpoint position with high precision, even when the observer has moved and his or her viewpoint position has changed while he or she is observing a stereoscopic image displayed on a display.
It is still another object of the present invention to provide a stereoscopic image display apparatus which always allows the observer to enjoy normal stereoscopic observation without switching to reversed stereo (pseudostereoscopic image display) and to observe a stereoscopic image in accordance with his or her viewpoint position, when disparity images to be displayed simultaneously consist of two disparity images corresponding to the right and left eyes, and even when the observer has moved and his or her viewpoint position has changed.
It is still another object of the present invention to provide a stereoscopic image display apparatus which can improve the user's convenience by displaying a warning message when the observer is located outside the observation range of a stereoscopic image displayed on a display, and by allowing the video camera for detecting the viewpoint position to be used as a TV meeting camera or monitor camera.
More specifically, the gist of the present invention lies in a viewpoint position detection apparatus for detecting a viewpoint position of a person to be measured, and outputting viewpoint position information, characterized by comprising infrared image capturing means for capturing an infrared image of the person to be measured, visible image capturing means for capturing a visible image of the person to be measured, detection means for detecting a pupil position of the person to be measured from the infrared image captured by the infrared image capturing means, template generation means for generating a template for pattern matching using the pupil position from the visible image captured by the visible image capturing means, and matching means for detecting a viewpoint position of the person to be measured by pattern matching with the visible image captured by the visible image capturing means using the template generated by the template generation means, and outputting a result as the viewpoint position information.
Another gist of the present invention lies in a viewpoint position detection apparatus for detecting a viewpoint position of a person to be measured, and outputting viewpoint position information, characterized by comprising infrared image capturing means for capturing an infrared image of the person to be measured, visible image capturing means for capturing a visible image of the person to be measured, detection means for detecting a pupil position of the person to be measured from the infrared image captured by the infrared image capturing means, template generation means for generating a template for pattern matching using the pupil position from the visible image captured by the visible image capturing means, matching means for detecting a viewpoint position of the person to be measured by pattern matching with the visible image captured by the visible image capturing means using the template generated by the template generation means, and outputting a detection result as the viewpoint position information, and control means for controlling to generate the template again using the detection means and the template generation means when a predetermined condition is satisfied.
Still another gist of the present invention lies in a stereoscopic image display system, which has the viewpoint position detection apparatus according to the present invention, and a stereoscopic image display apparatus connected to the viewpoint position detection apparatus, characterized by controlling the stereoscopic image display apparatus using viewpoint position information received from the viewpoint position detection apparatus.
Still another gist of the present invention lies in a viewpoint position detection method for detecting a viewpoint position of a person to be measured, and outputting viewpoint position information, characterized by comprising the infrared image capturing step of capturing an infrared image of the person to be measured, the visible image capturing step of capturing a visible image of the person to be measured, the detection step of detecting a pupil position of the person to be measured from the infrared image captured in the infrared image capturing step, the template generation step of generating a template for pattern matching using the pupil position from the visible image captured in the visible image capturing step, and the matching step of detecting a viewpoint position of the person to be measured by pattern matching with the visible image captured in the visible image capturing step using the template generated in the template generation step, and outputting a result as the viewpoint position information.
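The template generation step and the matching step can be illustrated with the Python sketch below. Several points are assumptions made only for illustration: the template is a small square cut from the visible image around the pupil position found in the infrared image, the infrared and visible images share the same pixel coordinates, and similarity is measured by the sum of absolute differences (SAD). The description does not specify the similarity measure, and all function names are hypothetical.

```python
def cut_template(image, center, half_size):
    """Cut a square template from the visible image around a pupil position.

    Assumption: the square fits entirely inside the image.
    """
    cx, cy = center
    rows = image[cy - half_size: cy + half_size + 1]
    return [row[cx - half_size: cx + half_size + 1] for row in rows]


def match_template(image, template):
    """Exhaustive SAD search; returns (center of best match, SAD score)."""
    th, tw = len(template), len(template[0])
    best_score, best_center = None, None
    for y in range(len(image) - th + 1):
        for x in range(len(image[0]) - tw + 1):
            score = sum(
                abs(image[y + j][x + i] - template[j][i])
                for j in range(th)
                for i in range(tw)
            )
            if best_score is None or score < best_score:
                best_score, best_center = score, (x + tw // 2, y + th // 2)
    return best_center, best_score


def paint_eye(width, height, center):
    """Build a toy visible image: flat background with a 3x3 eye-like pattern."""
    eye = [[90, 60, 90], [60, 10, 60], [90, 60, 90]]
    image = [[100] * width for _ in range(height)]
    cx, cy = center
    for j in range(3):
        for i in range(3):
            image[cy - 1 + j][cx - 1 + i] = eye[j][i]
    return image


# Template from the first visible frame, using a pupil position (2, 2)
# that is assumed to have been detected from the infrared image.
first_visible = paint_eye(7, 6, (2, 2))
template = cut_template(first_visible, (2, 2), 1)

# Later visible frame in which the eye has moved to (4, 3).
later_visible = paint_eye(7, 6, (4, 3))
viewpoint, score = match_template(later_visible, template)
```

Once the template exists, only the visible image has to be captured and searched for each subsequent frame, which is how this sketch reflects the separation between the detection step and the matching step.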
Still another gist of the present invention lies in a viewpoint position detection method for detecting a viewpoint position of a person to be measured, and outputting viewpoint position information, characterized by comprising the infrared image capturing step of capturing an infrared image of the person to be measured, the visible image capturing step of capturing a visible image of the person to be measured, the detection step of detecting a pupil position of the person to be measured from the infrared image captured in the infrared image capturing step, the template generation step of generating a template for pattern matching using the pupil position from the visible image captured in the visible image capturing step, the matching step of detecting a viewpoint position of the person to be measured by pattern matching with the visible image captured in the visible image capturing step using the template generated in the template generation step, and outputting a detection result as the viewpoint position information, and the control step of controlling to generate the template again using the detection step and the template generation step when a predetermined condition is satisfied, and repeating the visible image capturing step and the matching step in other cases.
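The control step can be sketched as a loop in Python. The description leaves the "predetermined condition" open; the sketch below assumes, purely for illustration, that the condition is a matching score exceeding a caller-supplied limit. The detection, template generation, and matching steps are passed in as callables, and every name here is hypothetical.

```python
def run_tracking(frames, detect_pupil, make_template, match, max_score):
    """Track the viewpoint over frames, regenerating the template on demand.

    frames yields (infrared, visible) pairs. While a template exists, only
    the visible image is used. When the matching score exceeds max_score
    (the assumed "predetermined condition"), the template is discarded and
    rebuilt from the next frame's infrared and visible images.
    """
    template = None
    positions, log = [], []
    for infrared, visible in frames:
        if template is None:
            pupil = detect_pupil(infrared)              # detection step
            template = make_template(visible, pupil)    # template generation step
            log.append("template")
        position, score = match(visible, template)      # matching step
        if score > max_score:
            template = None                             # regenerate on next frame
            log.append("lost")
            continue
        positions.append(position)
        log.append("match")
    return positions, log


# Toy stand-ins: the "appearance" of the eye changes from "A" to "B"
# in the third frame, which makes the old template stop matching.
frames = [
    (5, {"pos": 5, "pattern": "A"}),
    (6, {"pos": 6, "pattern": "A"}),
    (7, {"pos": 7, "pattern": "B"}),
    (8, {"pos": 8, "pattern": "B"}),
]
positions, log = run_tracking(
    frames,
    detect_pupil=lambda infrared: infrared,
    make_template=lambda visible, pupil: visible["pattern"],
    match=lambda visible, template: (
        visible["pos"],
        0 if visible["pattern"] == template else 99,
    ),
    max_score=10,
)
```

In this toy run no viewpoint is reported for the frame on which matching fails; a real system could instead regenerate the template within the same frame, which is a design choice the description does not fix.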
Still another gist of the present invention lies in a computer readable storage medium which stores the viewpoint position detection method according to the present invention as a program that can be executed by a computer.
A stereoscopic image display apparatus according to the present invention has the following characteristic features:
(1-1) In a stereoscopic image display apparatus which allows stereoscopic observation of disparity images using a viewpoint detection apparatus for detecting a viewpoint of an observer, and a display device for displaying disparity images corresponding to right and left eyes of the observer while controlling the images to track viewpoint information,
the viewpoint detection apparatus is characterized by having:
image sensing means for capturing an image of an observer as video information;
video processing means having a function of detecting a face region from the video information of the observer captured by the image sensing means, and detecting two eyes of the observer from the face region, and a function of tracking the detected two eyes; and
camera control means for enlarging or reducing the face region detected by the video processing means.
Especially, the stereoscopic image display apparatus has the following characteristic features:
(1-1-1) the image sensing means has a video camera, and the camera control means has a mechanism for panning/tilting the video camera;
(1-1-2) the apparatus further comprises signal switching means for externally outputting a video signal from the image sensing means and a zoom/pan/tilt control signal from the camera control means;
(1-1-3) the video processing means identifies predetermined color information from the captured video information of the observer;
(1-1-4) the predetermined color information is a face tone of the observer or a standard flesh tone;
(1-1-5) when the video processing means identifies the predetermined color information, and when a region corresponding to the color is not detected from the captured video information, a focal length of the video camera is controlled to a short focal length side, and when the region is detected, the focal length of the video camera is controlled to a predetermined focal length;
(1-1-6) the apparatus further comprises alarm means for generating an alarm to the observer when the video processing means identifies the predetermined color information, and when a region corresponding to the color is not detected from the captured video information;
(1-1-7) the video processing means identifies a predetermined pattern region from the captured video information of the observer;
(1-1-8) the predetermined pattern is an eye of the observer, a standard eye, a vicinity of an eye, or a partial image that forms an eye such as an iris or the like;
(1-1-9) the apparatus further comprises video processing means for generating the predetermined color or pattern from face image information of the observer, and video recording means for recording the generated information;
(1-1-10) the apparatus further comprises switching means for displaying face image information of the observer captured by the video camera on a display unit;
(1-1-11) the apparatus further comprises operation means for allowing the observer to manually set the face image of the observer displayed on the display unit at a predetermined position and a size on a display screen;
(1-1-12) the video processing means tracks a specific pattern by pattern recognition; and
(1-1-13) the specific pattern is an eye of the observer, a standard eye, a vicinity of an eye, or a partial image that forms an eye such as an iris or the like, and the apparatus further comprises alarm means for generating an alarm when a spacing between two eyes (captured from the video camera) is other than a prescribed value upon tracking the two eyes.
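Features (1-1-5) and (1-1-13) can be illustrated with the following Python sketch. The step size, the focal length limits, and the spacing tolerance are hypothetical parameters introduced only for this example; the features themselves state only that the focal length is moved to the short side when the color region is not found, and that an alarm is generated when the eye spacing is other than a prescribed value.

```python
def next_focal_length(region_found, current_mm, target_mm, step_mm, min_mm):
    """Choose the next focal length of the video camera.

    When the color region is not detected, move toward the short focal
    length (wide-angle) side by step_mm, but not below min_mm. When it is
    detected, use the predetermined focal length target_mm.
    """
    if region_found:
        return target_mm
    return max(min_mm, current_mm - step_mm)


def eye_spacing_alarm(left_eye, right_eye, prescribed_px, tolerance_px):
    """Return True when the tracked eye spacing departs from the prescribed value.

    Assumption: a tolerance band is applied around the prescribed spacing,
    since exact equality in pixels is not practical.
    """
    dx = right_eye[0] - left_eye[0]
    dy = right_eye[1] - left_eye[1]
    spacing = (dx * dx + dy * dy) ** 0.5
    return abs(spacing - prescribed_px) > tolerance_px


# Region lost: zoom out step by step until the wide-angle limit.
widened = next_focal_length(False, current_mm=50, target_mm=50, step_mm=10, min_mm=20)
at_limit = next_focal_length(False, current_mm=25, target_mm=50, step_mm=10, min_mm=20)
restored = next_focal_length(True, current_mm=20, target_mm=50, step_mm=10, min_mm=20)

# Eye spacing check during tracking (coordinates in camera pixels).
normal = eye_spacing_alarm((100, 80), (160, 80), prescribed_px=60, tolerance_px=5)
suspicious = eye_spacing_alarm((100, 80), (130, 80), prescribed_px=60, tolerance_px=5)
```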
(1-2) A stereoscopic image display apparatus having a display device which includes an optical modulator having a discrete pixel structure, a mask pattern formed on a display surface of the optical modulator by aligning a plurality of transmitting and intercepting portions at a predetermined pitch in horizontal and vertical directions, light source means for irradiating the optical modulator with light, a display which has a discrete pixel structure and displays a synthesized disparity image using scanning lines, and which irradiates disparity images displayed on the display with a light beam patterned by the mask pattern, guides light beams based on the disparity images to right and left eyes of an observer, and allows the observer to stereoscopically observe image information displayed on the display, and a viewpoint detection apparatus for detecting viewpoint information of the observer, is characterized in that the synthesized disparity image is formed by two original disparity images corresponding to the right and left eyes, and a pattern shape of the mask pattern and original disparity images that form the synthesized disparity image are switched and displayed on the basis of the viewpoint information from the viewpoint detection apparatus.
Especially, the stereoscopic image display apparatus has the following characteristic features:
(1-2-1) the two original disparity images that form the synthesized disparity image are images observed from a viewpoint corresponding to a distance between eyes; and
(1-2-2) a horizontal element of each transmitting portion of the mask pattern of the optical modulator is composed of a plurality of pixels, and a stripe irradiated region to be projected at an observation position is controlled upon being segmented into a plurality of regions.
A stereoscopic image display method of the present invention is characterized by including:
(2-1) the step of capturing an image of an observer who is observing a stereoscopic image based on disparity images displayed on a display as video information; the step of detecting a face region of the observer on the basis of the video information of the observer; the step of detecting eyeballs of the observer from the face region of the observer; the step of tracking the eyeballs of the observer; the step of detecting viewpoint information of the observer from the detected eyeballs of the observer; and the step of controlling to track the disparity images to be displayed on the display on the basis of the viewpoint information of the observer.
Especially, the stereoscopic image display method has the following characteristic features:
(2-1-1) the method further comprises the step of identifying predetermined color information from the captured video information of the observer;
(2-1-2) the method further comprises the step of changing a capturing method of the video information of the observer when predetermined color information is not present in the captured video information of the observer; and
(2-1-3) the method further comprises the step of generating an alarm signal when the predetermined color information is not present in the captured video information of the observer.
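Steps (2-1-1) to (2-1-3) can be sketched in Python as follows. The sketch assumes that the predetermined color information is an RGB range supplied by the caller and that the video information is a list of rows of (R, G, B) tuples; the range values below are illustrative and are not taken from the description. All names are hypothetical.

```python
def color_region(image, lower, upper):
    """Bounding box (x_min, y_min, x_max, y_max) of pixels inside an RGB range.

    Returns None when no pixel falls inside the range.
    """
    hits = [
        (x, y)
        for y, row in enumerate(image)
        for x, pixel in enumerate(row)
        if all(lo <= c <= hi for c, lo, hi in zip(pixel, lower, upper))
    ]
    if not hits:
        return None
    xs = [p[0] for p in hits]
    ys = [p[1] for p in hits]
    return (min(xs), min(ys), max(xs), max(ys))


def plan_step(image, lower, upper):
    """Decide what to do for one captured frame.

    When the predetermined color is absent, request both an alarm signal
    and a change of the capturing method; otherwise report the region.
    """
    region = color_region(image, lower, upper)
    if region is None:
        return {"alarm": True, "change_capture": True, "face_region": None}
    return {"alarm": False, "change_capture": False, "face_region": region}


BG = (30, 60, 200)       # background pixel, outside the assumed range
SKIN = (220, 170, 140)   # pixel inside the assumed flesh tone range
LOWER, UPPER = (180, 120, 90), (255, 210, 190)

with_face = [[BG, BG, BG, BG], [BG, SKIN, SKIN, BG], [BG, SKIN, SKIN, BG]]
without_face = [[BG] * 4 for _ in range(3)]

found = plan_step(with_face, LOWER, UPPER)
missing = plan_step(without_face, LOWER, UPPER)
```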
Other features and advantages of the present invention will be apparent from the following description taken in conjunction with the accompanying drawings, in which like reference characters designate the same or similar parts throughout the figures thereof.