1. Field of the Invention
The present invention relates to an information processing apparatus, an information processing method, and a computer program, and more particularly to an information processing apparatus, an information processing method, and a computer program for executing processing for detecting, by analyzing an image photographed by a camera, a line-of-sight direction of a person, an animal, or the like included in the photographed image.
2. Description of the Related Art
When a line of sight of a person, a pet such as a dog or a cat, or another animal can be judged from an image acquired by a camera, it is possible, for example, to operate a shutter at the instant when the line of sight is directed to the camera. Therefore, it is possible to reduce failures in photographing pictures. When a program for executing such line-of-sight judgment processing is incorporated in, for example, moving image creation software, it is possible to sort moving images efficiently, for example, by selecting images that include a person looking to the front out of a large amount of photographed data.
When the line-of-sight-judgment-processing execution program is incorporated in an interactive system such as a television conference system, it is possible to perform smooth interaction by switching cameras, setting a camera direction, zooming, and the like according to a line of sight.
As one of the related arts that disclose a technique for performing line-of-sight judgment from image data photographed by a camera, there is line-of-sight direction detection. Line-of-sight direction detection is a technique for estimating in which direction a user captured by a camera is looking. It is performed by reading the subtle positional relation between the eyes and the irises. For example, in the technique disclosed in “Passive Driver Gaze Tracking with Active Appearance Models”, T. Ishikawa, S. Baker, I. Matthews, and T. Kanade, Proceedings of the 11th World Congress on Intelligent Transportation Systems, October, 2004 (hereinafter, Non-Patent Document 1), a posture of a detected face is calculated by AAM (Active Appearance Models), positions of the irises are detected from portions of the eyes, a posture of the eyeballs is estimated from a positional relation between the eyes and the irises, and an overall line-of-sight direction is estimated by combining the posture of the eyeballs with the posture of the face.
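The final step described above, combining the posture of the eyeballs with the posture of the face, can be illustrated with a minimal sketch. This is not the implementation of Non-Patent Document 1; the function names, the yaw/pitch parameterization, and the axis convention (+z points from a frontal face toward the camera) are assumptions made only for illustration.

```python
import math

def rot_y(deg):
    """Rotation about the vertical (y) axis, i.e. yaw."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rot_x(deg):
    """Rotation about the horizontal (x) axis, i.e. pitch."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def apply(m, v):
    """Apply a 3x3 matrix to a 3-vector."""
    return [sum(m[i][k] * v[k] for k in range(3)) for i in range(3)]

def combined_gaze(face_yaw, face_pitch, eye_yaw, eye_pitch):
    """Gaze direction in camera coordinates (hypothetical helper).

    The eyeball posture is expressed relative to the face, so the face
    rotation is applied after the eye rotation. Angles are in degrees.
    """
    face = matmul(rot_y(face_yaw), rot_x(face_pitch))
    eye = matmul(rot_y(eye_yaw), rot_x(eye_pitch))
    return apply(matmul(face, eye), [0.0, 0.0, 1.0])

# A face turned 20 degrees with eyes turned back by 20 degrees
# looks approximately straight at the camera.
print(combined_gaze(20.0, 0.0, -20.0, 0.0))
```

Because the two postures are composed, an error in either the face posture estimate or the eyeball posture estimate carries directly into the overall line-of-sight direction.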
However, in Non-Patent Document 1, estimation of a posture of the eyeballs requires the irises to be captured as a fine image input. For this purpose, it is indispensable to use a high-performance high-resolution camera. When a general user performs photographing with a camera set in a position 2 m to 3 m away from the user in a living room of a house, a general camera having about one million pixels is difficult to use for this purpose.
To realize the technique disclosed in Non-Patent Document 1, it is necessary to use an expensive camera with a high number of pixels. Further, special processing for improving accuracy is necessary, such as processing for zoom-photographing a portion of the eyes of a subject to improve accuracy of measurement of iris positions and processing for irradiating an infrared ray on the eyes and increasing the luminance of retina and sphere portions of the eyes to accurately photograph a subject. Moreover, it is necessary to perform face posture estimation to estimate positions of the eyes and a posture of the eyeballs. Therefore, processing is complicated and errors in the line-of-sight direction detection increase.
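The resolution problem can be made concrete with a rough pinhole-camera calculation. The sketch below is only illustrative: the image width of 1152 pixels (1152 x 864 is roughly one million pixels), the 60 degree horizontal field of view, and the 12 mm iris diameter are assumed values that do not appear in the cited document.

```python
import math

def pixels_across(object_width_m, distance_m, image_width_px, horizontal_fov_deg):
    """Approximate number of pixels an object spans horizontally.

    Uses a simple pinhole model: the scene width visible at the given
    distance is 2 * d * tan(fov / 2), and pixels are spread evenly over it.
    """
    half_fov = math.radians(horizontal_fov_deg) / 2.0
    scene_width_m = 2.0 * distance_m * math.tan(half_fov)
    return object_width_m * image_width_px / scene_width_m

# All three constants are assumptions for illustration only.
IMAGE_WIDTH_PX = 1152        # about one million pixels at 1152 x 864
HORIZONTAL_FOV_DEG = 60.0    # assumed lens field of view
IRIS_DIAMETER_M = 0.012      # assumed approximate iris diameter

for distance_m in (2.0, 3.0):
    px = pixels_across(IRIS_DIAMETER_M, distance_m,
                       IMAGE_WIDTH_PX, HORIZONTAL_FOV_DEG)
    print(f"distance {distance_m:.0f} m: iris spans about {px:.1f} pixels")
```

Under these assumptions the iris spans roughly six pixels at 2 m and roughly four pixels at 3 m, which leaves very little detail for measuring the positional relation between the eyes and the irises.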
In “Line-of-Sight Direction Recognition for an Interactive System”, Toshihiko Yamahata and Shinya Fujie, Image Recognition and Understanding Symposium (MIRU2006), a method is disclosed that abandons estimation of the line-of-sight direction as an analog (continuous) value and instead performs line-of-sight direction judgment by classifying line-of-sight directions into ten classes. When the line-of-sight directions are classified into predetermined ranges in this way, it is unnecessary to estimate an accurate posture of the eyeballs. It is possible to output a recognition result from an image of an eye portion by applying PCA (principal component analysis), LDA (linear discriminant analysis), and the like. As a result, the problem of error propagation caused by connecting recognizers in series is solved.
However, to dimensionally compress the image of the eye portion with PCA and linearly discriminate the image with LDA, it is necessary to solve the problem of classifying line-of-sight directions into ten classes, and it is difficult to solve this problem robustly (stably).
In “Line-of-Sight Measuring Method based on an Eyeball Shape Model”, Takehiko Ohno, Naoki Takekawa, and Atsushi Yoshikawa (NTT Communications Science Laboratories), Proceedings of the 8th Image Sensing Symposium, pp. 307 to 312, a method for line-of-sight direction estimation by the corneal reflex method is disclosed. This is a method of estimating a line of sight from the pupil center and the positions of Purkinje images. With this method, it is possible to estimate a line of sight highly accurately. Further, since it is possible to estimate a line of sight regardless of a direction of a face, the estimation is not affected by an error of a face posture recognizer.
However, to extract Purkinje images, light has to be irradiated from a position fixed with respect to a camera. Therefore, the device is complicated. Moreover, since there is an individual difference in the relation among the Purkinje images, the pupil center, and the line-of-sight direction, calibration has to be performed every time the person changes.
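The need for per-person calibration can be sketched as follows. This is a simplified one-dimensional linear mapping from the offset between the pupil center and a Purkinje image to a gaze angle, fitted by least squares. It is an assumption for illustration, not the eyeball shape model of the cited document, and all function names and sample values are hypothetical.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = gain * x + offset."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    gain = sxy / sxx
    return gain, mean_y - gain * mean_x

def calibrate(samples):
    """Fit the mapping for one person.

    samples: list of (pupil_x, purkinje_x, known_gaze_angle_deg) recorded
    while the person looks at targets whose directions are known.
    """
    offsets = [pupil - purkinje for pupil, purkinje, _ in samples]
    angles = [angle for _, _, angle in samples]
    return fit_line(offsets, angles)

def estimate_gaze(pupil_x, purkinje_x, gain, offset):
    """Gaze angle in degrees from a pupil/Purkinje-image offset in pixels."""
    return gain * (pupil_x - purkinje_x) + offset

# Hypothetical calibration data for two people with different eye geometry.
person_a = [(105.0, 100.0, 11.0), (100.0, 100.0, 1.0), (95.0, 100.0, -9.0)]
person_b = [(105.0, 100.0, 12.0), (100.0, 100.0, -0.5), (95.0, 100.0, -13.0)]

gain_a, offset_a = calibrate(person_a)
gain_b, offset_b = calibrate(person_b)

# The same image measurement maps to different angles for each person.
print(estimate_gaze(104.0, 100.0, gain_a, offset_a))
print(estimate_gaze(104.0, 100.0, gain_b, offset_b))
```

In this sketch the fitted gain and offset differ between the two people, so parameters obtained for one person give a biased estimate when reused for another.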