1. Field of the Invention
The present invention relates to a hand pointing apparatus, and more specifically to a hand pointing apparatus for picking up a person to be recognized from a plurality of different directions and for determining the coordinates of a specific position pointed to by the person to be recognized.
2. Description of the Related Art
There has heretofore been known a hand pointing input apparatus which comprises a display for displaying predetermined information, an illumination device for illuminating an information inputting person who approaches the display, and a plurality of image pickup means for picking up images of the approaching information inputting person from different directions. In this apparatus, the plurality of image pickup means pick up images of situations in which the approaching information inputting person points with a finger or the like to an arbitrary position on the display, the information inputting person is recognized in accordance with the plurality of images obtained by the image pickup, the position on the display pointed to by the information inputting person is determined, a cursor or the like is displayed at the position pointed to on the display, and the position pointed to on the display is recognized as being clicked when it is detected that the information inputting person has performed a clicking action by raising a thumb, whereby predetermined processing is performed (see, for example, Japanese Patent Application Laid-Open (JP-A) Nos. 4-271423, 5-19957, and 5-324181).
According to the above-described hand pointing input apparatus, since the information inputting person can give various instructions to an information processing apparatus and input various information to the information processing apparatus without touching an input device such as a keyboard or a mouse, it is possible to simplify the operation for using the information processing apparatus.
The above-described hand pointing input apparatus can determine which position on the display screen the information inputting person is pointing to. However, if the image displayed on the display is an image that apparently represents a 3-D (three-dimensional) space, this hand pointing input apparatus cannot determine which position within the virtual 3-D space represented by the image the information inputting person is pointing to (i.e., the 3-D coordinates of the position pointed to by the information inputting person). As methods of displaying an image that apparently represents a 3-D space, apart from displaying images in conformity with one-point perspective or two-point perspective on a planar display, various methods have also been provided in which images are displayed on a 3-D display using a liquid-crystal shutter or a lenticular lens, stereographic images are displayed by applying holographic technology, and the like (these images are referred to as three-dimensional images hereinafter). However, there has been a drawback in that three-dimensional images such as those described above cannot be used as an object pointed to by the information inputting person.
Further, the above-described drawback is not limited to the case in which the object pointed to by the information inputting person is a virtual 3-D space represented by a three-dimensional image. It is also impossible to determine the position within a 3-D space the information inputting person is pointing to, even when the object pointed to by the information inputting person exists within an actual 3-D space.
In view of the aforementioned, it is an object of the present invention to provide a hand pointing apparatus which can determine 3-D coordinates of a position which is pointed to by an information inputting person even when the person is pointing to an arbitrary position within a 3-D space.
In order to accomplish the aforementioned object, the first aspect of the present invention is a hand pointing apparatus, comprising: image pickup means which picks up the image of a person to be recognized from a plurality of different directions; computing means which extracts an image which corresponds to the person to be recognized on the basis of a plurality of images obtained by the image pickup means picking up, from a plurality of directions, images of the person to be recognized pointing to a specific position within a 3-D space, and determines the 3-D coordinates of a characteristic point whose position may be changed by the person to be recognized bending or stretching an arm, and of a reference point whose position does not change even when the person to be recognized bends or stretches the arm; and determining means for determining the direction in which the specific position exists within the 3-D space on the basis of the direction from the reference point to the characteristic point, for determining the location of the specific position within the 3-D space along the depth direction thereof on the basis of the distance between the reference point and the characteristic point, and thereby determining the 3-D coordinates of the specific position within the 3-D space.
In the first aspect of the present invention, an image of the person to be recognized (the information inputting person) is picked up by the image pickup means, from a plurality of different directions. The image pickup means may be structured so that the image of the person to be recognized is picked up from a plurality of directions using a plurality of image pickup apparatuses which are comprised of video cameras or the like. It can also be structured such that light reflecting means such as a plane mirror or the like is provided at the image pickup means, and an image of the person to be recognized is picked up directly by a single image pickup apparatus, and the image of the person to be recognized is picked up from a plurality of directions by picking up virtual images of the person to be recognized which are projected onto the plane mirror.
Further, the computing means extracts an image portion which corresponds to the person to be recognized on the basis of a plurality of images picked up from a plurality of directions using the image pickup means, wherein the person to be recognized is pointing to a specific position within a 3-D space, and determines the 3-D coordinates of a characteristic point whose position may be changed by the person to be recognized bending or stretching an arm, and a reference point whose position does not change even when the person to be recognized bends or stretches an arm. For example, a point which corresponds to the tip of the hand, finger or the like of the person to be recognized or the tip of a pointing apparatus which is grasped by the person to be recognized can be used as a characteristic point. A point which corresponds to the body of the person to be recognized (e.g., the breast or the shoulder joint of the person to be recognized) can be used as a reference point. The 3-D space may be a virtual 3-D space represented by a three dimensional image such as an image which is formed in conformity with a one-point perspective method or two-point perspective method on a planar display, an image which uses a liquid crystal shutter or a lenticular lens on a 3-D display, or a stereographic image which is displayed by applying holographic technology, or the 3-D space may be an actual 3-D space.
The determining means determines the direction in which the specific position exists within the 3-D space on the basis of the direction from the reference point to the characteristic point, determines the location of the specific position within the 3-D space along the depth direction thereof on the basis of the distance between the reference point and the characteristic point, and thereby determines the 3-D coordinates of the specific position within the 3-D space.
Accordingly, the person to be recognized carries out the operation for adjusting the direction of the characteristic point with respect to the reference point (i.e., the operation of pointing the hand, finger, or tip of a pointing device of the person to be recognized towards a specific position) such that the direction from the reference point to the characteristic point corresponds to the direction in which the specific position exists as the object pointed to, as seen from the point of view of the person to be recognized. Additionally, the person to be recognized carries out the operation for adjusting the distance between the reference point and the characteristic point (i.e., the operation of bending or stretching the arm) in accordance with the distance between the specific position and the person to be recognized (i.e., how near to or far from the person to be recognized the specific position is). As a result, the direction in which the specific position exists within the 3-D space and the location of the specific position within the 3-D space along the depth direction thereof can be determined, and the 3-D coordinates of the specific position within the 3-D space can be determined on the basis of the results of determining the aforementioned direction and the aforementioned location in the depth direction.
In accordance with the first aspect of the present invention, when the information inputting person (the person to be recognized) points to an arbitrary position within the 3-D space, the 3-D coordinates of the position pointed to can be determined. Further, the action of pointing a hand or finger or the tip of a pointing device towards the direction where the specific position exists as seen by the person to be recognized and bending or extending an arm to cover the distance to the specified position by the person to be recognized is an extremely natural action for pointing to a specific position inside a 3-D space. Accordingly, the information inputting person (the person to be recognized) can carry out the above-described action without any annoyance involved with this action.
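The determination described for the first aspect can be illustrated by the following minimal sketch. It is not the implementation of the apparatus; it assumes that the 3-D coordinates of the reference point and the characteristic point are already available, that the depth is measured from the reference point along the pointing direction, and that a conversion function is supplied by the caller. The names pointed_position and convert are hypothetical.

```python
import math


def pointed_position(reference, characteristic, convert):
    """Estimate the 3-D coordinates of the position pointed to.

    reference, characteristic: (x, y, z) tuples in the same coordinate system.
    convert: hypothetical function mapping the reference-to-characteristic
             distance to the distance of the specific position (an assumption
             of this sketch, standing in for the conversion conditions).
    """
    # Direction from the reference point to the characteristic point.
    dx = characteristic[0] - reference[0]
    dy = characteristic[1] - reference[1]
    dz = characteristic[2] - reference[2]
    k = math.sqrt(dx * dx + dy * dy + dz * dz)
    if k == 0.0:
        raise ValueError("reference and characteristic points coincide")

    # Depth along the pointing direction, derived from the distance k.
    depth = convert(k)

    # Walk 'depth' units from the reference point along the unit direction.
    scale = depth / k
    return (reference[0] + dx * scale,
            reference[1] + dy * scale,
            reference[2] + dz * scale)
```

In this sketch the direction alone fixes a ray in the 3-D space, and the arm extension selects one point on that ray, which mirrors the two-step determination described above.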
As described in the second aspect of the present invention, the determination of the location of the specific position along the depth direction of the 3-D space on the basis of the distance between the reference point and the characteristic point can be effected by converting the distance between the reference point and the characteristic point into the distance between the person to be recognized and the specific position according to predetermined conversion conditions. The aforementioned conversion conditions can be conversion characteristics in which the distance between the person to be recognized and the specific position varies linearly or non-linearly in correspondence with the change of the distance between the reference point and the characteristic point.
Especially when the 3-D space serving as an object which is pointed to by the person to be recognized is a space whose depth is very long (e.g., when a three dimensional image which represents the universe is displayed on the display means), if the conversion characteristics of the conversion conditions are made non-linear (e.g., conversion characteristics which cause the distance between the person to be recognized and the specific position to vary in proportion to the nth (n ≥ 2) power of the change in the distance between the reference point and the characteristic point), the person to be recognized can point to a position located at an extreme distance within the 3-D space as seen from the person to be recognized without carrying out exaggerated actions such as stretching or bending an arm beyond what is normal. This is preferable because the person to be recognized can be prevented from being burdened by the action of pointing to an arbitrary position within a 3-D space.
Further, because the physique of the person to be recognized (especially, the length of the arm) is not fixed, the range of movement of the characteristic point when the person to be recognized bends or stretches an arm is different for each person. When the distance between the reference point and the characteristic point is converted into the distance between the person to be recognized and the specific position in accordance with fixed conversion conditions, it is conceivable that the location of the specific position which is pointed to by the person to be recognized cannot be determined accurately because of variations such as differences in arm length among persons to be recognized.
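The linear and non-linear conversion characteristics mentioned above can be sketched as follows. This is only an illustration under stated assumptions: the arm extension is normalized between a fully bent value k_min and a fully stretched value k_max, the depth range runs from 0 to depth_max, and the function name convert_distance and all of its parameters are hypothetical.

```python
def convert_distance(k, k_min, k_max, depth_max, n=1):
    """Convert the reference-to-characteristic distance k into a depth.

    n = 1 gives a linear characteristic; n >= 2 gives a non-linear one in
    which the depth grows with the nth power of the normalized extension.
    All parameters are assumptions of this sketch.
    """
    if k_max <= k_min:
        raise ValueError("k_max must be greater than k_min")

    # Normalize the arm extension to the range 0..1 and clamp it.
    t = (k - k_min) / (k_max - k_min)
    t = min(1.0, max(0.0, t))

    # Power-law conversion: small t maps to nearby positions,
    # t close to 1 reaches the far end of the 3-D space.
    return depth_max * t ** n
```

With n = 2, a half-stretched arm maps to one quarter of the depth range, so fine control is kept for nearby positions while a fully stretched arm still reaches the farthest position.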
For this reason, in accordance with the third aspect of the present invention, there is provided a hand pointing apparatus according to the second aspect of the present invention, further comprising: conversion conditions setting means which requests the person to be recognized to carry out the arm bending or stretching action, and sets in advance the conversion conditions which convert the distance between said reference point and the characteristic point into the distance between the person to be recognized and the specific position on the basis of the extent of the change in the distance between the reference point and the characteristic point when the person to be recognized carries out the arm bending or stretching action.
According to the third aspect of the present invention, since the conversion conditions which convert the distance between the reference point and the characteristic point into the distance between the person to be recognized and the specific position are set in advance on the basis of the extent of the change in the distance between the reference point and the characteristic point when the person to be recognized carries out the arm bending or stretching action (the extent of the change in the distance may vary due to the individual lengths of the arms of the persons to be recognized or the like), conversion conditions can be obtained in accordance with the physique of each person to be recognized. By carrying out the aforementioned conversion using these conversion conditions, in spite of the variations arising from the individual physiques of the persons to be recognized, it is possible to accurately determine the location, along the depth direction, of the specific position which is pointed to by the person to be recognized. Moreover, for example, in order to inform the hand pointing apparatus of the fact that the person to be recognized is pointing to an extremely remote position within the 3-D space, a person to be recognized whose physique is especially small does not need to carry out any exaggerated actions beyond what is normal.
In accordance with the third aspect of the present invention, the person to be recognized can be made to point to a single position located in an intermediate portion of the 3-D space along the depth direction thereof, or to a plurality of positions whose locations differ from each other within the 3-D space in the depth direction thereof (the locations of these positions along the depth direction of the 3-D space being already known), so that the conversion conditions (the configuration of a conversion curve) can be set on the basis of the distance between the reference point and the characteristic point at this time.
In accordance with the fourth aspect of the present invention, there is provided a hand pointing apparatus according to the first aspect of the present invention, further comprising: display means which displays a three dimensional image; display control means which displays the three dimensional image on the display means; and determining means which determines whether the hand of the person to be recognized is in a specific shape, wherein the three dimensional image is an image which represents a virtual 3-D space, and includes an image which is formed conforming to a one-point perspective method or two-point perspective method, an image which uses a liquid crystal shutter or a lenticular lens, and a stereographic image which is displayed by applying holographic technology, and the person to be recognized points to a specific position within the virtual 3-D space which is represented by the three dimensional image which is displayed on the display means, and in a state in which the hand of the person to be recognized is determined to be in a specific shape, when the distance between the reference point and the characteristic point changes, the display control means controls the display means such that the three dimensional image displayed on the display means is displayed so as to be enlarged or reduced according to the change in the distance between the reference point and the characteristic point. The fourth aspect of the present invention is structured such that a three dimensional image is displayed on the display means, and the person to be recognized points to the specific position within a virtual 3-D space which is represented by the three dimensional image.
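One possible way of setting conversion conditions for an individual person can be sketched as follows. This is a hypothetical illustration, not the procedure of the apparatus: it assumes that distances between the reference point and the characteristic point are sampled while the person bends and stretches the arm, that the conversion curve is a power law over the normalized extension, and that the person then points to one intermediate position whose depth is already known. The names calibrate_range and fit_exponent are assumptions.

```python
import math


def calibrate_range(samples):
    """Return (k_min, k_max) observed during the requested arm motion."""
    return min(samples), max(samples)


def fit_exponent(k_mid, known_depth, k_min, k_max, depth_max):
    """Fit the exponent n of depth = depth_max * t**n from one known point.

    k_mid is the distance measured while the person points to a position
    whose depth (known_depth) is already known. The power-law form of the
    curve is an assumption of this sketch.
    """
    t = (k_mid - k_min) / (k_max - k_min)
    ratio = known_depth / depth_max
    if not (0.0 < t < 1.0 and 0.0 < ratio < 1.0):
        raise ValueError("calibration point must lie strictly inside the range")
    # Solve ratio = t**n for n.
    return math.log(ratio) / math.log(t)
```

Under these assumptions, the bending and stretching motion fixes the end points of the curve for the individual arm length, and the intermediate position fixes its shape.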
A plane display for displaying an image which is formed conforming to a one-point perspective method or two-point perspective method, a 3-D display which uses a liquid crystal shutter or a lenticular lens, and a display apparatus which displays a stereographic image formed through holographic technology can be used for the display means. As described above, when a three dimensional image is displayed on the display means, because the display magnification can be set arbitrarily, it is preferable that the display magnification can be changed by an instruction from the person to be recognized. However, it is necessary to distinguish the action in which the person to be recognized changes the display magnification from the action in which the person to be recognized points to the specific position within the 3-D space.
In the fourth aspect of the present invention, there is provided determining means which determines whether the hand of the person to be recognized is in a specific shape. In a state in which the hand of the person to be recognized is determined to be in a specific shape, when the distance between the reference point and the characteristic point has changed, the display control means controls the display means such that the three dimensional image displayed on the display means is displayed so as to be enlarged or reduced according to the change in the distance between the reference point and the characteristic point. It is desirable that the aforementioned specific shape be easily determined. For example, a hand shape in a state in which the fingers are stretched out so as to open the hand can be used.
Accordingly, in a state in which the hand of the person to be recognized is in a specific shape (a state in which the fingers are stretched out so as to open the hand), when the action of changing the distance between the reference point and the characteristic point (the bending or stretching action of the arm of the person to be recognized) is carried out, this action is distinguished from the action of pointing to the specific position, and the three dimensional image which is displayed on the display means is enlarged or reduced by changing its display magnification in accordance with the change in the distance between the reference point and the characteristic point. As a result, in accordance with the fourth aspect of the present invention, the display magnification of the three dimensional image displayed on the display means can be reliably changed in response to an instruction from the person to be recognized to change the display magnification.
The aforementioned display magnification can be changed linearly or non-linearly in response to the change in the distance between the reference point and the characteristic point.
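The magnification control of the fourth aspect can be sketched as follows for the linear case. This is a minimal illustration under stated assumptions: whether the hand is in the specific shape is supplied as a boolean by some other means, stretching the arm is assumed to enlarge the image (the opposite assignment would be equally possible), and the name update_magnification, the gain, and the magnification limits are hypothetical.

```python
def update_magnification(magnification, k_prev, k_now, hand_is_open,
                         gain=2.0, min_mag=0.1, max_mag=10.0):
    """Return the new display magnification.

    The magnification changes only while the hand is in the specific
    shape (here: open hand); otherwise the change in distance is treated
    as part of a pointing action and the magnification is left as is.
    gain, min_mag and max_mag are assumptions of this sketch.
    """
    if not hand_is_open:
        return magnification

    # Linear response to the change in the reference-to-characteristic
    # distance: stretching (k_now > k_prev) enlarges, bending reduces.
    new_mag = magnification * (1.0 + gain * (k_now - k_prev))

    # Keep the magnification within a displayable range.
    return min(max_mag, max(min_mag, new_mag))
```

A non-linear response could be obtained in the same structure by replacing the linear factor with, for example, a power of the change in distance.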
As described above, in the structure in which the 3-D coordinates of the specific position within the 3-D space which is pointed to by the person to be recognized are determined, it is preferable that a predetermined process can be executed by the person to be recognized carrying out a specified action (a so-called click action). However, depending upon the direction of the image pickup by the image pickup means, the action of lifting a thumb which has been conventionally adopted as the click action cannot be detected. Further, the degree of freedom of the action of lifting a thumb is very limited. It is also difficult to provide a plurality of meanings for the click action, in the same manner as a right click or a left click on a mouse, and thereby select the process which is executed by the click action.
For this reason, in accordance with the fifth aspect of the present invention, there is provided a hand pointing apparatus according to the first aspect of the present invention, further comprising: processing means which detects the speed at which the distance between the reference point and the characteristic point changes, and executes a predetermined process when the detected speed at which the distance between the reference point and the characteristic point changes is greater than or equal to a threshold value.
In the fifth aspect of the present invention, if the person to be recognized carries out a quick arm bending or stretching action, the distance between the reference point and the characteristic point changes at a speed which is greater than or equal to a threshold value. By using this changing speed as a trigger, a predetermined process is carried out. Moreover, the processing means can execute a process which is relevant to the specific position which is pointed to by the person to be recognized when the distance between the reference point and the characteristic point has changed at a speed which is greater than or equal to the threshold value.
The present invention determines the 3-D coordinates of the specific position which is pointed to by the person to be recognized on the basis of the positional relationship between the reference point and the characteristic point. However, in accordance with the fifth aspect of the present invention, because it is determined whether a predetermined process has been instructed to be executed on the basis of the change of each of the positions of the reference point and the characteristic point, the image pickup direction by the image pickup means can be fixed so that the reference point and the characteristic point can reliably be detected without considering the lifting or lowering action of a finger. As a result, in accordance with the fifth aspect of the present invention, the action by which a predetermined process is instructed by the person to be recognized (the action of quickly bending or stretching the arm of the person to be recognized) can be detected reliably.
There are two types of directions in which the distance between the reference point and the characteristic point changes (the direction in which the distance increases and the direction in which the distance decreases). Accordingly, as described in the sixth aspect of the present invention, when the distance between the reference point and the characteristic point increases at the speed of the change which is greater than or equal to the threshold value, the first predetermined process can be executed, and when the distance between the reference point and the characteristic point decreases at the speed of the change which is greater than or equal to the threshold value, the second predetermined process which is different from the first predetermined process can be executed.
In the sixth aspect of the present invention, the first predetermined process is executed when the person to be recognized carries out the quick action of stretching an arm (in this case, the distance between the reference point and the characteristic point increases at the changing speed which is greater than or equal to the threshold value). The second predetermined process is executed when the person to be recognized carries out a quick action of bending an arm (in this case, when the distance between the reference point and the characteristic point decreases at the changing speed which is greater than or equal to the threshold value). Accordingly, it becomes possible for the person to be recognized to select one of the first and the second processes as in the right click and the left click on the mouse. By performing one of the aforementioned actions, it is possible to reliably execute a process selected from the first predetermined process and the second predetermined process by the person to be recognized.
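The click detection of the fifth and sixth aspects can be sketched as follows. This is an illustration only: it assumes that the distance between the reference point and the characteristic point is measured at two successive times separated by dt, and the name detect_click and the return labels "first" and "second" are hypothetical.

```python
def detect_click(k_prev, k_now, dt, threshold):
    """Classify a quick arm action from the speed of change of the distance.

    Returns "first" when the distance increases at a speed greater than or
    equal to the threshold (quick stretching), "second" when it decreases
    at such a speed (quick bending), and None otherwise.
    """
    if dt <= 0:
        raise ValueError("dt must be positive")

    # Signed speed: positive while stretching, negative while bending.
    speed = (k_now - k_prev) / dt

    if speed >= threshold:
        return "first"
    if speed <= -threshold:
        return "second"
    return None
```

Because only the sign and magnitude of the speed are examined, the two actions can be told apart without observing the shape of the fingers, in the manner of a left click and a right click on a mouse.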
As described above, because the physique of the person to be recognized is not uniform, and accordingly the muscular strength or the like of the person to be recognized is not uniform, even when the person to be recognized carries out a quick arm bending or stretching action in order to cause the processing means to execute a predetermined process, the speed at which the distance between the reference point and the characteristic point changes is different for every individual person to be recognized. As a result, even if the person to be recognized carries out a quick arm bending or stretching action in order to cause the processing means to execute a predetermined process, the bending or stretching action may not be detected. Conversely, the aforementioned action may be detected even though the person to be recognized has not carried out this action.
For this reason, the seventh aspect of the present invention is a hand pointing apparatus according to the fifth aspect of the present invention, further comprising: threshold value setting means which requests the person to be recognized to carry out the arm bending or stretching action to cause the processing means to execute the predetermined process, and thereby sets the threshold value in advance on the basis of the speed at which the distance between the reference point and the characteristic point changes when the person to be recognized carries out the arm bending or stretching action.
In the seventh aspect of the present invention, because a threshold value for determining whether the processing means executes a predetermined process is set in advance on the basis of the speed at which the distance between the reference point and the characteristic point changes when the person to be recognized carries out the arm bending or stretching action in order to execute a predetermined process by the processing means, a threshold value according to the physique, the muscular strength or the like for each person to be recognized can be provided. By determining whether a predetermined process is instructed to be executed on the basis of the threshold value, in spite of variations in the physique, the muscular strength or the like for each person to be recognized, it can be reliably detected that the person to be recognized has given an instruction to execute a predetermined process. As a result, a predetermined process can be executed reliably.
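The threshold setting of the seventh aspect can be sketched as follows. This is a hypothetical illustration: it assumes that the distance between the reference point and the characteristic point is sampled at a fixed interval dt while the person carries out the requested quick arm action, and that the threshold is taken as a fraction of the peak speed observed. The name calibrate_threshold and the default fraction of 0.7 are assumptions, not values from the present description.

```python
def calibrate_threshold(samples, dt, fraction=0.7):
    """Derive a per-person speed threshold from a recorded click action.

    samples: distances between the reference point and the characteristic
             point, measured at a fixed interval dt during the action.
    fraction: portion of the observed peak speed used as the threshold
              (an assumption of this sketch).
    """
    if len(samples) < 2 or dt <= 0:
        raise ValueError("need at least two samples and a positive dt")

    # Peak absolute speed of change between successive samples.
    peak = max(abs(b - a) / dt for a, b in zip(samples, samples[1:]))

    # A threshold somewhat below the peak tolerates slightly slower
    # repetitions of the same action by the same person.
    return fraction * peak
```

A threshold obtained in this way scales with the speed that each person actually achieves, so that a slower person is not required to exceed a speed set for a faster one.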