1. Field
Embodiments relate to a technology for estimating a pose of an object, and more particularly, to an apparatus and method for estimating a continuous pose of an object.
2. Description of the Related Art
Object pose estimation is of significant importance in computer vision, human-machine interaction, and other fields. When the object to be estimated is a head of a user, individualized information desired by the user may be identified by estimating a continuous pose of the head. For example, the content of speech and an emotion of a speaker may be obtained from a pose of the head of the speaker. The estimated pose of the object may also be used to facilitate human-machine interaction. For example, the effectiveness of human-machine interaction may be increased based on a point of gaze obtained by estimating a pose of a head.
Conventionally, methods of estimating a pose of an object include a tracking-based method and a training-based method. The tracking-based method estimates a pose of an object through pair matching of a current frame and a previous frame in a video sequence. The tracking-based method has an advantage of relatively accurate pose estimation over a short time. However, a tracking drift caused by accumulated errors may occur, and object tracking may fail when an error in feature matching occurs due to a wide range of rotation or a high velocity of the object. A key frame may be used to eliminate or reduce the tracking drift, but reasonable selection and updating of the key frame may be difficult.
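The accumulation of errors in the tracking-based method described above may be illustrated with a minimal sketch. The sketch is not part of any embodiment. It assumes a single yaw angle, a hypothetical constant error in each frame-to-frame match, and an idealized key frame whose pose is known exactly. The names `track_yaw`, `per_frame_bias_deg`, and `key_frame_every` are introduced here for illustration only.

```python
def track_yaw(true_yaws, per_frame_bias_deg, key_frame_every=None):
    """Integrate frame-to-frame yaw changes, each carrying a matching error."""
    estimate = true_yaws[0]  # assumption: the pose of the first frame is known
    estimates = [estimate]
    for i in range(1, len(true_yaws)):
        # Relative motion from pair matching of the previous and current frame,
        # corrupted by a hypothetical constant error per match.
        relative = (true_yaws[i] - true_yaws[i - 1]) + per_frame_bias_deg
        estimate += relative
        # Idealized key-frame correction: re-anchor to a reference pose that is
        # assumed to be exact (in practice it would carry its own error).
        if key_frame_every and i % key_frame_every == 0:
            estimate = true_yaws[i]
        estimates.append(estimate)
    return estimates


# A head turning 0.5 degrees per frame over 100 frames.
true_yaws = [0.5 * i for i in range(101)]

drifting = track_yaw(true_yaws, per_frame_bias_deg=0.25)
anchored = track_yaw(true_yaws, per_frame_bias_deg=0.25, key_frame_every=20)

drift_error = abs(drifting[-1] - true_yaws[-1])
anchored_max_error = max(abs(e - t) for e, t in zip(anchored, true_yaws))

print(drift_error)         # error grows with the number of frames
print(anchored_max_error)  # error stays bounded between key frames
```

Under these assumptions, the error without a key frame grows in proportion to the number of frames, whereas the error with a key frame is bounded by the interval between key frames. The sketch does not address how a key frame is selected or updated, which is the difficulty noted above.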
The training-based method may be defined as object pose estimation through classification or regression. The training-based method estimates a pose of an object based on a training model obtained by training on labeled samples. The training-based method has a disadvantage of failing to obtain an accurate estimate, because classification yields only a rough pose estimate and regression may be susceptible to conditions of a real environment.
While accurate object pose estimation is desired, obtaining a pose of an object continuously and stably using a computer vision method has been difficult. In particular, object pose estimation has not been effective when illumination changes abruptly or when a rotation range or a velocity of the object increases.