Eyesight, one of the five senses used to obtain information about the surrounding environment, perceives the location and perspective of an object through a viewer's two eyes. That is, the visual information obtained by the two eyes is synthesized into a single piece of distance information, which allows a person to move about freely.
By implementing this visual structure in a machine, robots capable of replacing humans have been developed. The visual system of such a robot includes a stereo camera (i.e., a left camera and a right camera), reads images from the two cameras, and reconstructs the read images as three-dimensional image information.
An image is obtained by projecting a three-dimensional space onto a two-dimensional plane. Because three-dimensional distance information (e.g., depth) is lost in this projection, it is difficult to restore the three-dimensional space directly from a single two-dimensional image. However, if two or more images obtained from different positions are available, the three-dimensional space may be restored. That is, when one point in real space is projected onto the two images, the position of that point in real space may be found by locating the corresponding points in the two images and using the geometrical structure of the cameras.
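The geometric step described above can be illustrated with a minimal sketch. It assumes a rectified stereo pair with parallel optical axes, which the text does not specify; the function names and the parameters `focal_px`, `baseline_m`, `cx`, and `cy` are hypothetical and chosen only for illustration.

```python
def depth_from_disparity(x_left, x_right, focal_px, baseline_m):
    """Depth of a point from its column in the left and right images.

    Assumes a rectified pair (hypothetical setup): corresponding points lie
    on the same row, so depth follows from similar triangles as Z = f * B / d.
    """
    disparity = x_left - x_right  # horizontal shift between the two views
    if disparity <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity


def triangulate(x_left, x_right, y, focal_px, baseline_m, cx, cy):
    """Recover (X, Y, Z) in the left-camera frame from one corresponding pair.

    (cx, cy) is the assumed principal point of the left camera, in pixels.
    """
    z = depth_from_disparity(x_left, x_right, focal_px, baseline_m)
    x3d = (x_left - cx) * z / focal_px  # back-project the pixel column
    y3d = (y - cy) * z / focal_px       # back-project the pixel row
    return (x3d, y3d, z)
```

For example, `triangulate(420.0, 400.0, 300.0, 800.0, 0.1, 320.0, 240.0)` uses a disparity of 20 pixels. The sketch shows why the corresponding point must be found first: without it, the disparity, and therefore the depth, is unknown.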
Although finding the corresponding point in the two images (hereinafter, referred to as “stereo matching”) can be difficult work, it is also an important technology for estimating and restoring the three-dimensional space. Stereo matching may produce many candidate results, and thus it can be difficult to find, among all of these results, the one that represents the real space.
Stereo matching technology is generally based on a Markov random field (MRF) model. This two-dimensional field model transforms the modeling of a complex object into a simple, region-related probability model, but the MRF model requires complex computations and produces uncertain boundaries.
Meanwhile, among stereo matching technologies aimed at high-speed processing, dynamic programming technology has a trellis structure and may perform significantly faster and more accurate stereo matching. However, because the dynamic programming technology performs the matching through a search within a single scan line only, it does not consider the results of the scan lines above and below, thereby causing significant stripe noise. Further, because the stereo matching of each scan line is performed independently, the results of the upper and lower rows may differ from the result of the current row.
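The single-scan-line search may be sketched as follows. This is a generic illustration of dynamic programming on one row, not the specific method of any particular system; the function name `match_scanline` and the parameter `occlusion_cost` are assumptions introduced for the example.

```python
def match_scanline(left_row, right_row, occlusion_cost=20):
    """Align one left scan line with one right scan line by dynamic programming.

    Returns, for each left pixel, its disparity (left column minus right
    column) or None if the pixel was treated as occluded. Only this single
    row is examined; neighbouring rows are never consulted.
    """
    n, m = len(left_row), len(right_row)
    # cost[i][j]: best cost of aligning the first i left and first j right pixels
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    # move[i][j]: 0 = match, 1 = skip a left pixel, 2 = skip a right pixel
    move = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i * occlusion_cost
        move[i][0] = 1
    for j in range(1, m + 1):
        cost[0][j] = j * occlusion_cost
        move[0][j] = 2
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = cost[i - 1][j - 1] + abs(left_row[i - 1] - right_row[j - 1])
            skip_left = cost[i - 1][j] + occlusion_cost
            skip_right = cost[i][j - 1] + occlusion_cost
            best = min(match, skip_left, skip_right)
            cost[i][j] = best
            move[i][j] = 0 if best == match else (1 if best == skip_left else 2)
    # Trace the cheapest path back through the trellis
    disparity = [None] * n
    i, j = n, m
    while i > 0 or j > 0:
        if move[i][j] == 0:
            disparity[i - 1] = (i - 1) - (j - 1)
            i, j = i - 1, j - 1
        elif move[i][j] == 1:
            i -= 1
        else:
            j -= 1
    return disparity
```

A full disparity map would be built by calling `match_scanline` once per row. Because each call sees only its own row, nothing ties the result of one row to the rows above and below it, which is the source of the stripe noise described above.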
In order to reduce the above-mentioned noise, a median filter, an average value filter, or the like is used in image processing. However, because such a filter is applied to each frame separately, the association with a previous frame is not considered. Therefore, because the noise changes from frame to frame, it becomes difficult to extract a stable depth image.
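The per-frame character of such filtering can be seen in a short sketch. It assumes a 3x3 window and depth frames stored as lists of rows; both choices, along with the function names, are hypothetical and serve only to illustrate the limitation.

```python
from statistics import median_low


def median_filter_3x3(depth):
    """Apply a 3x3 median filter to one depth frame (a list of rows).

    Border pixels use the part of the window that lies inside the frame.
    median_low keeps the output restricted to values present in the input.
    """
    h, w = len(depth), len(depth[0])
    out = [row[:] for row in depth]
    for y in range(h):
        for x in range(w):
            window = [
                depth[yy][xx]
                for yy in range(max(0, y - 1), min(h, y + 2))
                for xx in range(max(0, x - 1), min(w, x + 2))
            ]
            out[y][x] = median_low(window)
    return out


def filter_sequence(frames):
    """Filter every frame on its own; no information flows between frames."""
    return [median_filter_3x3(frame) for frame in frames]
```

A spatial outlier inside one frame is removed, but if consecutive frames differ as a whole (for example, one frame reads 10 everywhere and the next reads 12), the difference passes through `filter_sequence` unchanged, so the depth sequence still fluctuates over time.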