1. Field of Invention
The invention relates to an apparatus and method for image-based rendering. In particular, the invention relates to an apparatus and method for tracking camera motion using a light field in order to provide data for later reconstruction of a three-dimensional structure or environment.
2. Description of Related Art
The computer graphics industry has become increasingly interested in methods by which three-dimensional objects or environments can be represented using a large collection of two-dimensional images. This is referred to as image-based rendering. One way in which to represent an object or environment is to use light fields.
A light field is any representation of a three-dimensional scene by the radiance induced on the set of incident lines in three-dimensional free space R3. This includes, but is not limited to, representations which sample and store this radiance as a collection of two-dimensional images and representations which sample and store this radiance in a data structure representing lines in three-dimensional free space R3.
For example, if the two-dimensional images of the object or environment are sampled densely at a two-dimensional set P of camera positions, the radiance seen along all the lines passing through P has been sampled. These radiances can be stored, for example, in a four-dimensional array where each element corresponds to a line in three-dimensional free space R3. Any image of the object from a three-dimensional set of positions can be reconstructed by collecting the appropriate lines from this array. Such a representation is referred to as a light field.
A light field is best understood with reference to FIG. 1 and the following explanation. FIG. 1 shows a light slab representation of a light field. Consider the set of lines passing through two parallel planes P1 and P2 in the three-dimensional free space R3. Each pair of points p1 of the plane P1 and p2 of the plane P2 defines a unique line. Each line, except for those parallel to P1 and P2, defines a unique pair of the points p1 and p2. So the set of all lines in the three-dimensional free space R3 is a four-dimensional space, which can be parameterized by the four coordinates (u,v,s,t) required to specify the points p1 and p2.
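The two-plane parameterization described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that P1 is the plane z = 0 and P2 is the plane z = 1; the names Z1, Z2, line_to_uvst and uvst_to_points are hypothetical and do not come from the specification.

```python
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]

# Assumed plane positions (not specified in the text): P1 at z = 0, P2 at z = 1.
Z1 = 0.0
Z2 = 1.0


def line_to_uvst(origin: Vec3, direction: Vec3) -> Optional[Tuple[float, float, float, float]]:
    """Map a line (a point on it plus a direction) to light slab coordinates.

    Returns None for lines parallel to the two planes, which the
    parameterization cannot represent.
    """
    ox, oy, oz = origin
    dx, dy, dz = direction
    if abs(dz) < 1e-12:
        return None
    a1 = (Z1 - oz) / dz  # line parameter where the line meets P1
    a2 = (Z2 - oz) / dz  # line parameter where the line meets P2
    return (ox + a1 * dx, oy + a1 * dy, ox + a2 * dx, oy + a2 * dy)


def uvst_to_points(u: float, v: float, s: float, t: float) -> Tuple[Vec3, Vec3]:
    """Recover the points p1 on P1 and p2 on P2 that define the line."""
    return (u, v, Z1), (s, t, Z2)
```

A line that starts at (1, 2, 0) and travels in direction (1, 0, 1) meets P1 at (u, v) = (1, 2) and P2 at (s, t) = (2, 2), while a line with direction (1, 0, 0) has no coordinates because it never crosses the planes.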
The lines through a specific point p in R3 form a two-dimensional subset, which is a plane under this parameterization. An image is a rectangular subset of this "plane of lines" with p as the focal point.
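The reason the lines through a point p form a plane can be seen from a short sketch. Under the same illustrative assumption that P1 is z = 0 and P2 is z = 1, the coordinates (s, t) of a line through p are an affine function of its coordinates (u, v), so the set of such lines is a two-dimensional plane in (u,v,s,t) space. The function name st_through_point is hypothetical.

```python
# Assumed plane positions (illustrative only): P1 at z = 0, P2 at z = 1.
Z1, Z2 = 0.0, 1.0


def st_through_point(p, u, v):
    """Return (s, t) for the line through p that crosses P1 at (u, v).

    The result is affine in (u, v): s = (1 - k) * u + k * px and
    t = (1 - k) * v + k * py, where k depends only on the depth of p.
    """
    px, py, pz = p
    if abs(pz - Z1) < 1e-12:
        raise ValueError("p lies on P1; every line through p shares the same (u, v)")
    k = (Z2 - Z1) / (pz - Z1)
    return ((1.0 - k) * u + k * px, (1.0 - k) * v + k * py)
```

Because the map is affine, the (s, t) value at the midpoint of two (u, v) samples is the midpoint of their (s, t) values, which is the property that makes an image a flat, rectangular patch of this plane.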
Light fields such as these can be used to reconstruct images by collecting a subset of the four-dimensional space of lines that contains many images of an object. This collection is done by pointing an image recording device at the object and physically scanning the image recording device across a two-dimensional square. A two-dimensional set of lines is collected at each image recording device position. The lines in all the images are parameterized using two parallel planes (one containing the moving camera and the other in front of the object) and stored as a four-dimensional array. The radiance seen along each line is stored in the four-dimensional array and addressed by the line coordinates. An image of the object with the focal point anywhere in the three-dimensional vicinity of the two planes can be extracted by collecting the radiance of each of its lines from this array. In this way, images can be reconstructed in real time.
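The extraction step described above can be sketched as follows. This is a toy illustration under stated assumptions, not the cited rendering system: the planes are taken as z = 0 and z = 1, samples lie on a regular n x n x n x n grid over the unit square of each plane, lookup uses the nearest sample rather than interpolation, and the names make_light_field, lookup and render are hypothetical.

```python
def make_light_field(n, radiance):
    """Sample radiance(u, v, s, t) on an n^4 grid over [0, 1]^4 (n >= 2)."""
    step = 1.0 / (n - 1)
    return [[[[radiance(i * step, j * step, k * step, l * step)
               for l in range(n)]
              for k in range(n)]
             for j in range(n)]
            for i in range(n)]


def lookup(field, u, v, s, t):
    """Nearest-sample radiance for line (u, v, s, t); None if outside the slab."""
    n = len(field)
    index = []
    for coord in (u, v, s, t):
        i = round(coord * (n - 1))
        if not 0 <= i < n:
            return None
        index.append(i)
    i, j, k, l = index
    return field[i][j][k][l]


def render(field, focal, width, height):
    """Build an image by collecting one stored line per pixel.

    Pixels are laid out on P2 (z = 1); each pixel and the focal point
    define a line, whose crossing with P1 (z = 0) gives (u, v).
    Assumes width >= 2, height >= 2 and a focal point not on P2.
    """
    fx, fy, fz = focal
    a = (0.0 - fz) / (1.0 - fz)  # fraction of the way from focal point to P2 at which P1 is hit
    image = []
    for r in range(height):
        t = r / (height - 1)
        row = []
        for c in range(width):
            s = c / (width - 1)
            u = fx + a * (s - fx)
            v = fy + a * (t - fy)
            row.append(lookup(field, u, v, s, t))
        image.append(row)
    return image


# Synthetic scene: radiance depends only on where the line meets P2.
field = make_light_field(5, lambda u, v, s, t: s)
image = render(field, (0.5, 0.5, -1.0), 5, 5)
```

In this synthetic scene every row of the rendered image is a left-to-right ramp, since the radiance of each line is simply its s coordinate.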
These light fields can be compressed to a reasonable size and accessed quickly enough to make them useful for real-time rendering on a high-end machine. However, the capturing of these light fields has been limited to mechanically scanning a camera over a two-dimensional plane using such devices as a computer controlled camera gantry. Such a device is disclosed in Levoy et al., "Light Field Rendering", Computer Graphics Proceedings, SIGGRAPH '96, p. 528, 31-42. In these devices, the computer keeps track of the camera position at each frame of the light field capturing process. Such devices are generally expensive, are limited to a specific area and range of object sizes they can handle, and require large investments of time and money to produce and use.
Another way in which to capture the light fields is to use fiducial points, points in three-dimensional free space R3 whose exact locations are known with respect to some coordinate frame. Gortler et al., "The Lumigraph", Computer Graphics Proceedings, SIGGRAPH '96, p. 528, 43-54, discloses one method of using fiducial points to obtain images of a three-dimensional environment. In this method, however, it is necessary to have many fiducial points within the environment and to maintain those fiducial points in the image field while capturing each image. Thus, the range of motion of the camera is limited to the area in which a number of fiducial points are present.
Therefore, it would be beneficial and more practical to be able to use a hand-held camera to perform the image capturing of a three-dimensional environment without being constrained to a particular area of movement, i.e. an area containing fiducial points. However, a fundamental difficulty in using a hand-held device is in tracking the position and orientation of the camera at each frame of the image capturing process.
Camera tracking is a problem that arises frequently in computer vision. One technique for camera tracking is to measure the optical flow from one image frame to the next. In this technique, each pixel is assumed to correspond to a nearby pixel with the best-matching color. An overall combination of the pixel motions is interpreted as a motion of the camera. This method provides good results when tracking the motion between two successive image frames. However, for multiple frames, such as is needed for capturing of environments, the error in the camera position accumulates rapidly.
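The optical flow idea described above can be illustrated with a one-dimensional toy sketch. It is not any particular published algorithm: each pixel of a scanline is matched to the nearby pixel with the closest intensity, the median of the per-pixel shifts is taken as the overall motion, and per-frame estimates are chained to form a track. All names and the sample data are hypothetical.

```python
from statistics import median_low


def best_match_shift(prev_row, next_row, i, radius):
    """Shift d (|d| <= radius) whose pixel next_row[i + d] best matches prev_row[i]."""
    best_d, best_cost = 0, None
    for d in range(-radius, radius + 1):
        j = i + d
        if 0 <= j < len(next_row):
            cost = abs(prev_row[i] - next_row[j])
            if best_cost is None or cost < best_cost:
                best_d, best_cost = d, cost
    return best_d


def estimate_motion(prev_row, next_row, radius=3):
    """Combine the per-pixel shifts into a single motion estimate (their median)."""
    shifts = [best_match_shift(prev_row, next_row, i, radius)
              for i in range(len(prev_row))]
    return median_low(shifts)


def track(frames, radius=3):
    """Chain frame-to-frame estimates; any per-step error is carried forward."""
    position, path = 0, [0]
    for a, b in zip(frames, frames[1:]):
        position += estimate_motion(a, b, radius)
        path.append(position)
    return path


# Second scanline is the first shifted right by two pixels.
prev_row = [0, 10, 20, 35, 50, 70, 90, 115, 140, 170]
next_row = [-20, -10] + prev_row[:-2]
```

Because track only ever adds the latest estimate to the running position, an error made at one frame is never corrected later, which is why the accumulated position drifts over long sequences.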
Another technique for camera tracking is the point correspondence method, in which distinctive-looking feature points (such as object corners) are extracted from an image and tracked from one image frame to the next. These points act as fiducial points during camera tracking. This method provides better results than the optical flow method; however, it is difficult to track the same three-dimensional points from image frame to image frame, and the method itself is rather slow and cumbersome.
Additionally, the problem of camera tracking when capturing images of a three-dimensional environment for later reconstruction differs from the problem of capturing data from arbitrary video sequences. The camera operator knows that she or he is trying to capture data of the environment, and is willing to move the camera in particular ways in order to do so. Thus, an interactive data collection system could provide feedback to the camera operator, giving the operator indications for keeping the camera tracking robust and for filling in desired data.
The invention provides a method and apparatus for tracking the motion of an image recording device. The method and apparatus allow a hand-held image recording device to be used.
The invention also provides a simple method and apparatus for tracking image recording device motion using a light field.
The invention further provides an interactive system that provides the operator with feedback to capture a sequence of frames that sufficiently cover the light field.
The invention additionally provides an interactive system that provides the operator with feedback to provide sufficient data for reconstruction of three-dimensional structures.
The method and apparatus of the invention locate the position and the orientation of the image recording device in each frame by checking the radiance along lines in the frame and corresponding lines in previous frames. Thus, a separate device to keep track of the location of the image recording device is not necessary.
The method and apparatus of the invention locate the image recording device's position and orientation in each frame very precisely by checking the radiance seen along lines captured in previous frames.
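The general idea of locating the device by comparing radiance along lines can be illustrated with a simplified two-dimensional sketch. This is an assumption-laden illustration and not the procedure of the detailed description: lines are parameterized by (u, s) on the lines z = 0 and z = 1, the function stored_radiance stands in for radiance captured in previous frames, only position (not orientation) is searched, and the search is a brute-force scan over hypothetical candidate positions.

```python
import math


def texture(s):
    """Synthetic scene appearance on the line z = 1 (illustrative only)."""
    return math.sin(5.0 * s) + 0.5 * math.sin(13.0 * s)


def stored_radiance(u, s):
    """Stand-in for the radiance of lines captured in previous frames."""
    return texture(s)


def ray_to_us(cam_x, cam_z, slope):
    """Coordinates (u, s) of the ray from (cam_x, cam_z) with direction (slope, 1)."""
    u = cam_x + (0.0 - cam_z) * slope
    s = cam_x + (1.0 - cam_z) * slope
    return u, s


def capture(cam_x, cam_z, slopes):
    """Radiance seen by one frame: one value per pixel direction."""
    return [stored_radiance(*ray_to_us(cam_x, cam_z, m)) for m in slopes]


def score(candidate, frame, slopes):
    """Sum of squared differences between the frame and stored radiance at a candidate position."""
    cx, cz = candidate
    return sum((stored_radiance(*ray_to_us(cx, cz, m)) - r) ** 2
               for m, r in zip(slopes, frame))


def locate(frame, slopes, candidates):
    """Candidate position whose predicted radiance best agrees with the frame."""
    return min(candidates, key=lambda c: score(c, frame, slopes))


slopes = [-0.3, -0.2, -0.1, 0.0, 0.1, 0.2, 0.3]
candidates = [(i / 10, z) for i in range(11) for z in (-2.0, -1.5, -1.0, -0.5)]
frame = capture(3 / 10, -1.0, slopes)
estimate = locate(frame, slopes, candidates)
```

In this sketch the frame is synthesized from a position that is itself among the candidates, so the best-scoring candidate is that position; a practical system would need a continuous search and would also have to estimate orientation.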
These and other features and advantages of this invention are described in or are apparent from the following detailed description of the preferred embodiments.