Attempts to describe a virtual space on the basis of a ray space theory have been proposed. See, for example, “Implementation of Virtual Environment by Mixing CG model and Ray Space Data”, IEICE Journal D-11, Vol. J80-D-11 No. 11, pp. 3048-3057, November 1997, or “Mutual Conversion between Hologram and Ray Space Aiming at 3D Integrated Image Communication”, 3D Image Conference, and the like.
A recording method of ray space data will be explained below.
As shown in FIG. 1, a coordinate system O-X-Y-Z is defined in a real space. A light ray that passes through a reference plane P (Z=z) perpendicular to the Z-axis is defined by a position (x, y) where the light ray crosses P, and variables θ and φ that indicate the direction of the light ray. More specifically, a single light ray is uniquely defined by five variables (x, y, z, θ, φ). If a function that represents the light intensity of this light ray is defined as f, light ray group data in this space can be expressed by f(x, y, z, θ, φ). This five-dimensional space is called a “ray space”.
If the reference plane P is set at z=0, and disparity information of a light ray in the vertical direction, i.e., the degree of freedom in the φ direction, is omitted, the degrees of freedom of the light ray are reduced to two dimensions (x, θ). This x-θ two-dimensional space is a subspace of the ray space. As shown in FIG. 3, if u=tan θ, a light ray (FIG. 2) which passes through a point (X, Z) in the real space is mapped onto a line in the x-u space, said line being given by:
X = x + uZ  (1)
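The mapping of equation (1) can be illustrated with a minimal sketch. The function names `ray_to_xu` and `line_in_xu` are hypothetical, and θ is assumed to be measured in radians from the Z-axis within the X-Z plane, which is the convention that makes u=tan θ consistent with equation (1).

```python
import math

def ray_to_xu(X, Z, theta):
    """Map a ray through real-space point (X, Z) to (x, u) on the plane z = 0.

    Assumption: theta is the ray's angle from the Z-axis, in radians.
    """
    u = math.tan(theta)
    x = X - u * Z  # equation (1), X = x + u*Z, solved for x
    return x, u

def line_in_xu(X, Z, thetas):
    """Sample the x-u line traced by all rays through the single point (X, Z)."""
    return [ray_to_xu(X, Z, t) for t in thetas]

if __name__ == "__main__":
    for x, u in line_in_xu(2.0, 4.0, [-0.3, 0.0, 0.3]):
        # Every sample satisfies x + u*Z = X, i.e. lies on one straight line.
        print(f"x={x:.4f}, u={u:.4f}, x+u*Z={x + u * 4.0:.4f}")
```

Every ray through the same real-space point lands on one straight line in the x-u space, whose slope is determined by Z and whose intercept at u=0 is X.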
Image sensing by a camera amounts to receiving, at an image sensing surface, light rays that pass through the lens focal point of the camera, and converting their brightness levels and colors into an image. In other words, a light ray group which passes through one point in the real space, i.e., the focal point, is captured as an image having a number of pixels. In this case, since the degree of freedom in the φ direction is omitted, and the behavior of a light ray is examined only in the X-Z plane, only pixels on a line segment that intersects a plane orthogonal to the Y-axis need to be considered. In this manner, by sensing an image, light rays that pass through one point can be collected, and data on a single line segment in the x-u space can be captured by a single image sensing process.
When this image sensing is done a large number of times by changing the view point position, light ray groups which pass through a large number of points can be captured. When the real space is sensed using N cameras, as shown in FIG. 4, data on a line given by:
x + Zn·u = Xn  (2)
can be inputted in correspondence with a focal point position (Xn, Zn) of the n-th camera Cn (n=1, 2, . . . , N), as shown in FIG. 5. In this way, when an image is sensed from a sufficiently large number of view points, the x-u space can be densely filled with data.
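As a rough sketch of how equation (2) fills the x-u space, the following stores each camera's pixels in a discretized grid. The function `fill_xu_space`, the grid layout, and the nearest-cell quantization are assumptions made for illustration only; the source does not specify how the x-u space is sampled.

```python
def fill_xu_space(cameras, x_min, x_max, nx, u_min, u_max, nu):
    """Write camera pixels into a discretized x-u grid.

    cameras: list of (Xn, Zn, pixels), where (Xn, Zn) is the focal point of
             the n-th camera and pixels is a list of (u, value) pairs.
    Returns grid[i][j], indexed by x cell i and u cell j; empty cells are None.
    """
    grid = [[None] * nu for _ in range(nx)]
    dx = (x_max - x_min) / (nx - 1)
    du = (u_max - u_min) / (nu - 1)
    for Xn, Zn, pixels in cameras:
        for u, value in pixels:
            x = Xn - Zn * u                  # equation (2) solved for x
            i = round((x - x_min) / dx)      # nearest x cell (assumption)
            j = round((u - u_min) / du)      # nearest u cell (assumption)
            if 0 <= i < nx and 0 <= j < nu:
                grid[i][j] = value
    return grid

if __name__ == "__main__":
    # One hypothetical camera at (Xn, Zn) = (0, 1) with three pixels.
    cams = [(0.0, 1.0, [(-1.0, 10), (0.0, 20), (1.0, 30)])]
    for row in fill_xu_space(cams, -1.0, 1.0, 3, -1.0, 1.0, 3):
        print(row)
```

Each camera contributes one line of samples; cells that no camera reaches stay empty, which is why a sufficiently large number of view points is needed to fill the space densely.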
Conversely, an observation image from a new arbitrary view point position can be generated (FIG. 7) from the data of the x-u space (FIG. 6). As shown in FIG. 7, an observation image from a new view point position E(X, Z) indicated by an eye mark can be generated by reading out data of a line given by equation (1) from the x-u space.
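The readout described above might be sketched as follows. The function `render_view`, the grid parameters, and the nearest-neighbor lookup are hypothetical choices for illustration; a practical system would likely interpolate between neighboring samples.

```python
def render_view(grid, x_min, dx, u_min, du, X, Z, us):
    """Read one scanline for view point E(X, Z) out of a discretized x-u grid.

    grid[i][j] holds the sample at x = x_min + i*dx, u = u_min + j*du.
    us lists the ray directions (u = tan(theta)) of the output pixels.
    Rays that fall outside the stored grid yield None.
    """
    nx, nu = len(grid), len(grid[0])
    scanline = []
    for u in us:
        x = X - u * Z                    # equation (1) solved for x
        i = round((x - x_min) / dx)      # nearest stored x sample (assumption)
        j = round((u - u_min) / du)      # nearest stored u sample (assumption)
        inside = 0 <= i < nx and 0 <= j < nu
        scanline.append(grid[i][j] if inside else None)
    return scanline

if __name__ == "__main__":
    # Hypothetical 3x3 grid over x in [-1, 1], u in [-1, 1].
    grid = [[10 * i + j for j in range(3)] for i in range(3)]
    print(render_view(grid, -1.0, 1.0, -1.0, 1.0, 0.0, 1.0, [-1.0, 0.0, 1.0]))
```

The loop runs once per output pixel regardless of what the scene contains, so the cost of this readout scales with the number of pixels to be generated.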
One major feature of ray space data is that ray space data is defined for each pixel. That is, frame data for one scene is expressed by ray space data corresponding to the number of pixels of that frame. Hence, the data size of ray space data does not depend on the complexity of a scene, but depends only on the size and resolution of the scene; the computation volume depends only on the total number of pixels of the scene to be generated. In the case of normal CG data, when a scene becomes more complex, the complexity cannot be expressed unless the number of polygons is increased. Hence, the computation volume increases, resulting in low rendering performance. However, in the case of ray space data, if the total number of pixels of an image to be rendered remains the same, rendering performance is constant independently of the complexity of scenes.
A case will be examined wherein the user walks through a virtual space generated using such ray space data. In such a case, several tens of frames of images per second must be generated and presented to make the user feel as if he or she were walking through the virtual space.
However, when the total number of pixels of an image reconstructed from ray space data in each frame is large, a long rendering time per frame is required, and the rendering frame rate cannot keep up with the moving speed.
Likewise, when the user manipulates (moves, rotates, or the like) an object in the scene at high speed, and that object is generated and rendered based on ray space data, rendering cannot be completed in time if the number of pixels of the object is large.