The production of three-dimensional or “3D” images, including both single-frame and video images, has applications in many fields including science, medicine, and entertainment. In some instances, the 3D images are displayed to a viewer with a two-dimensional display such as a television or movie screen. The display modifies the two-dimensional image to enable the viewer to perceive a 3D image on the screen. In other applications, three-dimensional data are extracted from a two-dimensional image. For example, the distance or depth of an object as viewed from a camera can be identified in three-dimensional image data for a variety of uses. Computerized systems, including machine vision systems used in medical and industrial applications, utilize depth information and other three-dimensional data generated from 3D image data even if a human does not view the 3D images.
Traditional 3D imaging systems include two cameras that generate binocular images in much the same way that humans perceive three-dimensional environments with both eyes. Corresponding images from the two cameras are combined into a composite image using various techniques known to the art to enable a viewer to perceive three dimensions from a two-dimensional image. In some embodiments, the viewer views both two-dimensional images simultaneously with one eye viewing each image. The two cameras used in traditional 3D imaging systems, however, increase the size and complexity of the imaging system. For example, both cameras have to be properly aligned and focused to generate two appropriate images that can be combined to produce a 3D composite image.
An alternative imaging technique referred to as “depth from defocus” produces three-dimensional image data using a single camera. In a depth from defocus system, a single camera generates two images of a single scene. One image of the scene focuses on an object within the scene, while the camera is defocused from the object in the second image. The depth from defocus technique identifies the amount of blur that is introduced between the focused image and the defocused image. Once the depth from defocus technique identifies the blur, represented by the term σ, the depth of the object D in the two images can be identified using the following equation:
$$D = -\frac{v}{\dfrac{\sigma}{\rho r} - \dfrac{v}{f} + 1}$$
where r is a radius of the lens aperture of the camera, v is a distance between the lens and the image sensor in the camera, f is the focal length of the optics in the camera (depicted in FIG. 3A and FIG. 3B), and ρ is a predetermined camera constant. The depth from defocus technique generates a “depth map” that includes data corresponding to the depths of various objects and regions depicted in an image.
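As an illustration only, the depth relation above can be sketched in Python, reading the equation as the fraction D = -v / (σ/(ρr) - v/f + 1). The function names `depth_from_blur` and `depth_map_from_blur`, the use of nested lists for the blur values, and the numeric values in the usage note are assumptions made for this sketch and are not taken from the source.

```python
def depth_from_blur(sigma, rho, r, v, f):
    """Return the depth D for a blur value sigma.

    Implements D = -v / (sigma / (rho * r) - v / f + 1), where
    r is the lens aperture radius, v is the lens-to-sensor distance,
    f is the focal length, and rho is a camera constant.
    All lengths must use the same unit (hypothetical helper).
    """
    denominator = sigma / (rho * r) - v / f + 1.0
    if denominator == 0.0:
        # The relation has no finite depth for this combination of inputs.
        raise ZeroDivisionError("depth is undefined for these parameters")
    return -v / denominator


def depth_map_from_blur(blur_rows, rho, r, v, f):
    """Apply depth_from_blur to every entry of a 2D list of blur values.

    The nested-list layout of the depth map is an assumption of this sketch.
    """
    return [[depth_from_blur(s, rho, r, v, f) for s in row] for row in blur_rows]
```

For example, with zero blur the relation reduces to the thin-lens equation: `depth_from_blur(0.0, 1.0, 0.01, 0.0505, 0.05)` evaluates to approximately 5.05 when all lengths are given in meters.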
While the depth from defocus technique enables a single camera to generate three-dimensional image data, existing imaging systems using the depth from defocus technique also have limitations in practical use. Since the depth from defocus technique uses two different images of a single scene, the camera changes focus and generates two different images at two different times. In a static scene with no moving objects, the two images depict the same scene. In a dynamic scene with moving objects, however, the two images may not correspond to each other and the depth data cannot be calculated accurately. For similar reasons, the depth from defocus technique presents challenges to video imaging applications that typically generate images at a rate of 24, 30, or 60 frames per second. In a depth from defocus imaging system, the camera generates two images for each standard frame of video data at a corresponding rate of 48, 60, or 120 images per second, respectively, and the camera changes focus between the images in each pair. Existing imaging systems have difficulty changing the focus of the lens quickly enough to generate three-dimensional image data with the depth from defocus technique for video at commonly used frame rates. Consequently, improvements to imaging systems that increase the imaging speed of a camera generating three-dimensional image data would be beneficial.
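The capture rates stated above follow from simple arithmetic, which can be sketched as shown here. The function name `capture_requirements` is hypothetical, and the per-image time budget is a simplifying assumption of this sketch: it treats the interval between successive images as the upper bound on the time available to change focus and expose one image.

```python
def capture_requirements(video_fps, images_per_frame=2):
    """Return (images per second, time budget per image in milliseconds).

    Assumes each video frame needs images_per_frame captures (two for
    depth from defocus: one focused and one defocused) and that the
    captures are evenly spaced in time.
    """
    capture_rate = video_fps * images_per_frame
    budget_ms = 1000.0 / capture_rate
    return capture_rate, budget_ms


for fps in (24, 30, 60):
    rate, budget = capture_requirements(fps)
    print(f"{fps} frames/s -> {rate} images/s, about {budget:.1f} ms per image")
```

Under these assumptions, video at 60 frames per second leaves roughly 8.3 milliseconds for each image, including the focus change.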