Photogrammetry-derived virtual environments for use in virtual reality (VR), museum exhibits, video games, and digital cinema are limited to scenes featuring fixed light sources, such as artificial lights and the sun, which, in the context of this application, is relatively fixed. Because photogrammetry relies on sequences of overlapping photos taken from slightly converged angles, fixed light sources produce shadows, specular reflections and, for some materials, subsurface reflections that obscure the true color and surface features over portions of items in a scene. Fixed light sources can similarly influence data captured using other scanning methodologies.
Studio-based techniques for modeling objects are well-known. To date, such methods introduce an item before an image-capture system bound to a location such as a studio or factory floor where an array of cameras and controlled artificial light sources, such as soft boxes, light stages, light projectors, etc., are placed around the object.
For example, techniques for modeling layered facial reflections consisting of specular reflectance, single scattering, shallow and deep sub-surface scattering from the skin of a human face are illustrated and described in U.S. Patent Application Publication Number 2009/0226049 A1 to Debovec et al. (hereinafter referred to as Debovec). Parameters for appropriate reflectance models are derived from 20 photographs recorded in a few seconds from a single viewpoint in a studio environment. Debovec introduces image-capture systems that use a plurality of light sources with controllable output intensities to produce spherical gradient illumination patterns of a person's face. Both the subject-of-interest and the light sources are stationary and generally limited to the confines of a studio. Polarizing filters are arranged adjacent to the light sources to polarize the light from the light sources in a desired orientation. The system includes two or more cameras with a desired polarization adjusted manually. A light projector is added to illuminate a desired portion of the person's face. An image processing system receives specular reflectance and diffuse reflectance data from the cameras and calculates reflectance for the facial image based on a layered facial reflectance model. The systems and methods disclosed by Debovec are resource intensive and impractical for capturing images and constructing models of scenes in a non-studio environment.
Images of real-world environments captured during daytime hours present challenges due to the presence of continuous sunlight, the possible presence of ambient light from artificial sources, and flash sources when used. Light from each of these sources combines under some operational conditions. Illumination from an artificial source falls off with the square of its distance from a subject-of-interest, while sunlight, for practical purposes, does not. The contribution from a flashtube or flashlamp, which releases light energy over milliseconds, is mostly unaffected by shutter speed. However, a camera operator subsampling a continuous light source such as the sun or light from an artificial light fixture, when working from a non-stationary platform, can adjust shutter speed until the shutter is fast enough so as not to introduce issues with temporal resolution.
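The distance dependence noted above can be illustrated with a minimal sketch that treats an artificial source as an ideal point source. The function names, the sample distances, and the point-source simplification are assumptions made for illustration only.

```python
import math

SUN_DISTANCE_M = 1.496e11  # approximate mean Earth-Sun distance (assumed constant)


def relative_illuminance(distance_m, reference_distance_m=1.0):
    """Illuminance of an ideal point source relative to its value at the reference distance."""
    if distance_m <= 0 or reference_distance_m <= 0:
        raise ValueError("distances must be positive")
    return (reference_distance_m / distance_m) ** 2


def stops_lost(distance_m, reference_distance_m=1.0):
    """Light lost, in photographic stops, when moving from the reference distance to distance_m."""
    return -math.log2(relative_illuminance(distance_m, reference_distance_m))


if __name__ == "__main__":
    # Hypothetical camera-mounted strobe: each doubling of distance costs two stops.
    for d in (1.0, 2.0, 4.0):
        print(f"{d:.0f} m: {relative_illuminance(d):.4f} of reference, {stops_lost(d):.1f} stops lost")
    # Sunlight: moving the subject 10 m further from the sun changes essentially nothing.
    print(relative_illuminance(SUN_DISTANCE_M + 10.0, SUN_DISTANCE_M))
```

Under these assumptions, a subject at 4 m receives one sixteenth of the light it would receive at 1 m from the same strobe, while the same displacement has no measurable effect on sunlight.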
Ambient continuous light from the sun and fixed and unfixed light fixtures separate from a camera will necessarily introduce fixed shadows in captured images, which are problematic to the development of virtual environments requiring computer graphics (CG) lighting. In the case of a continuous artificial light source, such as a light-emitting diode (LED) based strobe, which continues to release light energy for as long as a power supply can continue to provide sufficient input power, a slower shutter speed enables more light to contact a photosensitive array but with an increased likelihood of loss of temporal resolution for freestanding cameras.
To appear realistic, a virtual environment, even in the presence of simulated fixed light sources and fixed shadows, ideally adapts to changes in the perspective of the observer relative to the scene. Specifically, specular information should change relative to changes between the observer and reflective surfaces of objects in the scene. Specular reflections are typically simulated with a diffuse shader in a layered arrangement under a specular shader. As disclosed by Debovec, additional layers can be included to simulate subsurface scattering of light in partially translucent materials.
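The layered arrangement described above can be sketched as follows. This is an illustrative stand-in only: it pairs a Lambertian diffuse term with a Blinn-Phong specular term, neither of which is named in the source, and the function names and the shininess value are assumptions.

```python
import math


def normalize(v):
    """Return v scaled to unit length."""
    length = math.sqrt(sum(c * c for c in v))
    if length == 0.0:
        raise ValueError("cannot normalize a zero-length vector")
    return tuple(c / length for c in v)


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def shade(normal, to_light, to_viewer, shininess=32.0):
    """Return (diffuse, specular) intensities for one surface point.

    The diffuse layer depends only on the normal and light direction.
    The specular layer also depends on the direction to the observer.
    """
    n, l, v = normalize(normal), normalize(to_light), normalize(to_viewer)
    diffuse = max(dot(n, l), 0.0)  # Lambertian term, independent of the observer
    half = normalize(tuple(a + b for a, b in zip(l, v)))  # Blinn-Phong half vector
    specular = max(dot(n, half), 0.0) ** shininess if diffuse > 0.0 else 0.0
    return diffuse, specular


if __name__ == "__main__":
    normal, light = (0.0, 0.0, 1.0), (0.0, 0.0, 1.0)
    print(shade(normal, light, (0.0, 0.0, 1.0)))  # observer on the mirror direction
    print(shade(normal, light, (1.0, 0.0, 1.0)))  # observer moved 45 degrees off axis
```

In this sketch, moving the observer leaves the diffuse value unchanged while the specular value falls off sharply, which is the view-dependent behavior a virtual environment needs to reproduce.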
Images of real-world environments captured during nighttime hours or in locations blocked from sunlight present challenges when ambient light from artificial sources and flash sources are used to illuminate a scene. Known artificial lighting techniques for minimizing shadows in captured images outside of a studio are problematic for a number of reasons. Generally, there is difficulty in transporting, locating, coordinating and energizing artificial light sources outside a studio environment. Consequently, it is often the case that the combination of natural and artificial light provides insufficient light to accommodate adequate surface-of-interest coverage because of distance, light absorption or both. Under insufficient light conditions, a photographer will increase exposure times and aperture and if possible move closer to the surface-of-interest. However, these longer exposure times necessitate the use of tripods to stabilize the camera position. When thousands of images may be required to map a real-world scene it is impractical to closely position a camera to a surface-of-interest, capture an image, then relocate and adjust a tripod to position the camera for each subsequent exposure necessary to capture a real-world scene.
To avoid the inconvenience and effort of transporting and positioning a tripod for each exposure, one or more artificial light sources, such as strobes, can be synchronized to a shutter mechanism to a minimum of about 1/125th of a second for focal plane shutters on most digital single lens reflex (DSLR) cameras. However, photography that depends on any lighting other than millisecond-duration strobe lighting, e.g., ambient light from the sun and fixed and unfixed light fixtures, will introduce shadows in the captured images.
Specialized lighting is called for when collecting image information for generating virtual environments supporting realistic lighting effects with regard to shifting specular reflections accompanying changes in perspective, shifting shadows accompanying any change in position and possibly rotation of a virtual light source, as well as a host of other changes in the quality of specular reflections and shadows in response to changes in as many parameters governing the physics of the virtual light source, such as virtual reflectors, collimators, and diffusers.
Because the scanning of environments, especially those with many occluded surfaces, requires constant movement of the capture system in order to avoid data shadows, portability of the system is a primary consideration. Lighting hardware with sufficient output to properly expose surfaces in an environment, as opposed to surfaces of a smaller object within an environment, often implies wall current and bulky power supplies, which compromises portability and adds the nuisance of dealing with power cords. Considering the sheer volume of photographs required for adequate coverage, use of lights on stands is highly impractical if these must be repositioned and adjusted whenever the camera moves to a new position and is redirected, with the result that the lighting needs change accordingly.
The second problem with lights on stands is that they cast shadows. The use of soft boxes goes far to mitigate hard shadows by diffusing incident light rays from the flash tube envelope as they pass through diffuser material on the front side, but these large devices only compound the impracticality, as it is not feasible to deploy soft boxes to sufficiently illuminate many real-world environments.
The most effective and efficient workflow supporting realistic virtual lighting of a photorealistic virtual scene, wherein moving a virtual light results in moving its cast shadows, is to avoid introducing shadows into the source photography. A ring strobe directs light that is substantially on-axis with the center axis of the sensor, thus casting shadows behind subject matter, while at the same time providing a highly portable form factor, the illumination source being fixed to the camera.
While an on-axis light source such as a ring light dramatically reduces shadows, light output using conventional ring strobes for purposes of three-dimensional capture is frustrated by numerous factors. Conventional ring strobes are designed to accommodate a range of lens diameters, being somewhat oversized to fit the larger end of the range of available lens housing diameters. The presence of shadows, albeit highly reduced, is not a problem for most applications other than photogrammetry. One-size-fits-all ring strobes are in fact popular among fashion photographers, for whom subtle shadows are a valued aesthetic, for instance the shadow under a model's nose that helps sculpt its shape. As this reduced shadowing applies to photogrammetric capture, incident light angles become less of an issue at greater distances to the subject, while closer proximity of the camera and ring strobe to the nearest surfaces in the foreground predictably projects shadows onto recessed and background surfaces. The limitation is most problematic when attempting to capture subject matter featuring deep and narrow voids, such as through holes carved into a wooden chair back. As the camera fitted with a ring strobe is brought in close to capture the interior walls of the through holes, even the slightest gap between the lens and the surrounding ring strobe can keep slightly off-axis incident light rays from reaching into the deep voids to illuminate the interior walls, the outer periphery of each through hole casting them into shadow.
To minimize shadows, emitters must be placed as close to the periphery of the lens as possible. It is not enough to place a couple of emitters on each side of the lens, or four evenly spaced around the lens at 12, 3, 6, and 9 o'clock, or any greater number that does not contiguously populate the entire lens periphery. Contiguous coverage is required to minimize shadows cast by any number of possible protruding surfaces onto recessed spaces, whatever their position relative to the camera and ring light in three-dimensional space.
Light output appropriate to a device aimed at recording diffuse and at the same time shadow-free color information of machine parts, under a microscope, inside the human body, or of a given section of the body contends with relatively insignificant impediments toward those ends as compared to what's required in a device aimed at volumetric capture of real world environments. A host of factors conspire to limit what's possible in scaling light output from applications dealing with micro scale and closeup work in a medical facility, industrial setting, or objects in a studio to the specific requirements of volumetric capture of real world settings, the inverse square law of light and lower signal/noise ratio due to insertion loss from polarizers being just the beginning.
While an on-axis light source such as a ring light minimizes shadows, an on-axis light source exacerbates specular reflections. With light rays coming directly from the camera, all surfaces within the frame that have camera-facing normal vectors and consist of materials on the glossy end of the roughness spectrum naturally reflect light right back into the lens. Prior art techniques for reducing specular reflection use cross-polarization filters, that is, a first polarizer placed on the light source and oriented at 90° with respect to a second polarizer on the lens. However, the loss of through light with thin-film polarizers leads to a combined filter factor of upwards of 3.5 f-stops of available light at the image sensor. The f-number, f-stop number or relative aperture is a dimensionless ratio of the focal length of a lens to the diameter of the aperture. The f-stop number provides a quantitative measure of lens speed. A doubling of the f-stop number halves the diameter of the aperture and reduces its area, and thus the light admitted, to one quarter. Each full f-stop, a change in the f-stop number by a factor of about 1.4, represents a doubling or halving of the light depending on whether the aperture adjustment is increasing or decreasing the size of the opening. Thus, the use of cross-polarization introduces difficulties in providing sufficient illumination over a practical image area and separation distance between a subject or subjects of interest in a non-studio environment and the camera to achieve an adequate exposure at practical shutter speed, sensitivity and aperture settings.
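The arithmetic behind a loss expressed in f-stops can be made concrete with a short sketch. The function names and the sample lens values (a 50 mm lens, an f/8 starting aperture) are hypothetical and chosen only for illustration; the 3.5-stop figure is the one stated above.

```python
import math


def f_number(focal_length_mm, aperture_diameter_mm):
    """Dimensionless ratio of focal length to aperture diameter."""
    return focal_length_mm / aperture_diameter_mm


def light_factor(stops):
    """Factor by which light changes across the given number of stops."""
    return 2.0 ** stops


def stops_between(n_from, n_to):
    """Stops of light lost when changing the f-number from n_from to n_to.

    Light admitted scales with aperture area, i.e. with 1 / N**2,
    so each stop corresponds to a factor of sqrt(2) in the f-number.
    """
    return 2.0 * math.log2(n_to / n_from)


def compensating_f_number(n_current, stops_to_recover):
    """The f-number that would recover the given loss by aperture alone."""
    return n_current / (2.0 ** (stops_to_recover / 2.0))


if __name__ == "__main__":
    print(f_number(50.0, 25.0))             # a 50 mm lens with a 25 mm aperture
    print(light_factor(3.5))                # light reduction for a 3.5-stop loss
    print(stops_between(4.0, 8.0))          # doubling the f-number
    print(compensating_f_number(8.0, 3.5))  # aperture needed to offset 3.5 stops from f/8
```

Under these assumptions, a 3.5-stop loss leaves roughly one eleventh of the light at the sensor, and recovering it by aperture alone from f/8 would require opening to about f/2.4, with the depth-of-field penalty that implies.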
Light output for purposes of three-dimensional capture is frustrated by numerous factors. To minimize shadows, emitters must be placed as close to the periphery of the lens as possible. Adequate light output can be achieved with concentric rings of emitters, but with every concentric array of emitters, the angle of incidence relative to the center axis of the lens increases, thus casting ever more shadows.
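The growth in incidence angle with each concentric ring follows from simple geometry, sketched below. The ring radii, subject distance, and recess depth are hypothetical values, and the shadow estimate assumes a foreground edge lying on the lens axis with a flat background behind it.

```python
import math


def off_axis_angle_deg(ring_radius_mm, subject_distance_mm):
    """Angle between the lens axis and a ray from an emitter to an on-axis subject point."""
    return math.degrees(math.atan2(ring_radius_mm, subject_distance_mm))


def shadow_width_mm(recess_depth_mm, ring_radius_mm, edge_distance_mm):
    """Approximate width of the shadow cast past a foreground edge onto a recessed surface.

    By similar triangles, a ray from an emitter offset ring_radius_mm from the axis
    that grazes an on-axis edge at edge_distance_mm lands this far past the edge
    on a surface recess_depth_mm behind it.
    """
    return recess_depth_mm * ring_radius_mm / edge_distance_mm


if __name__ == "__main__":
    distance_mm = 300.0  # hypothetical close-up working distance
    for radius_mm in (40.0, 60.0, 80.0):  # hypothetical concentric ring radii
        angle = off_axis_angle_deg(radius_mm, distance_mm)
        shadow = shadow_width_mm(30.0, radius_mm, distance_mm)
        print(f"radius {radius_mm:.0f} mm: {angle:.1f} deg off axis, {shadow:.1f} mm shadow")
```

In this simplified model the shadow width grows in direct proportion to the ring radius and shrinks as the working distance increases, which is consistent with the observation that shadowing is worst up close and with emitters far from the lens periphery.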
Various camera settings can be leveraged to compensate for inadequate illumination, but each variable runs up against severe constraints imposed by the requirements placed upon photogrammetric data to be useful. For instance, by decreasing shutter speed more light is allowed to strike the sensor for a longer period of time, but because of the need for the capture system to remain highly portable, any movement introduced during an exposure, such as with handheld photography or working off any camera platform that isn't fixed, such as from poles, ropes, or a UAV, may result in useless data. Imagery lacking sharp temporal resolution compromises quality when such images are used for photo projection mapping, and in the case of photogrammetry, such data is entirely useless as a photogrammetry engine searching for common points of interest between overlapping photos has no hope of locking in on imagery plagued by motion blur.
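The sensitivity of handheld or airborne capture to exposure time can be estimated with a short sketch. It assumes a thin-lens approximation with the subject much farther away than the focal length, and camera motion parallel to the sensor plane. The platform speed, lens, and pixel pitch are hypothetical values.

```python
def pixel_footprint_mm(distance_mm, focal_length_mm, pixel_pitch_um):
    """Approximate size on the subject covered by one sensor pixel."""
    return distance_mm * (pixel_pitch_um / 1000.0) / focal_length_mm


def motion_blur_px(speed_mm_per_s, exposure_s, distance_mm, focal_length_mm, pixel_pitch_um):
    """Blur, in pixels, caused by lateral camera motion during one exposure."""
    travel_mm = speed_mm_per_s * exposure_s
    return travel_mm / pixel_footprint_mm(distance_mm, focal_length_mm, pixel_pitch_um)


if __name__ == "__main__":
    # Hypothetical setup: 35 mm lens, 5 um pixels, subject at 2 m, platform drifting at 50 mm/s.
    for exposure_s in (1 / 60, 1 / 250, 1 / 1000):
        blur = motion_blur_px(50.0, exposure_s, 2000.0, 35.0, 5.0)
        print(f"1/{round(1 / exposure_s)} s: {blur:.2f} px of blur")
```

With these assumed values, an exposure of 1/60 s smears detail across nearly three pixels, while 1/1000 s keeps the blur well under one pixel.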
Opening the lens aperture delivers more available light to the sensor, but here the softness in pixels, and thus their ruin for 3D capture, is often the result of the shallower depth of field accompanying lower f-stop numbers, quickly throwing nearby and more distant subject matter for a given focal plane out of focus. Lastly, digital cameras turn to higher ISO values, driving up the gain of the sensor to boost the signal at a given illumination level. Boosting a signal, of course, also boosts noise, the problem here being that noise is unsightly at best, and in the case of photogrammetry, large grain size confuses a structure from motion (SfM) engine when identifying features in separate images and then matching the features between overlapping images to serve as key points.
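The depth-of-field penalty of a wider aperture can be estimated with the standard thin-lens formulas based on the hyperfocal distance. The sketch below is illustrative only: the 0.03 mm circle of confusion, the 35 mm focal length, and the 2 m focus distance are assumed values, and the function names are hypothetical.

```python
def hyperfocal_mm(focal_length_mm, f_number, coc_mm=0.03):
    """Hyperfocal distance for a given focal length, f-number and circle of confusion."""
    return focal_length_mm ** 2 / (f_number * coc_mm) + focal_length_mm


def depth_of_field_mm(focal_length_mm, f_number, focus_distance_mm, coc_mm=0.03):
    """Return (near_limit, far_limit) of acceptable sharpness, in millimeters."""
    h = hyperfocal_mm(focal_length_mm, f_number, coc_mm)
    f, s = focal_length_mm, focus_distance_mm
    near = s * (h - f) / (h + s - 2.0 * f)
    # Beyond the hyperfocal distance the far limit extends to infinity.
    far = s * (h - f) / (h - s) if s < h else float("inf")
    return near, far


if __name__ == "__main__":
    for n in (2.8, 5.6, 11.0):
        near, far = depth_of_field_mm(35.0, n, 2000.0)
        print(f"f/{n}: sharp from {near:.0f} mm to {far:.0f} mm ({far - near:.0f} mm deep)")
```

Under these assumptions, the zone of acceptable sharpness around a subject at 2 m is roughly half a meter deep at f/2.8 and several times deeper at f/11.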
A conventional and portable solution for reducing shadows is described in U.S. Pat. No. 6,430,371 to Cho (hereinafter referred to as Cho), which integrates a ring light guide with a camera. The guide includes a housing attached to the camera by way of an adapter insertion hole having an axis that is coaxial with the lens of the camera. The ring light guide irradiates light toward an object in a direction that is substantially aligned with an axis of the lens of the camera. Cho further describes adjusting the amount of light irradiated to the object dependent upon a camera to object distance. However, the combination disclosed by Cho is limited to objects that are close to the lens. Cho fails to show a combination that addresses light loss from cross polarization that would apply to the capture of subject matter that may be beyond a few feet away from the lens. Cho also describes a manual approach to controlling polarization states, with emphasis on cross-polarization used to cut specular reflections on machine parts and human skin to return diffuse color. No route is described to also record images containing both diffuse color and specular reflections or, more importantly, to record such data in a form that can be used to isolate specular reflections, enabling computer graphics lighting in a lighting and rendering engine.