Edge detection is a fundamental problem in computer vision and provides important low-level features for many applications. Edges in images of a scene can result from different causes, including depth discontinuities, differences in surface orientation, surface texture, changes in material properties, and varying lighting.
Many methods model edges as changes in low-level image properties, such as brightness, color, and texture, within an individual image. However, the problem of identifying image pixels that correspond to 3D geometric boundaries, which are discrete changes in surface depth or orientation, has received less attention.
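As a purely illustrative sketch of the single-image approach mentioned above, the following marks pixels where brightness changes sharply. The function name `brightness_edges`, the forward-difference gradient, and the fixed threshold are assumptions chosen for illustration and are not taken from any of the cited methods.

```python
def brightness_edges(image, threshold):
    """Mark pixels whose brightness gradient magnitude exceeds a threshold.

    image: list of rows of brightness values (hypothetical input format).
    Returns a same-sized grid of booleans.
    """
    height, width = len(image), len(image[0])
    edges = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Forward differences; treated as zero at the last column/row.
            dx = image[y][x + 1] - image[y][x] if x + 1 < width else 0.0
            dy = image[y + 1][x] - image[y][x] if y + 1 < height else 0.0
            edges[y][x] = (dx * dx + dy * dy) ** 0.5 > threshold
    return edges
```

A detector of this kind responds to any sufficiently large brightness change, whether it comes from texture, material, lighting, or geometry, which is why it cannot by itself isolate 3D geometric boundaries.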
Raskar, in U.S. Pat. No. 7,295,720 B2, detects depth edges by applying a shadow-based technique using a multi-flash camera. That method applies only to depth discontinuities, not changes in surface normal, and requires a controlled set of lights that encircles the lens of a camera.
3D geometric boundaries accurately represent characteristics of scenes that can provide useful cues for a variety of tasks including segmentation, scene categorization, 3D reconstruction, and scene layout recovery.
In “Deriving intrinsic images from image sequences,” ICCV 2001, vol. 2, pp. 68-75, Weiss et al. describe a sequence of images of a scene that undergoes illumination changes. Each image in the sequence is factorized into a product of a single, constant reflectance image and an image-specific illumination image.
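The factorization described above can be written as I_t = R * L_t, where R is shared by all images and L_t is specific to image t. The following is a minimal sketch of that idea only, not the estimator of the cited paper: it assumes, for simplicity, that the log-reflectance at each pixel can be taken as the temporal median of the log-intensities. The function name `factorize_sequence` and the `eps` parameter are hypothetical.

```python
import math
from statistics import median


def factorize_sequence(images, eps=1e-6):
    """Split each image I_t into a shared reflectance R and an illumination L_t.

    images: list of T images, each a list of rows of non-negative floats.
    Simplifying assumption (not the cited method): log R is the per-pixel
    temporal median of log I_t. By construction, R * L_t equals I_t + eps.
    """
    height, width = len(images[0]), len(images[0][0])
    # Per-pixel temporal median in the log domain gives the reflectance estimate.
    reflectance = [
        [math.exp(median(math.log(img[y][x] + eps) for img in images))
         for x in range(width)]
        for y in range(height)
    ]
    # Each illumination image is the ratio of the observed image to R.
    illumination = [
        [[(img[y][x] + eps) / reflectance[y][x] for x in range(width)]
         for y in range(height)]
        for img in images
    ]
    return reflectance, illumination
```

Working in the log domain turns the product into a sum, so a robust statistic such as the median can be applied across the sequence before exponentiating back.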
U.S. Pat. No. 7,756,356 describes factoring a time-lapse photographic sequence of an outdoor scene into shadow, illumination, and reflectance components, which can facilitate scene modeling and editing applications. That method assumes a single point light source at infinity (the sun), which is moving smoothly over time, and an ambient lighting component.
In “Appearance derivatives for isonormal clustering of scenes,” IEEE TPAMI, 31(8):1375-1385, 2009, Koppal et al. describe image sequences that are acquired by waving a distant light source around a scene. The images are then clustered into regions with similar surface normals. That work also assumes a single distant light source whose position varies smoothly over time and an orthographic camera model.