The subject matter discussed in the background section should not be assumed to be prior art merely as a result of its mention in the background section. Similarly, a problem mentioned in the background section or associated with the subject matter of the background section should not be assumed to have been previously recognized in the prior art. The subject matter in the background section merely represents different approaches, which in and of themselves may also correspond to implementations of the claimed inventions.
Motion-capture systems are used in a variety of contexts to obtain information about the conformation and motion of various objects, including objects with articulating members, such as human hands or human bodies. Such systems generally include cameras to capture sequential images of an object in motion and computers to analyze the images to create a reconstruction of an object's volume, position and motion. For 3D motion capture, at least two cameras are typically used.
Image-based motion-capture systems rely on the ability to distinguish an object of interest from other objects or background. This is often achieved using image-analysis algorithms that detect edges, typically by comparing pixels to detect abrupt changes in color and/or brightness. Such conventional systems, however, suffer performance degradation under many common circumstances, e.g., low contrast between the object of interest and the background and/or patterns in the background that may falsely register as object edges. This may result, for example, from reflectance similarities; that is, under general illumination conditions, the chromatic reflectance of the object of interest is so similar to that of surrounding or background objects that it cannot easily be isolated.
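By way of illustration only, the pixel-comparison edge detection described above may be sketched as follows. This is a minimal sketch and not a description of any particular system: the function name `detect_edges`, the use of a single brightness channel, the right and lower neighbor comparison, and the threshold values are all assumptions introduced for the example.

```python
def detect_edges(image, threshold):
    """Flag pixels whose brightness differs from the right or lower
    neighbor by more than `threshold` (hypothetical helper)."""
    rows, cols = len(image), len(image[0])
    edges = [[False] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Brightness step to the right-hand neighbor, if one exists.
            dx = abs(image[r][c + 1] - image[r][c]) if c + 1 < cols else 0
            # Brightness step to the neighbor below, if one exists.
            dy = abs(image[r + 1][c] - image[r][c]) if r + 1 < rows else 0
            edges[r][c] = max(dx, dy) > threshold
    return edges


# High contrast: dark background (10) next to a bright object (200).
high_contrast = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]
edges_high = detect_edges(high_contrast, threshold=50)

# Low contrast: object (120) barely brighter than background (100).
low_contrast = [[100, 100, 120, 120]]
edges_low = detect_edges(low_contrast, threshold=50)
```

In the high-contrast image the boundary column is flagged in every row, whereas in the low-contrast image no pixel is flagged, which illustrates the degradation noted above when object and background reflect similarly.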
Optical filters may be used to enhance object discrimination. In a typical setup, a source light illuminates the object(s) of interest, and motion of the object(s) is detected and tracked based on reflected source light, which is sensed by one or more cameras directed at the scene. Most simply, narrowband source light can be used with corresponding band-pass filters in front of the cameras; in this way, the cameras “see” only the source light and not light from general illumination.
The reliability of this approach can degrade in various situations, e.g., when surrounding or background objects are close to the objects of interest. In such circumstances, the signal-to-noise ratio for discrimination drops to the point that foreground can no longer be reliably distinguished from background. One approach to mitigating this degradation is to capture two successive images, one under general illumination and the other, obtained immediately thereafter, under illumination from a narrowband source light. The differently illuminated images may be compared and the general-illumination image used to remove noise from the narrowband-illumination image. This may be accomplished, for example, using the ratio between the two images (i.e., taking the pixel-by-pixel amplitude ratios and eliminating, from the narrowband image, pixels whose ratio falls below a threshold).
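The ratio-based comparison described above may be sketched, purely as an example, as follows. The text does not specify the direction of the ratio; this sketch assumes the narrowband amplitude divided by the general-illumination amplitude. The function name `ratio_mask`, the small constant `eps` used to avoid division by zero, the choice of zero as the value of an eliminated pixel, and the sample values are likewise assumptions and not taken from any particular implementation.

```python
def ratio_mask(narrowband, general, threshold, eps=1e-6):
    """Return a copy of `narrowband` in which pixels whose
    narrowband/general amplitude ratio is below `threshold`
    are set to zero (hypothetical helper)."""
    cleaned = []
    for nb_row, gen_row in zip(narrowband, general):
        out_row = []
        for nb, gen in zip(nb_row, gen_row):
            # Assumed ratio direction: narrowband over general illumination.
            ratio = nb / (gen + eps)
            out_row.append(nb if ratio >= threshold else 0)
        cleaned.append(out_row)
    return cleaned


# Left column: object strongly lit by the narrowband source.
# Right column: background that looks similar under both illuminations.
narrowband = [[200, 40], [180, 30]]
general = [[50, 40], [60, 45]]
cleaned = ratio_mask(narrowband, general, threshold=2.0)
```

With these sample values, the left-column pixels (ratios of roughly 4 and 3) are retained and the right-column pixels (ratios at or below roughly 1) are eliminated. Because the two input images are captured at different times, any motion between the two frames would misalign the pixels being compared.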
A limitation of this approach is latency resulting from the need to obtain and process two successive image frames. Conventional image sensors include complementary metal-oxide semiconductor (CMOS) devices and charge-coupled devices (CCDs). Both types of image sensor typically include an array of photosensitive elements (pixels) that collect charge carriers in response to illumination. In a CCD, the charge is actually transported across the chip and read at one corner of the array, where it is converted to a voltage from which an image may be reconstructed by associated circuitry. The time required to move the charge from the pixels represents the readout time of the CCD; after this time has elapsed the CCD is ready to receive a new image, even if the displaced charges are still being processed by the readout circuitry. The readout time is a key source of latency in image-acquisition and processing systems, and in a system designed to detect and characterize motion, this delay can be particularly problematic since components of the captured scene will have shifted from frame to frame. The objective of removing noise from an image may be undermined by the additional noise introduced by this shift.
An opportunity arises to address background noise with reduced latency.