A digital camera is a component often included in commercial electronic media device platforms. Digital cameras are now available in wearable form factors (e.g., video capture earpieces, video capture headsets, video capture eyeglasses), as well as embedded within smartphones, tablet computers, notebook computers, etc. Three-dimensional (3D) cameras are becoming more common, and can now be found on many mobile devices/platforms. These devices provide enhanced entertainment and utility experiences to an end user. For example, photography may be enhanced by depth information output from the 3D camera.
Often, a digital camera user wishes to segment an image frame into visually distinct objects. The definition of an “object” can vary from a single instance to a whole class of objects. Once objects are selected, special effects may be applied to one or more of them, objects from multiple photos may be mixed into one, objects may be removed from photos, etc. Such object-based image processing may be performed on-line (in real time with image capture) or during post-processing.
Segmentation algorithms typically allow a user to select parts of an image or a specific object of interest. In conventional tools, this is accomplished through color- or texture-based image segmentation. FIG. 1 is a schematic showing conventional foreground-background image data segmentation based on color or texture information. A 2D image frame 120 captured by digital camera 110 includes a representation of a real world object 101 (e.g., subject person) in the foreground, a real world object 102 (e.g., tree), and a real world object 103 (e.g., sky). A foreground-background segmentation method 101 is performed, for example when a user selects a displayed region of image frame 120 corresponding to object 101. Foreground-background segmentation method 101 outputs a visual indication of a foreground segment 111 and a background segment 113. However, because real world objects are composed of multiple colors and textures, a foreground-background segmentation method may define a segment border 195 that erroneously includes some portion of object 103 in addition to object 101. This may happen because color or texture information alone is insufficient to segment a single object whose multiple colors, textures, etc. should be combined into one segment. Also, a real world scene is composed of multiple objects and not just a single foreground and background. Multiple user interaction steps may then be required to arrive at an acceptable segment, for example one having the segment border 196.
Thus, there is a need for a multi-layer segmentation algorithm that can separate a scene into multiple objects, each with a unique label or segment ID, based on color and depth information obtained using any 3D camera or 3D scanner. However, depth information included in image data is often noisy, sparse, and of lower resolution than the color image. Also, two objects may be at indistinguishable depths, for example a person standing on a road or objects placed on a table. Thus, depth alone may also be insufficient to suitably segment a scene for end user applications.
Hence, there is a need for a multi-layer segmentation algorithm that employs the color and depth information jointly. Automated image segmentation techniques, and system(s) to perform such techniques, that are capable of fully integrating the richer data set generated by a 3D camera are therefore advantageous.