Three-dimensional computer models of a real-world environment are useful in a wide variety of applications, such as immersive gaming, augmented reality, architecture/planning, robotics, and engineering prototyping. Depth cameras (also known as z-cameras) can generate real-time depth maps of a real-world environment. Each pixel in these depth maps corresponds to a discrete distance measurement captured by the camera from a 3D point in the environment. These cameras therefore provide, at real-time rates, depth maps which are composed of an unordered set of points (known as a point cloud).
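By way of illustration only, the relationship between the pixels of a depth map and the points of a point cloud can be sketched as follows. The sketch assumes an ideal pinhole camera model and treats each pixel value as a distance along the optical axis; neither assumption is stated above. The function name `depth_map_to_points` and the intrinsic parameters `fx`, `fy`, `cx` and `cy` are hypothetical and chosen for the example.

```python
from typing import List, Sequence, Tuple

Point3D = Tuple[float, float, float]


def depth_map_to_points(
    depth: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point3D]:
    """Back-project a depth map into 3D points in camera coordinates.

    Assumes a pinhole camera: (fx, fy) are focal lengths in pixels and
    (cx, cy) is the principal point. These are illustrative parameters.
    """
    points: List[Point3D] = []
    for v, row in enumerate(depth):          # v: pixel row
        for u, z in enumerate(row):          # u: pixel column, z: depth value
            if z <= 0.0:
                # Treat non-positive values as "no measurement" (assumption).
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# A tiny 2x2 depth map with one missing measurement.
depth = [
    [1.0, 1.0],
    [0.0, 2.0],
]
points = depth_map_to_points(depth, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
```

The resulting list carries no information about which object each point belongs to, which is the limitation addressed by segmentation.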
In addition to creating the depth map representation of the real-world environment, it is useful to be able to perform a segmentation operation that differentiates individual objects in the environment. For example, a coffee cup placed on a table is a separate object from the table, but the depth map in isolation does not capture this distinction, as it cannot differentiate between an object placed on the table and something that is part of the table itself.
Segmentation algorithms exist, such as those based on machine-learning classifiers or computer vision techniques. Such algorithms are able to differentiate certain objects in an image, and label the associated pixels accordingly. However, these algorithms can be computationally complex and hence demand substantial computational resources, especially to perform the segmentation on real-time depth maps.
The embodiments described below are not limited to implementations which solve any or all of the disadvantages of known segmentation techniques.