As is known in the art, it is frequently desirable to detect and segment an object from a background of other objects and/or from a background of noise. One application, for example, is in MRI, where it is desired to segment an anatomical feature of a human patient, such as a vertebra of the patient. In other cases it is desirable to segment a moving, deformable anatomical feature such as the heart.
In 1988, Osher and Sethian, in a paper entitled “Fronts propagation with curvature dependent speed: Algorithms based on Hamilton-Jacobi formulations”, J. of Comp. Phys., 79:12-49, 1988, introduced the level set method (a precursor of which was proposed by Dervieux and Thomasset in a paper entitled “A finite element method for the simulation of Rayleigh-Taylor instability”, Springer Lect. Notes in Math., 771:145-158, 1979) as a means to implicitly propagate hypersurfaces C(t) in a domain Ω⊂R^n by evolving an appropriate embedding function φ: Ω×[0,T]→R, where:
C(t)={x∈Ω | φ(x,t)=0}.  (1)
More generally, an embedding function is a real-valued height function φ(x) defined at each point x of the image plane, such that the contour C corresponds to all points x in the plane where φ(x)=0:
C={x | φ(x)=0}.
This is a way to represent a contour C implicitly. Rather than working with the contour C directly (moving the contour, etc.), one works with the function φ. Changing the values of φ implicitly moves the “embedded” contour. This is why φ(x) is called an “embedding function”: it embeds the contour as its zero level, i.e. the isoline at value 0.
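The implicit representation can be illustrated with a minimal sketch. It assumes a circle as the contour, a signed distance function as the embedding function, and the convention that the function is negative inside the contour; the source fixes none of these, and the function names `circle_embedding` and `zero_crossing_mask` are hypothetical.

```python
import numpy as np

def circle_embedding(shape, center, radius):
    """Signed distance to a circle: negative inside, zero on the contour, positive outside.
    (The sign convention is an assumption made for this sketch.)"""
    ys, xs = np.mgrid[0:shape[0], 0:shape[1]]
    return np.hypot(xs - center[0], ys - center[1]) - radius

def zero_crossing_mask(phi):
    """Mark pixels next to a sign change of phi, i.e. pixels the embedded contour passes by."""
    inside = phi < 0
    edge = np.zeros_like(inside)
    edge[:-1, :] |= inside[:-1, :] != inside[1:, :]   # sign change between vertical neighbours
    edge[:, :-1] |= inside[:, :-1] != inside[:, 1:]   # sign change between horizontal neighbours
    return edge

phi = circle_embedding((64, 64), center=(32, 32), radius=10.0)
contour_before = zero_crossing_mask(phi)

# Changing the values of phi moves the embedded contour without ever touching it explicitly:
# subtracting 3 from a signed distance function grows the circle from radius 10 to radius 13.
phi_moved = phi - 3.0
contour_after = zero_crossing_mask(phi_moved)

area_before = int((phi < 0).sum())
area_after = int((phi_moved < 0).sum())
```

In this sketch the contour is never stored as a list of points; it is recovered from the sign changes of the array whenever it is needed.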
The ordinary differential equation propagating explicit boundary points is thus replaced by a partial differential equation modeling the evolution of a higher-dimensional embedding function. The key advantages of this approach are well known. Firstly, the implicit boundary representation does not depend on a specific parameterization, so no control point regridding mechanisms need to be introduced during the propagation. Secondly, evolving the embedding function makes it possible to elegantly model topological changes such as splitting and merging of the embedded boundary. In the context of shape modeling and statistical learning of shapes, this property allows shape dissimilarity measures to be defined on the embedding functions that can handle shapes of varying topology. Thirdly, the implicit representation, equation (1), naturally generalizes to hypersurfaces in three or more dimensions. To impose a unique correspondence between a contour and its embedding function, one can constrain φ to be a signed distance function, i.e. |∇φ|=1 almost everywhere.
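The step from the ordinary to the partial differential equation can be made explicit with a short worked derivation. It is a sketch under the assumption that each boundary point moves along the normal with a speed F, and that the normal is taken as the normalized gradient of φ; with the opposite normal orientation the sign of the right-hand side flips.

```latex
\begin{align*}
\frac{dC}{dt} &= F\,\mathbf{n}, \qquad \mathbf{n} = \frac{\nabla\phi}{|\nabla\phi|}, \\
0 = \frac{d}{dt}\,\phi\bigl(C(t),t\bigr)
  &= \frac{\partial\phi}{\partial t} + \nabla\phi\cdot\frac{dC}{dt}
  \quad\Longrightarrow\quad
  \frac{\partial\phi}{\partial t} = -F\,|\nabla\phi| .
\end{align*}
```

The first line is the explicit evolution of the boundary points; the second follows because φ stays zero along the moving contour, as in equation (1).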
The first applications of the level set method to image segmentation were pioneered in the early 1990s by Malladi et al. in a paper entitled “A finite element method for the simulation of Rayleigh-Taylor instability”, Springer Lect. Notes in Math., 771:145-158, 1979; by Caselles et al. in a paper entitled “Geodesic active contour”, in Proc. IEEE Intl. Conf. on Comp. Vis., pages 694-699, Boston, USA, 1995; by Kichenassamy et al. in a paper entitled “Gradient flows and geometric active contour models”, in IEEE Intl. Conf. on Comp. Vis., pages 810-815, 1995; and by Paragios and Deriche in a paper entitled “Geodesic active regions and level set methods for supervised texture segmentation”, Int. J. of Computer Vision, 46(3):223-247, 2002. Level set implementations of the Mumford-Shah functional (see the paper by Mumford and Shah entitled “Optimal approximations by piecewise smooth functions and associated variational problems”, Comm. Pure Appl. Math., 42:577-685, 1989 [14]) were independently proposed by Chan and Vese, see the paper entitled “Active contours without edges”, IEEE Trans. Image Processing, 10(2):266-277, 2001, and by Tsai et al. in a paper entitled “Model-based curve evolution technique for image segmentation”, in Comp. Vision Patt. Recog., pages 463-468, Kauai, Hawaii, 2001.
In recent years, researchers have proposed to introduce statistical shape knowledge into level set based segmentation methods in order to cope with insufficient low-level information. While these priors were shown to drastically improve the segmentation of familiar objects, so far the focus has been on statistical shape priors which are static in time. Yet, in the context of tracking deformable objects, it is clear that certain silhouettes (such as those of the beating human heart in an MRI application, or of a walking person in another application) may become more or less likely over time. Leventon et al., in a paper entitled “Geometry and prior-based segmentation”, in T. Pajdla and V. Hlavac, editors, European Conf. on Computer Vision, volume 3024 of LNCS, pages 50-61, Prague, 2004, Springer, proposed to model the embedding function by principal component analysis (PCA) of a set of training shapes and to add appropriate driving terms to the level set evolution equation. Tsai et al., in a paper entitled “Curve evolution implementation of the Mumford-Shah functional for image segmentation, de-noising, interpolation, and magnification”, IEEE Trans. on Image Processing, 10(8):1169-1186, 2001, suggested performing optimization directly within the subspace of the first few eigenmodes. Rousson et al., see “Shape priors for level set representations”, in A. Heyden et al., editors, Proc. of the Europ. Conf. on Comp. Vis., volume 2351 of LNCS, pages 78-92, Copenhagen, May 2002, Springer, Berlin, and “Implicit active shape models for 3d segmentation in MRI imaging”, in MICCAI, pages 209-216, 2004, suggested introduction of shape information on the variational level, while Chen et al., see “Using shape priors in geometric active contours in a variational framework”, Int. J. of Computer Vision, 50(3):315-328, 2002, imposed shape constraints directly on the contour given by the zero level of the embedding function. More recently, Riklin-Raviv et al., see European Conf. on Computer Vision, volume 3024 of LNCS, pages 50-61, Prague, 2004, Springer, proposed to introduce projective invariance by slicing the signed distance function at various angles.
In the above works, statistically learned shape information was shown to compensate for missing or misleading information in the input images due to noise, clutter and occlusion. The shape priors were developed to segment objects of familiar shape in a given image. Although they can be applied to tracking objects in image sequences, see [Cremers et al., “Nonlinear shape statistics in Mumford-Shah based segmentation”, in A. Heyden et al., editors, Europ. Conf. on Comp. Vis., volume 2351 of LNCS, pages 93-108, Copenhagen, May 2002. Springer], [Moelich and Chan, “Tracking objects with the Chan-Vese algorithm”, Technical Report 03-14, Computational Applied Mathematics, UCLA, Los Angeles, 2003] and [Cremers et al., “Kernel density estimation and intrinsic alignment for knowledge-driven segmentation: Teaching level sets to walk”, in Pattern Recognition, volume 3175 of LNCS, pages 36-44. Springer, 2004], they are not well suited for this task, because they neglect the temporal coherence of silhouettes which characterizes many deforming shapes.
When tracking a three-dimensional deformable object over time, clearly not all shapes are equally likely at a given time instance. Regularly sampled images of a walking person, for example, exhibit a typical pattern of consecutive silhouettes. Similarly, the projections of a rigid 3D object rotating at constant speed are generally not independent samples from a statistical shape distribution. Instead, the resulting set of silhouettes can be expected to contain strong temporal correlations.
In accordance with the present invention, a method is provided for detecting and tracking a deformable object having a sequentially changing behavior, wherein the method develops a temporal statistical shape model of the sequentially changing behavior of the embedding function representing the object from prior motion and then applies the model against future, sequential motion of the object in the presence of unwanted phenomena by maximizing the probability that the developed statistical shape model matches the sequential motion of the object in the presence of unwanted phenomena.
In accordance with another feature of the invention, a method generates a dynamical model of the time-evolution of the embedding function of prior observations of a boundary shape of an object, such object having an observable, sequentially changing boundary shape, and subsequently uses such model for probabilistic inference about such shape of the object in the future.
The method develops temporal statistical shape models for implicitly represented shapes of the object. In particular, the shape probability at a given time is a function of the shapes of the object observed at previous times.
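One way such a dependence on previous shapes could be modeled is sketched below. The sketch assumes a linear autoregressive model of order 2 fitted by least squares to low-dimensional shape vectors (for example, PCA coefficients of the embedding functions); the order, the fitting procedure, and the names `fit_ar` and `predict_next` are illustrative assumptions, not details taken from this section.

```python
import numpy as np

def fit_ar(shapes, order=2):
    """Least-squares fit of alpha_t ~ mu + sum_i A_i alpha_{t-i}.

    shapes: (T, d) array, one hypothetical low-dimensional shape vector per frame.
    Returns the stacked coefficient matrix W and the residual covariance.
    """
    T, d = shapes.shape
    # Regressor row for frame t: [alpha_{t-1}, ..., alpha_{t-order}, 1]
    lagged = [shapes[order - i - 1:T - i - 1] for i in range(order)]
    X = np.hstack(lagged + [np.ones((T - order, 1))])
    Y = shapes[order:]
    W, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ W
    cov = resid.T @ resid / max(len(Y) - 1, 1)
    return W, cov

def predict_next(W, history, order=2):
    """Mean of the next shape vector given the most recent `order` shape vectors."""
    lags = [history[-(i + 1)] for i in range(order)]   # most recent lag first, as in fit_ar
    x = np.concatenate(lags + [np.ones(1)])
    return x @ W

# Synthetic periodic "shape" sequence standing in for learned shape coefficients.
t = np.arange(41)
sequence = np.stack([np.cos(0.3 * t), np.sin(0.3 * t)], axis=1)
W, cov = fit_ar(sequence[:40])
predicted = predict_next(W, sequence[:40])
actual = sequence[40]
```

On this noise-free periodic sequence the prediction reproduces the next sample up to numerical precision, because a sinusoid satisfies an exact second-order linear recurrence. With real training sequences the residual covariance would describe how far the next shape is expected to deviate from the predicted mean.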
In one embodiment, the dynamical shape models are integrated into a segmentation process within a Bayesian framework for level set based image sequence segmentation.
In one embodiment, optimization is obtained by a partial differential equation for the level set function. The optimization includes evolution of an interface which is driven both by the intensity information of a current image and by a dynamical shape prior which relies on the segmentations obtained on the preceding frames.
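The structure of such a Bayesian formulation can be written out as a hedged sketch. The notation is an assumption made for illustration: φ_t is the embedding function for the current frame, I_t the current image, φ̂_{1:t-1} the segmentations obtained on the preceding frames, and τ an artificial descent time. The sketch also assumes that, given the current shape, the image does not depend on the earlier segmentations.

```latex
\begin{align*}
P(\phi_t \mid I_t, \hat\phi_{1:t-1})
  &\propto P(I_t \mid \phi_t)\; P(\phi_t \mid \hat\phi_{1:t-1}), \\
E(\phi_t)
  &= -\log P(I_t \mid \phi_t) \;-\; \log P(\phi_t \mid \hat\phi_{1:t-1}), \\
\frac{\partial \phi_t}{\partial \tau}
  &= -\frac{\partial E}{\partial \phi_t}.
\end{align*}
```

The first term of the energy corresponds to the intensity information of the current image and the second to the dynamical shape prior; gradient descent on their sum yields an evolution equation for the level set function that is driven by both.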
With such method, in contrast to existing approaches to segmentation with statistical shape priors, the resulting segmentations are not only similar to previously learned shapes, but they are also consistent with the temporal correlations estimated from sample sequences. The resulting segmentation process can cope with large amounts of noise and occlusion because it exploits prior knowledge about temporal shape consistency and because it aggregates information from the input images over time (rather than treating each image independently).
The development of dynamical models for implicitly represented shapes and their integration into image sequence segmentation on the basis of the Bayesian framework draws on much prior work in various fields. The theory of dynamical systems and time series analysis has a long tradition in the literature (see for example [A. Papoulis, Probability, Random Variables, and Stochastic Processes, McGraw-Hill, New York, 1984]). Autoregressive models were developed for explicit shape representations by, among others, Blake, Isard and coworkers [A. Blake and M. Isard. Active Contours. Springer, London, 1998]. In these works, successful tracking results were obtained by particle filtering based on edge information extracted from the intensity images. The method of the present invention differs from these in three ways:

First, the dynamical models are for implicitly represented shapes. As a consequence, the dynamical shape model can automatically handle shapes of varying topology. The model trivially extends to higher dimensions (e.g. 3D shapes), since it does not need to deal with the combinatorial problem of determining point correspondences and the issues of control point re-gridding associated with explicit shape representations.

Second, the method according to the present invention integrates the intensity information of the input images in a statistical formulation inspired by [Zhu and Yuille 1996; Chan and Vese 1999]. This leads to a region-based tracking scheme rather than an edge-based one. The statistical formulation implies that, with respect to the assumed intensity models, the method optimally exploits the input information. It does not rely on a pre-computation of heuristically defined image edge features. The assumed probabilistic intensity models are quite simple (namely Gaussian distributions); more sophisticated models for intensity, color or texture of objects and background could be employed.

Third, the Bayesian a posteriori optimization is solved in a variational setting by gradient descent rather than by stochastic sampling techniques. While this limits the algorithms used by the invention to tracking only the most likely hypothesis (rather than multiple hypotheses), it facilitates an extension to higher-dimensional shape representations without the drastic increase in computational complexity inherent to sampling methods.
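The region-based data term with Gaussian intensity models can be illustrated with a minimal sketch. It assumes one Gaussian for the object and one for the background, with mean and variance estimated from the pixels of each region, and the convention that the embedding function is negative inside the object; the function name `gaussian_region_energy` and the synthetic test image are hypothetical.

```python
import numpy as np

def gaussian_region_energy(image, phi):
    """Negative log-likelihood of the pixel intensities under two Gaussian models,
    one for the object (phi < 0) and one for the background (phi >= 0)."""
    energy = 0.0
    for region in (phi < 0, phi >= 0):
        values = image[region]
        if values.size == 0:
            continue
        mu = values.mean()
        var = values.var() + 1e-8          # small floor avoids division by zero
        energy += 0.5 * np.sum((values - mu) ** 2 / var + np.log(2 * np.pi * var))
    return float(energy)

# Synthetic frame: bright disk (intensity 1) on a dark background (intensity 0), plus noise.
rng = np.random.default_rng(0)
ys, xs = np.mgrid[0:64, 0:64]
phi_true = np.hypot(xs - 32, ys - 32) - 10.0
image = (phi_true < 0).astype(float) + 0.1 * rng.standard_normal((64, 64))

# A segmentation aligned with the object explains the intensities better
# than one shifted by 8 pixels, so its energy is lower.
phi_shifted = np.hypot(xs - 40, ys - 32) - 10.0
energy_aligned = gaussian_region_energy(image, phi_true)
energy_shifted = gaussian_region_energy(image, phi_shifted)
```

No edge detection is involved: the energy depends only on how well the two regions separate the intensities, which is what makes the scheme region-based rather than edge-based.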
Recently, Goldenberg et al. [Goldenberg, Kimmel, Rivlin, and Rudzsky, Pattern Recognition, 38:1033-1043, July 2005.] successfully applied PCA to an aligned shape sequence to classify the behavior of periodic shape motion. Though this work is also focused on characterizing moving implicitly represented shapes, it differs from the present invention in that shapes are not represented by the level set embedding function (but rather by a binary mask), it does not make use of autoregressive models, and it is focused on behavior classification of pre-segmented shape sequences rather than segmentation or tracking with dynamical shape priors.
The details of one or more embodiments of the invention are set forth in the accompanying drawings and the description below. Other features, objects, and advantages of the invention will be apparent from the description and drawings, and from the claims.
Like reference symbols in the various drawings indicate like elements.