The problem of segmenting and tracking low-dimensional subspaces embedded in high-dimensional data arises in many applications, such as background segmentation, anomaly detection, motion segmentation, and target localization. For example, a scene acquired by a stationary or moving camera can be partitioned into a low-rank component, spanning a subspace that characterizes a relatively stationary background in the scene, and a sparse component corresponding to moving objects in the video scene, usually in the foreground.
The problem is to identify, at every time step $t$, e.g., for each image in a sequence of images, an $r$-dimensional subspace $\mathcal{U}_t$ in $\mathbb{R}^n$ with $r \ll n$ that is spanned by the columns of a rank-$r$ matrix $U_t \in \mathbb{R}^{n \times r}$ from incomplete and noisy measurements

$$b_t = \Omega_t(U_t a_t + s_t), \qquad (1)$$

where $\Omega_t$ is a selection operator that specifies a subset of the sets of one or more images at time $t$, $a_t \in \mathbb{R}^r$ are coefficients specifying a linear combination of the columns of the matrix $U_t$, and $s_t \in \mathbb{R}^n$ is a vector of sparse outliers.
When the subspace $\mathcal{U}_t$ is stationary, the subscript $t$ is omitted from $U_t$ and the problem reduces to matrix completion or principal component analysis (PCA), where the task is to separate a matrix $B \in \mathbb{R}^{n \times m}$ into a low-rank component $UA$ and a sparse component $S$ using incomplete measurements

$$B_\Omega = \Omega(UA + S).$$

The columns of the matrices $A$ and $S$ are, respectively, the vectors $a_t$ and $s_t$ stacked horizontally for all $t \in \{1, \ldots, m\}$, and the selection operator $\Omega$ specifies the measured data in the matrix $B$.
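As an illustration only, the measurement model of equation (1) can be simulated as in the following minimal sketch. The function name `simulate_measurement`, the boolean-mask representation of the selection operator, and all parameter values (outlier rate, observed fraction, outlier scale) are assumptions made for the example and are not taken from any cited method.

```python
import numpy as np

def simulate_measurement(U, rng, outlier_rate=0.05, observed_fraction=0.7,
                         outlier_scale=10.0):
    """Draw one measurement b = Omega(U a + s) for an n x r basis U.

    The selection operator Omega is modeled (as an assumption) by a boolean
    mask that keeps a random subset of the n entries.
    """
    n, r = U.shape
    a = rng.standard_normal(r)                    # subspace coefficients a_t
    s = np.zeros(n)                               # sparse outlier vector s_t
    outliers = rng.random(n) < outlier_rate
    s[outliers] = outlier_scale * rng.standard_normal(outliers.sum())
    mask = rng.random(n) < observed_fraction      # entries kept by Omega_t
    b = (U @ a + s)[mask]                         # incomplete measurement b_t
    return b, mask, a, s

rng = np.random.default_rng(0)
n, r = 50, 3
U, _ = np.linalg.qr(rng.standard_normal((n, r)))  # orthonormal basis, r << n
b, mask, a, s = simulate_measurement(U, rng)
```

Stacking the vectors `a` and `s` from successive calls as columns would give the matrices A and S of the stationary case.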
Conventional methods for low-dimensional subspace identification first organize the measured data into a matrix and then determine basis vectors that span the subspace using a variety of techniques, e.g., low-rank matrix factorization.
Extensions of those methods factor the matrix into a low-rank component corresponding to the subspace, and a sparse component that represents noise.
However, when the dimensionality of the data becomes large, as is the case for video, latency becomes a problem. Hence, it is necessary to provide a method that can segment and track the low-dimensional subspace as the data are acquired or processed in real time, even when the data are incomplete and corrupted by sparse noise. Another problem is that the low-dimensional subspace (background) can vary over time, in which case the subspace cannot be represented by a low-rank matrix when all data are grouped into one matrix. For example, the background in an outdoor scene can vary in illumination during the day. Similarly, in surveillance videos, erstwhile moving objects can be added to or removed from the background, where they are stationary.
One prior art method, U.S. Pat. No. 7,463,754, “Adaptive probabilistic visual tracking with incremental subspace update,” describes a method for adaptive probabilistic tracking of an object as represented in a motion video. The method identifies an eigenbasis that represents the object being tracked. A maximum a posteriori estimate of the object location is determined using the current estimate of the eigenbasis. The eigenbasis is then updated to account for changes in the appearance of the target object.
Another prior art publication, US 20030108220, “Robust, on-line, view-based appearance models for visual motion analysis and visual tracking,” describes learning an appearance model that includes both a stable model component, learned over a long time course, and a transient component, learned over a relatively short time course. The model parameters are adapted over time using an online expectation-maximization (EM) algorithm.
U.S. Pat. No. 8,477,998, “Object tracking in video with visual constraints,” describes tracking objects represented in a video by determining tracking states of the object based on a pose model, an alignment confidence score, and an adaptive term value. The tracking state defines a likely position of the object in the frame given the object's likely position in a set of previous frames in the video.
Grassmannian Rank-One Update Subspace Estimation (GROUSE) is one method that can handle real-time subspace estimation from incomplete data, see Balzano et al., “Online identification and tracking of subspaces from highly incomplete information,” 48th Annual Allerton Conference on Communication, Control, and Computing (Allerton), pp. 704-711, Sep. 2010. GROUSE uses rank-one updates of the subspace on a Grassmannian manifold. However, GROUSE can get trapped at a local minimum.
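The rank-one update used by GROUSE-style methods can be sketched as follows. This is a simplified illustration written from the general description of the technique, not a reproduction of any reference implementation; the function name `grouse_step`, the constant step size, and the synthetic data are assumptions.

```python
import numpy as np

def grouse_step(U, v_obs, mask, step=0.1):
    """One GROUSE-style rank-one update of an orthonormal basis U (n x r).

    v_obs holds the observed entries of the new data vector and mask is a
    boolean vector of length n marking which entries were observed.
    """
    U_obs = U[mask]
    # Least-squares coefficients of the observed entries in the current basis.
    w, *_ = np.linalg.lstsq(U_obs, v_obs, rcond=None)
    p = U @ w                                   # prediction in the full space
    resid = np.zeros(U.shape[0])
    resid[mask] = v_obs - U_obs @ w             # zero-padded residual
    p_norm = np.linalg.norm(p)
    r_norm = np.linalg.norm(resid)
    w_norm = np.linalg.norm(w)
    if min(p_norm, r_norm, w_norm) < 1e-12:
        return U                                # nothing to update
    theta = step * r_norm * p_norm              # rotation angle
    direction = ((np.cos(theta) - 1.0) * p / p_norm
                 + np.sin(theta) * resid / r_norm)
    return U + np.outer(direction, w / w_norm)  # rank-one update

rng = np.random.default_rng(0)
n, r = 50, 3
U_true, _ = np.linalg.qr(rng.standard_normal((n, r)))
U_est, _ = np.linalg.qr(rng.standard_normal((n, r)))
for _ in range(100):
    mask = rng.random(n) < 0.7                  # assumed 70% of entries observed
    v = U_true @ rng.standard_normal(r)
    U_est = grouse_step(U_est, v[mask], mask)
```

Because the residual is orthogonal to the current basis, the update is a rotation that keeps the columns of the estimate orthonormal, so no re-orthogonalization is needed in this sketch.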
Parallel Subspace Estimation and Tracking by Recursive Least Squares (PETRELS) can also identify a low-dimensional subspace in real time, see Chi et al., “Petrels: Parallel subspace estimation and tracking by recursive least squares from partial observations,” IEEE Transactions on Signal Processing, vol. 61, no. 23, pp. 5947-5959, 2013. PETRELS minimizes, in parallel, a geometrically discounted sum of projection residuals on the data for each time step using a recursive procedure with discounting for each row of the subspace matrix. Neither GROUSE nor PETRELS can correctly handle corrupted data or data subject to non-Gaussian noise.
Grassmannian Robust Adaptive Subspace Tracking Algorithm (GRASTA) is similar to GROUSE, see Cornell University, arXiv:1109.3827, Sep. 2011. GRASTA also updates the subspace on the Grassmannian manifold, but replaces the l2 cost function of GROUSE with an l1-norm cost function. Minimizing this cost function minimizes a sum of absolute errors, which correctly handles outliers in the data.
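The effect of replacing an l2 cost with an l1 cost can be illustrated with the following minimal sketch, which fits subspace coefficients to data containing sparse outliers. Iteratively reweighted least squares (IRLS) is used here only as a simple stand-in for an l1 solver; it is not the solver used by GRASTA, and the function names and synthetic data are assumptions.

```python
import numpy as np

def fit_l2(U, b):
    """Least-squares coefficients: minimizes the sum of squared errors."""
    w, *_ = np.linalg.lstsq(U, b, rcond=None)
    return w

def fit_l1_irls(U, b, n_iter=50, eps=1e-6):
    """Approximate l1 fit (sum of absolute errors) by reweighted least squares."""
    w = fit_l2(U, b)
    for _ in range(n_iter):
        # Large residuals (outliers) receive small weights.
        weights = 1.0 / np.maximum(np.abs(b - U @ w), eps)
        Uw = U * weights[:, None]
        w = np.linalg.solve(U.T @ Uw, Uw.T @ b)
    return w

rng = np.random.default_rng(0)
n, r = 200, 3
U, _ = np.linalg.qr(rng.standard_normal((n, r)))
w_true = rng.standard_normal(r)
s = np.zeros(n)
s[rng.choice(n, size=10, replace=False)] = 20.0      # sparse outliers
b = U @ w_true + s + 0.01 * rng.standard_normal(n)   # small dense noise

err_l2 = np.linalg.norm(fit_l2(U, b) - w_true)
err_l1 = np.linalg.norm(fit_l1_irls(U, b) - w_true)
print(f"l2 error: {err_l2:.3f}, l1 error: {err_l1:.3f}")
```

In this setup the squared-error fit is pulled toward the outliers, whereas the absolute-error fit largely ignores them.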
Another real-time method is Recursive Projected Compressive Sensing (ReProCS), see Qiu et al., “ReProCS: A missing link between recursive robust PCA and recursive sparse recovery in large but correlated noise,” CoRR, vol. abs/1106.3286, 2011. ReProCS recursively projects data onto the orthogonal complement of the subspace, followed by sparse recovery to determine outliers. However, that method requires an accurate initial estimate of the subspace.
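The projection-then-sparse-recovery idea can be sketched for a single data vector as follows, assuming the subspace basis is known exactly. The sparse recovery step uses a generic iterative soft-thresholding (ISTA) loop chosen for brevity; it is an assumption of this example rather than the solver of the cited work, as are the function names, the regularization weight, and the detection threshold.

```python
import numpy as np

def soft_threshold(x, tau):
    """Elementwise soft-thresholding, the proximal operator of the l1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def recover_outliers(U, b, lam=0.1, n_iter=200):
    """Project b onto the orthogonal complement of span(U), then estimate the
    sparse outlier vector by ISTA on 0.5*||y - P s||^2 + lam*||s||_1."""
    def project(x):
        return x - U @ (U.T @ x)        # P = I - U U^T (U orthonormal)

    y = project(b)                      # low-rank part is annihilated
    s = np.zeros_like(b)
    for _ in range(n_iter):
        # Gradient step with unit step size (P has spectral norm 1).
        s = soft_threshold(s + project(y - project(s)), lam)
    return y, s

rng = np.random.default_rng(0)
n, r = 100, 3
U, _ = np.linalg.qr(rng.standard_normal((n, r)))
support = rng.choice(n, size=5, replace=False)
s_true = np.zeros(n)
s_true[support] = 10.0 * rng.choice([-1.0, 1.0], size=5)
b = U @ rng.standard_normal(r) + s_true

y, s_hat = recover_outliers(U, b)
detected = np.flatnonzero(np.abs(s_hat) > 5.0)   # assumed detection threshold
```

The sketch relies on the basis `U` being accurate: if `U` is a poor estimate, the projection no longer removes the low-rank component, which is the limitation noted above.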