Low-Rank Representation of High-Dimensional Data
Recovering and using low-rank representations of high-dimensional data is a common task in computer vision, statistical learning, and neural networks.
Principal Component Analysis (PCA) is one method of determining an optimal low-rank representation of high-dimensional data in an l2-norm sense. However, PCA is susceptible to statistical outliers, which are ubiquitous in realistic image data due to occlusion, illumination changes, and noise.
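For concreteness, the optimal rank-r representation in the l2 (Frobenius-norm) sense is given by a truncated SVD of the mean-centered data (the Eckart–Young theorem). A minimal NumPy sketch, illustrative only:

```python
import numpy as np

def pca_low_rank(X, r):
    """Best rank-r approximation of the data matrix X (columns = samples)
    in the least-squares (Frobenius-norm) sense, via truncated SVD."""
    mean = X.mean(axis=1, keepdims=True)       # per-dimension mean
    Xc = X - mean                              # center the data
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    A = U[:, :r] * s[:r] @ Vt[:r, :]           # keep r principal components
    return A + mean

# Noisy rank-2 data is recovered well by a rank-2 projection
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 2)) @ rng.standard_normal((2, 200))
X_noisy = X + 0.01 * rng.standard_normal(X.shape)
err = np.linalg.norm(pca_low_rank(X_noisy, 2) - X) / np.linalg.norm(X)
```

A single large outlier entry in `X_noisy` would noticeably perturb this reconstruction, which is the sensitivity the robust methods below address.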
Numerous methods are known for error-resilient PCA, e.g., RANdom SAmple Consensus (RANSAC), influence function techniques, and l1-norm minimization.
Among the l1-norm techniques, low-rank recovery methods can be classified into two groups:
(1) Nuclear Norm Minimization (NNM); and
(2) Matrix Factorization (MF).
Nuclear Norm Minimization
NNM uses the nuclear norm as a surrogate for the highly non-convex rank minimization; examples include Robust PCA (RPCA) and Sparse and Low-Rank Matrix Decomposition (SLRMD), which share the same idea. RPCA is a convex problem with a performance guarantee. RPCA assumes the acquired data are in the form of an observation matrix X ∈ ℝ^{m×n} with two additive components:
(1) a low-rank matrix A (rank r << min{m, n}); and
(2) a sparse matrix E.
By minimizing the nuclear norm of A and the l1-norm of E, RPCA can recover a low-rank matrix from corrupted data as follows:
    min_{A,E} λ‖A‖_* + ‖E‖_1   s.t.   A + E = X,

where λ is a weighting coefficient, and the nuclear norm ‖A‖_* is equal to the sum of the singular values,

    ‖A‖_* = Σ_{i=1}^{min{m,n}} s_i,

where the s_i are the singular values, which are simply the absolute values of the eigenvalues in case A is a normal matrix (A*A = AA*, where A* is the conjugate transpose of A).
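As a quick numerical check of this definition (a minimal NumPy illustration, not part of the method itself):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 5))
A = A + A.T                       # symmetric, hence normal: A*A = AA*

# Nuclear norm = sum of singular values
nuc = np.linalg.svd(A, compute_uv=False).sum()
# For a normal matrix, singular values are |eigenvalues|
eig_abs = np.abs(np.linalg.eigvalsh(A)).sum()
```

Here `nuc` and `eig_abs` agree, and both match NumPy's built-in `np.linalg.norm(A, 'nuc')`.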
However, at each iteration, RPCA needs to perform a Singular Value Decomposition (SVD) and do shrinkage in the principal subspace, which is computationally demanding, with a complexity of O(min(m²n, mn²)).
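The iteration can be sketched with an augmented Lagrangian (ADMM-style) scheme; the function names, the fixed penalty growth schedule, and the choice λ = √max(m, n) below are illustrative assumptions rather than the reference algorithm. Note the full SVD inside the loop, which is the costly step discussed above.

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: prox of tau * nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)  # costly: full SVD
    return U * np.maximum(s - tau, 0) @ Vt

def soft(M, tau):
    """Entrywise soft thresholding: prox of tau * l1 norm."""
    return np.sign(M) * np.maximum(np.abs(M) - tau, 0)

def rpca(X, lam, n_iter=200):
    """Minimize lam*||A||_* + ||E||_1  s.t.  A + E = X,
    via an inexact augmented Lagrangian iteration (sketch)."""
    mu = 1.25 / np.linalg.svd(X, compute_uv=False)[0]  # penalty parameter
    Y = np.zeros_like(X)                               # Lagrange multiplier
    E = np.zeros_like(X)
    for _ in range(n_iter):
        A = svt(X - E + Y / mu, lam / mu)   # shrink in the principal subspace
        E = soft(X - A + Y / mu, 1.0 / mu)  # shrink the sparse component
        Y = Y + mu * (X - A - E)            # dual update
        mu = min(mu * 1.05, 1e7)            # assumed growth schedule
    return A, E
```

On a synthetic matrix built as a rank-2 component plus sparse large-magnitude corruptions, this sketch separates the two components to within a few percent relative error.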
Instead of the full SVD, a partial RPCA determines only the principal singular values. However, partial RPCA requires predicting the dimension of the principal singular subspace. To reduce the computational burden of the SVD, random projection based RPCA (RP-RPCA) enforces the rank minimization on a randomly projected matrix A′=PA, instead of on the larger matrix A.
Unfortunately, due to the large null space of the projection matrix P, RP-RPCA requires the SVD to be applied to a different projected matrix A′ at each iteration, so the resulting overall computational burden is actually higher. In addition, the low-rank recovery of the matrix A is not guaranteed.
Matrix Factorization
To accelerate the low-rank recovery, matrix factorization (MF) can be used. MF includes Robust Matrix Factorization (RMF), Robust Dictionary Learning (RDL), and Low-rank Matrix Fitting (LMaFit).
MF relaxes the rank minimization by representing the matrix A as A = Dα in some compact subspace spanned by D ∈ ℝ^{m×k}, with coefficients α ∈ ℝ^{k×n} and r ≤ k << min(m, n).
The matrix A is recovered by iterating between solving for the coefficients α and updating the dictionary D. Because the costly SVD is avoided, the MF methods are superior to RPCA in terms of computational efficiency.
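The alternating scheme can be sketched as follows; plain least-squares updates are shown for simplicity, whereas robust variants such as RMF and RDL replace the l2 loss with an l1 loss:

```python
import numpy as np

def mf_low_rank(X, k, n_iter=100):
    """Recover a rank-k approximation A = D @ alpha by alternating between
    solving for the coefficients alpha and updating the dictionary D.
    Least-squares updates (a non-robust sketch of the MF iteration)."""
    m, n = X.shape
    rng = np.random.default_rng(0)
    D = rng.standard_normal((m, k))                        # initial dictionary
    for _ in range(n_iter):
        alpha = np.linalg.lstsq(D, X, rcond=None)[0]       # fix D, solve alpha
        D = np.linalg.lstsq(alpha.T, X.T, rcond=None)[0].T # fix alpha, solve D
    return D @ alpha
```

Note that `k` is the initial rank estimate the text refers to: it must be supplied up front, and each least-squares solve touches the full matrix, giving the quadratic per-iteration cost listed among the drawbacks.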
However, MF methods suffer from potential drawbacks:
(1) the non-convex model can lead to local minima;
(2) an initial rank estimate is required, which is not easily obtained, especially when the low-rank matrix is corrupted by outliers of dominant magnitude; and
(3) the quadratic complexity is still high for large-scale low-rank matrix recovery.
Group sparsity is a common regularizer in sparse coding and compressive sensing (CS).
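For concreteness, the proximal operator of the l2,1 group-sparsity regularizer (block soft-thresholding, a standard building block in group-sparse coding) can be sketched as follows; the function name and group layout are illustrative assumptions:

```python
import numpy as np

def group_soft_threshold(x, groups, tau):
    """Prox of tau * sum_g ||x_g||_2: shrink each group's l2 norm by tau,
    zeroing entire groups at once (group sparsity)."""
    out = np.zeros_like(x, dtype=float)
    for g in groups:
        norm = np.linalg.norm(x[g])
        if norm > tau:
            out[g] = (1 - tau / norm) * x[g]   # shrink the whole group
        # else: the group's energy is below tau, so it is zeroed entirely
    return out

# A strong group survives (shrunk); a weak group is zeroed as a block
x = np.array([3.0, 4.0, 0.1, -0.1])
groups = [np.array([0, 1]), np.array([2, 3])]
y = group_soft_threshold(x, groups, 1.0)
```

Unlike entrywise soft thresholding, this operator enforces sparsity at the level of whole groups of coefficients rather than individual entries.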