1. Field of Invention
The present invention relates to the field of matrix factorization. More specifically, it relates to the field of matrix factorization with incorporated data classification properties.
2. Description of Related Art
Matrix factorization is a mechanism by which a large matrix U (where Uε) is factorized into the product of two, preferably smaller, matrices: a basis matrix V (where Vε) and a coefficient matrix X (where Xε). A motivation for this is that it is often easier to store and manipulate the smaller matrices V and X than it is to work with a single, large matrix U. However, since not all matrices can be factorized perfectly, if at all, the product of matrices V and X is often only an approximation of U. An objective of matrix factorization is therefore to identify matrices V and X such that, when they are multiplied together, the result closely matches matrix U with minimal error.
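The notion of an approximate factorization with a measurable error can be illustrated with a minimal sketch. The matrix values, the rank-1 factors, and the helper names `matmul` and `frobenius_error` below are hypothetical and chosen only for illustration; the Frobenius norm is assumed here as the error measure, which the text above does not specify.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def frobenius_error(u, v, x):
    """Return ||U - VX||_F, the size of the reconstruction error."""
    vx = matmul(v, x)
    return sum((u[i][j] - vx[i][j]) ** 2
               for i in range(len(u))
               for j in range(len(u[0]))) ** 0.5


# Hypothetical 4x3 data matrix U (illustrative values only).
# The last entry keeps U from being exactly rank 1.
U = [[2.0, 4.0, 6.0],
     [1.0, 2.0, 3.0],
     [3.0, 6.0, 9.0],
     [1.0, 2.0, 4.0]]

# Rank-1 factors: V is 4x1 (basis), X is 1x3 (coefficients).
V = [[2.0], [1.0], [3.0], [1.0]]
X = [[1.0, 2.0, 3.0]]

# V and X together hold 7 numbers instead of the 12 in U,
# at the cost of a nonzero reconstruction error.
error = frobenius_error(U, V, X)
print(error)
```

In this sketch the product VX reproduces every entry of U except the last one, so the factors are smaller to store but only approximate U.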
Among the different approaches to matrix factorization, nonnegative matrix factorization (NMF) has gained favor in the community due to its ease of implementation and useful applications.
Nonnegative matrix factorization has recently been used for various applications, such as face recognition, multimedia, text mining, and gene expression discovery. NMF is a part-based representation wherein nonnegative inputs are represented by additive combinations of nonnegative bases. The inherent nonnegativity constraint in NMF leads to improved physical interpretation compared to other factorization methods, such as Principal Component Analysis (PCA).
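As a rough illustration of how nonnegative factors can be obtained, the following sketch uses the well-known multiplicative update rules of Lee and Seung for the squared-error objective. This is one common way to compute a basic NMF and is not taken from the works discussed here; the function name `nmf`, the data values, the rank, the iteration count, and the small constant `eps` are all assumptions made for the example.

```python
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(row) for row in zip(*a)]


def frobenius_error(u, v, x):
    """Return ||U - VX||_F."""
    vx = matmul(v, x)
    return sum((u[i][j] - vx[i][j]) ** 2
               for i in range(len(u))
               for j in range(len(u[0]))) ** 0.5


def nmf(u, rank, iters=200, eps=1e-9, seed=0):
    """Factor nonnegative U (d x n) into V (d x rank) and X (rank x n)."""
    rng = random.Random(seed)
    d, n = len(u), len(u[0])
    # Strictly positive random start, so every update stays nonnegative.
    v = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(d)]
    x = [[rng.random() + 0.1 for _ in range(n)] for _ in range(rank)]
    for _ in range(iters):
        # X <- X * (V^T U) / (V^T V X), element-wise.
        vt = transpose(v)
        num, den = matmul(vt, u), matmul(matmul(vt, v), x)
        x = [[x[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)]
             for i in range(rank)]
        # V <- V * (U X^T) / (V X X^T), element-wise.
        xt = transpose(x)
        num, den = matmul(u, xt), matmul(v, matmul(x, xt))
        v = [[v[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(d)]
    return v, x


# Hypothetical nonnegative 4x3 input (illustrative values only).
U = [[2.0, 4.0, 6.0],
     [1.0, 2.0, 3.0],
     [3.0, 6.0, 9.0],
     [1.0, 2.0, 4.0]]

V0, X0 = nmf(U, rank=2, iters=0)   # the random starting point
V, X = nmf(U, rank=2)              # the same start after 200 updates
print(frobenius_error(U, V0, X0), frobenius_error(U, V, X))
```

Because each update only multiplies nonnegative quantities, V and X never leave the nonnegative range, which is what allows the input to be read as an additive combination of nonnegative bases. Nothing in these updates uses class labels, which is the unsupervised character of basic NMF.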
Although NMF and its variants are well suited for recognition applications, they lack classification capability. This lack of classification capability is a natural consequence of their unsupervised factorization method, which does not utilize relationships within input entities, such as class labels.
Several approaches have been proposed for NMF to generate more descriptive features for classification and clustering tasks. For example, “Fisher Nonnegative Matrix Factorization”, ACCV, 2004, by Y. Wang, Y. Jia, C. Hu, and M. Turk, proposes a cost function that combines the NMF cost function with the difference of the between-class scatter from the within-class scatter. However, the objective of this Fisher-NMF is not guaranteed to converge since it may not be a convex function. “Non-negative Matrix Factorization on Manifold”, ICDM, 2008, by D. Cai, X. He, X. Wu, and J. Han, proposes graph regularized NMF (GNMF), which appends terms representing favorable relationships among feature vector pairs. However, GNMF is handicapped by not considering unfavorable relationships.
A different approach better suited for classification is a technique called “graph embedding”, which is derived from topological graph theory. Graph embedding embeds a graph G on a surface; that is, it is a representation of graph G on the surface in which points of the surface are associated with vertices.
Recently, J. Yang, S. Yang, Y. Fu, X. Li, and T. Huang suggested combining a variation of graph embedding with nonnegative matrix factorization in an approach termed “Non-negative graph embedding” (NGE), in CVPR, 2008. NGE resolved the previous problems by introducing the concept of complementary space, and it is widely considered the state of the art. NGE, however, does not use true graph embedding, and instead utilizes an approximate formulation of graph embedding. As a result, NGE is not effective enough for classification, particularly when intra-class variations are large.
In a general sense, all of these previous works tried to incorporate NMF with graph embedding, but none of them successfully adopted the original formulation of graph embedding because the resulting optimization problem is considered intractable. In addition, all of these works are limited in that they depend on suitable parameters, which are not easy to determine.
It is an object of the present invention to incorporate NMF with graph embedding using the original formulation of graph embedding.
It is another object of the present invention to permit the use of negative values in the definition of graph embedding without violating the requirement of NMF to limit itself to nonnegative values.