In general, the field of machine learning concerns the design and development of algorithms and techniques that allow computers to “learn”. A major focus of machine learning research is the automatic extraction of information from data. One particular area of machine learning is concerned with detecting structures in data, which may also be known as structural perception.
Structural perception of data plays a fundamental role in pattern analysis and machine learning. Classical methods for structural analysis of data include principal component analysis (PCA) and multidimensional scaling (MDS), which perform dimensionality reduction by preserving global structures of data. Another current method for structural analysis of data is non-negative matrix factorization (NMF), which learns local representations of data. K-means may also be frequently employed to identify underlying clusters in data. The underlying assumption behind the above methods is that the spaces in which data points, or data samples, lie are Euclidean. In other current methods, a non-Euclidean perception of data may be used. Nonlinear structures of data may be modeled by preserving the global geometry of data manifolds (geodesic distances, as in isometric feature mapping (Isomap)) or their local geometry (locally linear fittings, as in locally linear embedding (LLE)). These two methods steered the structural perception of data toward manifold-based approaches.
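By way of illustration only, dimensionality reduction that preserves global structure, as PCA does, may be sketched as follows. The function name `pca_project`, its parameters, and the use of NumPy are assumptions made for this example and are not part of any particular prior method.

```python
import numpy as np


def pca_project(points, n_components):
    """Project points onto their leading principal directions (illustrative sketch).

    `pca_project` is a hypothetical helper name chosen for this example.
    """
    X = np.asarray(points, dtype=float)
    # Center the data so that the principal directions pass through the mean.
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing singular value.
    _, singular_values, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Sample variance captured along each retained direction.
    explained_variance = singular_values[:n_components] ** 2 / (X.shape[0] - 1)
    # Coordinates of each point in the reduced space.
    projected = X_centered @ components.T
    return projected, components, explained_variance


# Usage: four points lying on the line y = 2x are described by one component.
projected, components, explained_variance = pca_project(
    [[0, 0], [1, 2], [2, 4], [3, 6]], n_components=1
)
```

Because the projection is a single linear map fitted to all points at once, a sketch of this kind reflects the Euclidean assumption noted above and would not be expected to follow a curved data manifold.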
In recent years, spectral graph partitioning has become a powerful tool for structural perception of data. Representative methods include normalized cuts for image segmentation and the Ng, Jordan and Weiss (NJW) algorithm for data clustering. In traditional spectral clustering, the structure of data may be modeled by undirected weighted graphs, and underlying clusters may be found by graph embeddings. For example, such methods may find clusters from spectral properties of normalized weighted adjacency matrices. For semi-supervised structural perception, it may be necessary to detect partial manifold structures of data, given one or more labeled points on data manifolds. Transductive inference (or ranking) may be performed on data manifolds or graph data.
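By way of illustration only, clustering from the spectral properties of a normalized weighted adjacency matrix, in the style of the NJW algorithm, may be sketched as follows. The Gaussian affinity, the bandwidth `sigma`, the farthest-point initialization of k-means, and the function name `njw_spectral_clustering` are assumptions made for this example.

```python
import numpy as np


def njw_spectral_clustering(points, k, sigma=1.0, n_iter=50):
    """NJW-style spectral clustering (illustrative sketch, hypothetical name)."""
    X = np.asarray(points, dtype=float)
    # Undirected weighted graph: Gaussian affinities (an assumed choice), no self-loops.
    sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    W = np.exp(-sq_dists / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Normalized weighted adjacency matrix D^{-1/2} W D^{-1/2}.
    # Assumes every point has nonzero total affinity to the others.
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1))
    L = W * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    # Graph embedding: eigenvectors of the k largest eigenvalues
    # (np.linalg.eigh returns eigenvalues in ascending order).
    _, eigenvectors = np.linalg.eigh(L)
    Y = eigenvectors[:, -k:]
    # Rescale each embedded point to unit length.
    Y = Y / np.linalg.norm(Y, axis=1, keepdims=True)
    # K-means on the embedded points, with deterministic farthest-point initialization.
    centers = [Y[0]]
    for _ in range(1, k):
        nearest = np.min([np.sum((Y - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(Y[np.argmax(nearest)])
    centers = np.array(centers)
    labels = np.zeros(len(Y), dtype=int)
    for _ in range(n_iter):
        dists = np.sum((Y[:, None, :] - centers[None, :, :]) ** 2, axis=-1)
        labels = np.argmin(dists, axis=1)
        new_centers = np.array([
            Y[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


# Usage: two well-separated groups of three points each.
labels = njw_spectral_clustering(
    [[0, 0], [0.1, 0], [0, 0.1], [5, 5], [5.1, 5], [5, 5.1]], k=2
)
```

In a sketch of this kind every point, including any noise point, contributes edges to the graph, so points lying between clusters may blur the block structure of the normalized matrix on which the embedding relies.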
However, existing spectral methods for the structural perception of data may not be robust enough to achieve good results when the structures of data are contaminated by noise points. In addition, data structures may not be described well by traditional Euclidean-based distances. What is needed is a method for structural analysis of data that may correctly perceive data structures from noisy data.