Many applications exploit high-dimensional data, including image processing, object recognition, information retrieval, audio/video processing and bioinformatics. Data such as images, audio files, videos and text documents typically lie in high-dimensional feature spaces where the number of dimensions may be a six-digit figure or higher. For example, a corpus of text documents may be represented in a high-dimensional space in which each unique word present in at least one of the documents is a dimension.
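By way of illustration only, the text-document representation mentioned above may be sketched as follows. The three-document corpus and the helper names `build_vocabulary` and `to_vector` are hypothetical and are chosen purely for this example; whitespace tokenization and raw word counts are simplifying assumptions rather than a required encoding.

```python
from collections import Counter


def build_vocabulary(documents):
    """Return a sorted list of every unique word found in at least one document."""
    vocabulary = set()
    for text in documents:
        vocabulary.update(text.lower().split())
    return sorted(vocabulary)


def to_vector(text, vocabulary):
    """Map one document to a count vector with one entry (dimension) per vocabulary word."""
    counts = Counter(text.lower().split())
    return [counts[word] for word in vocabulary]


# Hypothetical corpus, used only for illustration.
corpus = [
    "the cat sat on the mat",
    "the dog chased the cat",
    "stock prices fell sharply",
]

vocabulary = build_vocabulary(corpus)
vectors = [to_vector(text, vocabulary) for text in corpus]

# Every document becomes a vector whose length equals the vocabulary size.
print(len(vocabulary), [len(v) for v in vectors])
```

Even this small corpus yields one dimension per distinct word. For a realistic corpus the vocabulary, and hence the dimensionality, may grow to hundreds of thousands of entries, most of which are zero for any single document.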
In order to exploit such high-dimensional data in tasks such as object recognition, document clustering and the like, one option is to map the data to a lower-dimensional space or to find lower-dimensional structure within the high-dimensional data. One approach has been to use Principal Component Analysis (PCA) to find lower-dimensional structure within high-dimensional data. However, the presence of noise and distortion in the data may degrade the performance of known PCA processes. This may happen, for example, where image capture devices introduce additive noise and/or where occlusions are present in a captured image. For applications where noise and distortion present a problem, one option is to use a modified PCA process which is more robust. However, robust PCA processes often suffer from efficiency problems when the input data is large in scale or when real-time operation is required.
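The sensitivity of conventional PCA to corrupted data noted above may be illustrated with a minimal sketch. It estimates the first principal direction of a handful of two-dimensional points by power iteration on the sample covariance matrix, then repeats the estimate after one point has been grossly corrupted. The function name `first_principal_component`, the data values and the use of power iteration are assumptions made for illustration; this is conventional PCA, not a robust PCA process.

```python
import math


def first_principal_component(points, iterations=200):
    """Estimate the leading principal direction of 2-D points by power iteration."""
    n = len(points)
    mean = [sum(p[i] for p in points) / n for i in range(2)]
    centered = [[p[0] - mean[0], p[1] - mean[1]] for p in points]
    # 2x2 sample covariance matrix of the centered points.
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(2)]
        for i in range(2)
    ]
    v = [1.0, 0.5]  # arbitrary non-zero starting vector
    for _ in range(iterations):
        w = [
            cov[0][0] * v[0] + cov[0][1] * v[1],
            cov[1][0] * v[0] + cov[1][1] * v[1],
        ]
        norm = math.hypot(w[0], w[1])
        v = [w[0] / norm, w[1] / norm]
    return v


# Hypothetical data: five points lying exactly on the line y = x.
clean = [(-2, -2), (-1, -1), (0, 0), (1, 1), (2, 2)]
# The same points with a single grossly corrupted observation.
corrupted = [(-2, -2), (-1, -1), (0, 0), (1, 1), (2, -20)]

pc_clean = first_principal_component(clean)
pc_corrupted = first_principal_component(corrupted)
print(pc_clean, pc_corrupted)
```

For the clean points the estimated direction lies along the diagonal on which the data sit. With a single corrupted point the estimate is pulled close to the vertical axis, so the recovered low-dimensional structure no longer reflects the majority of the data.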
The embodiments described below are not limited to implementations which solve any or all of the disadvantages of known processes for finding low-dimensional structure from high-dimensional data.