1. Field of the Invention
The present invention relates to scaling of multi-dimensional data sets and, more particularly, to non-linear mapping of a sample of points from a multi-dimensional data set, determining one or more non-linear functions for the mapped sample of points, and mapping additional points using the one or more non-linear functions, including mapping members of the original multi-dimensional data set and mapping new and previously unseen points.
2. Related Art
Conventional techniques for multi-dimensional scaling do not scale well for large multi-dimensional data sets.
What is needed is a method, system, and computer program product for multi-dimensional scaling, which is fast and efficient for large multi-dimensional data sets.
A method, system and computer program product for scaling, or dimensionally reducing, multi-dimensional data sets, that scales well for large data sets. The invention scales multi-dimensional data sets by determining one or more non-linear functions between a sample of points from the multi-dimensional data set and a corresponding set of dimensionally reduced points, and thereafter using the non-linear function to non-linearly map additional points. The additional points may be members of the original multi-dimensional data set or may be new, previously unseen points. In an embodiment, the invention begins with a sample of points from an n-dimensional data set and a corresponding set of m-dimensional points. Alternatively, the invention selects a sample of points from an n-dimensional data set and non-linearly maps the sample of points to obtain the corresponding set of m-dimensional points. Any suitable non-linear mapping or multi-dimensional scaling technique can be employed. The process then trains a system (e.g., a neural network), using the corresponding sets of points. During, or at the conclusion of the training process, the system develops or determines a relationship between the two sets of points. In an embodiment, the relationship is in the form of one or more non-linear functions. The one or more non-linear functions are then implemented in a system. Thereafter, additional n-dimensional points are provided to the system, which maps the additional points using the one or more non-linear functions, which is much faster than using conventional multi-dimensional scaling techniques. In an embodiment, the determination of the non-linear relationship is performed by a self-learning system such as a neural network. The additional points are then be mapped using the self-learning system in a feed-forward manner.