1. Technical Field
The present invention relates to processing data, and more particularly to systems and methods for determining a composite distance metric between data from different sources.
2. Description of the Related Art
Distance metric learning is one of the most fundamental problems in data mining. Existing techniques aim at constructing a single distance metric directly from the data. However, in real applications, multiple base distance metrics may already exist. For example, in healthcare applications, different physicians may have different patient distance metrics in mind.
Distance Metric Learning (DML) is applicable in data mining and machine learning fields. Most DML algorithms are learned directly from the data. Depending on the availability of supervision information in the training data set (e.g., labels or constraints), a DML algorithm can be classified as unsupervised, or semi-supervised and supervised. In particular, supervised DML (SDML) constructs a proper distance metric that leads data from the same class closer to each other, while data from different classes are moved further apart from each other. In fact, SDML can be categorized as including global and local methods. A global SDML method attempts to learn a distance metric that keeps all data points within the same class close, while separating all data points from different classes far apart. Typical approaches in this category include Linear Discriminant Analysis (LDA) and its variants.
Although global SDML approaches achieve empirical success in many applications, it is difficult for a global SDML to separate data from different classes since the data distribution is usually very complicated (e.g., the data from different classes are entangled with each other). Local SDML methods, on the other hand, first construct local regions (e.g., neighborhoods around each data point) and, in each local region, attempts to pull data within the same class closer while pushing data in different classes further apart. Some representative algorithms include Large Margin Nearest Neighbor (LMNN) classifiers, Neighborhood Component Analysis (NCA) and Locality Sensitive Discriminant Analysis (LSDA). It is observed that these local methods can generally perform much better than global methods.
A related topic includes multiple kernel learning, which has been studied extensively in the machine learning and vision community. The goal in multiple kernel learning is to learn a strong kernel by integrating multiple weak kernels. In healthcare applications, multiple patient-patient kernel matrices are combined into a strong kernel to assess patient similarity. However, the practical difficulty of multiple kernel learning includes the following: 1) multiple kernel learning is not easy to generalize to new data points. For example, the existing similarity kernel will not be able to handle new patient arrivals until the kernel is recomputed to capture the new patient. 2) The computation complexity for multiple kernel learning is prohibitively expensive, often requiring a computational complexity of O(N3), where N is the number of data points. These challenges significantly limit the practical value of multiple kernel learning.