In recent years, machine-learning approaches to data analysis have been widely explored for recognizing patterns. Such recognition, in turn, allows significant information to be extracted from a large data set that may also contain irrelevant detail. Learning machines comprise algorithms that may be trained to generalize using data with known outcomes. Trained learning machine algorithms may then be applied to predict the outcome in cases of unknown outcome, i.e., to classify the data according to learned patterns. Machine-learning approaches, which include neural networks, hidden Markov models, belief networks and kernel-based classifiers such as support vector machines, are ideally suited for domains characterized by the existence of large amounts of data, noisy patterns and the absence of general theories. Support vector machines are disclosed in U.S. Pat. Nos. 6,128,608 and 6,157,921, both of which are assigned to the assignee of the present application and are incorporated herein by reference.
Many successful approaches to pattern classification, regression, clustering, and novelty detection problems rely on kernels for determining the similarity of a pair of patterns. These kernels are usually defined for patterns that can be represented as vectors of real numbers. For example, linear kernels, radial basis function kernels, and polynomial kernels all measure the similarity of a pair of real vectors. Such kernels are appropriate when the patterns are best represented as sequences of real numbers. The choice of a kernel corresponds to the choice of representation of the data in a feature space. In many applications, however, the patterns have a greater degree of structure, and this structure can be exploited to improve the performance of the learning system. Examples of the types of structured data that commonly occur in machine learning applications are strings, such as DNA sequences and documents; trees, such as parse trees used in natural language processing; graphs, such as web sites or chemical molecules; signals, such as ECG signals and microarray expression profiles; spectra; images; spatio-temporal data; and relational data, among others.
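By way of illustration only, the three vector kernels named above may be sketched as follows. The function names and the default values of `degree`, `coef0`, and `gamma` are assumptions chosen for the example and are not taken from the disclosure.

```python
import math


def linear_kernel(x, y):
    """Inner product of two real vectors of equal length."""
    return sum(a * b for a, b in zip(x, y))


def polynomial_kernel(x, y, degree=2, coef0=1.0):
    """(<x, y> + coef0) ** degree; degree and coef0 are illustrative defaults."""
    return (linear_kernel(x, y) + coef0) ** degree


def rbf_kernel(x, y, gamma=0.5):
    """exp(-gamma * ||x - y||^2); gamma is an illustrative default."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Two example patterns represented as vectors of real numbers.
x = [1.0, 2.0]
y = [2.0, 0.0]

similarities = {
    "linear": linear_kernel(x, y),
    "polynomial": polynomial_kernel(x, y),
    "rbf": rbf_kernel(x, y),
}
```

Each function returns a single real number that grows as the two vectors become more similar under the corresponding feature-space representation; the radial basis function kernel, for instance, equals 1 when the two vectors coincide.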
For structured objects, kernel methods are often applied by first finding a mapping from the structured objects to vectors of real numbers. In one embodiment of the kernel selection method, the invention described herein provides an alternative approach which may be applied to the selection of kernels for use with structured objects.
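The conventional route described above, in which a structured object is first mapped to a vector of real numbers, may be sketched for strings as follows. The choice of substring (k-mer) counts as the feature map, and the names `kmer_counts` and `spectrum_kernel`, are assumptions made for illustration; this is not the alternative approach of the invention.

```python
from collections import Counter


def kmer_counts(sequence, k=2):
    """Map a string to a sparse vector of counts of its length-k substrings."""
    return Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))


def spectrum_kernel(s, t, k=2):
    """Inner product of the k-mer count vectors of two strings."""
    counts_s = kmer_counts(s, k)
    counts_t = kmer_counts(t, k)
    # Only k-mers present in both strings contribute to the inner product.
    return sum(count * counts_t[kmer] for kmer, count in counts_s.items())


# Two short DNA-like strings used purely as example inputs.
similarity = spectrum_kernel("GATTACA", "ATTAC", k=2)
```

Because the kernel is an inner product of explicit count vectors, any learning machine that accepts a kernel can be applied to the strings without further modification.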
Many problems in bioinformatics, chemistry and other industrial processes involve the measurement of some features of samples by means of an apparatus whose operation is subject to fluctuations, influenced for instance by the details of the preparation of the measurement, or by environmental conditions such as temperature. For example, analytical instruments that rely on scintillation crystals for detecting radiation are known to be temperature sensitive, with additional noise or signal drift occurring if the crystals are not maintained within a specified temperature range. Data recorded using such measurement devices is subject to problems in subsequent processing, which can include machine learning methods. Therefore, it is desirable to provide automated ways of dealing with such data.
In certain classification tasks, there is a priori knowledge about the invariances related to the task. For instance, in image classification, it is known that the label of a given image should not change after a small translation or rotation. In a second embodiment of the method for selecting kernels for kernel machines, prior knowledge, such as known transformation invariances or the noise distribution, is incorporated in the kernels to improve performance. A technique is disclosed in which noise is represented as vectors in a feature space. The noise vectors are taken into account in constructing invariant kernel classifiers to obtain a decision function that is invariant with respect to the noise.
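One simple way to obtain such invariance, offered here only as a sketch under stated assumptions, is to remove the components of each pattern that lie along the noise vectors before evaluating a linear kernel. The passage above does not specify the construction; the projection approach, the use of the input space as the feature space, and all function names below are assumptions made for illustration.

```python
def dot(u, v):
    """Inner product of two real vectors of equal length."""
    return sum(a * b for a, b in zip(u, v))


def remove_components(x, basis):
    """Subtract from x its components along each orthonormal basis vector."""
    w = list(x)
    for b in basis:
        c = dot(w, b)
        w = [wi - c * bi for wi, bi in zip(w, b)]
    return w


def orthonormalize(vectors, tol=1e-12):
    """Gram-Schmidt orthonormalization; near-zero residuals are discarded."""
    basis = []
    for v in vectors:
        w = remove_components(v, basis)
        norm = dot(w, w) ** 0.5
        if norm > tol:
            basis.append([wi / norm for wi in w])
    return basis


def invariant_linear_kernel(x, y, noise_basis):
    """Linear kernel evaluated after projecting out the noise directions."""
    return dot(remove_components(x, noise_basis),
               remove_components(y, noise_basis))


# Assumed example: noise acts only along the third coordinate.
noise_basis = orthonormalize([[0.0, 0.0, 1.0]])
x = [1.0, 2.0, 3.0]
y = [4.0, 5.0, 6.0]
x_noisy = [1.0, 2.0, 3.0 + 7.5]  # x shifted along the noise vector

k_clean = invariant_linear_kernel(x, y, noise_basis)
k_noisy = invariant_linear_kernel(x_noisy, y, noise_basis)
```

In this sketch `k_clean` and `k_noisy` agree, so any decision function built from this kernel is unchanged when a pattern is displaced along a noise vector.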