The present invention relates to exponential models. In particular, the present invention relates to adapting exponential models to specific data.
Exponential probability models include models such as Maximum Entropy models and Conditional Random Field (CRF) models. In Maximum Entropy models, it is common to have a set of features, which are indicator functions that have a value of one when the feature is present in a set of data and a value of zero when the feature is not present. A weighted sum of the features is exponentiated and normalized to form the maximum entropy probability.
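By way of illustration only, the computation described above may be sketched as follows. The sketch assumes a conditional form of the model, in which each indicator feature is evaluated on an input paired with a candidate label. The function name `maxent_probs`, the two example features, the weights, and the labels are hypothetical and chosen solely for this example.

```python
import math


def maxent_probs(weights, features, x, labels):
    """Return a dict mapping each label y to P(y | x).

    Assumed conditional form: each feature f(x, y) is an indicator
    returning 1 when the feature is present and 0 otherwise.
    """
    # Weighted sum of the indicator features for each candidate label.
    scores = {
        y: sum(w * f(x, y) for w, f in zip(weights, features))
        for y in labels
    }
    # Subtract the largest score before exponentiating; this leaves the
    # normalized result unchanged and avoids overflow.
    top = max(scores.values())
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    # Normalize so that the probabilities sum to one.
    z = sum(exps.values())
    return {y: e / z for y, e in exps.items()}


# Hypothetical indicator features for a toy word-labeling task.
def f_capitalized_name(x, y):
    return 1 if x[0].isupper() and y == "NAME" else 0


def f_lowercase_other(x, y):
    return 1 if x.islower() and y == "OTHER" else 0


# Hypothetical weights; in practice these would be trained.
weights = [1.5, 0.8]
features = [f_capitalized_name, f_lowercase_other]
probs = maxent_probs(weights, features, "Paris", ["NAME", "OTHER"])
```

In this sketch only the first feature is present for the input "Paris", so the label "NAME" receives the larger share of the probability mass.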
Typically, the weights for the Maximum Entropy model are trained on a large set of training data. To avoid overtraining the model weights, at least one technique of the prior art applies smoothing to preserve probability mass for unseen data.
Although training on a large set of data makes the Maximum Entropy model useful across a wide range of input data, the resulting model is not optimized for any specific type of input data.
Thus, it would be desirable to adapt Maximum Entropy models that have been trained on a large set of training data to specific sets of expected data, so that the adapted models perform better on the expected data.