This invention relates to scalable label recognition of linguistic input, and more particularly to scalable intent recognition of speech utterance user input.
There are many applications of automated interpretation of linguistic input in which an input must be mapped to one of a known set of labels or categories. One form of classification makes use of a fixed-length numerical vector representation of the linguistic input, for example, with each position in the vector representing a particular word, word sequence, or class of words. A parameterized classifier, for instance an Artificial Neural Network (ANN) that is parameterized by numerical weights, is configured using examples of the vector inputs and corresponding correct labels. This configuration generally optimizes the numerical parameters in a procedure often referred to as “learning” or “training.”
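As an illustrative sketch only, the following shows the kind of classification described above: each utterance is mapped to a fixed-length word-count vector (one position per vocabulary word), and a parameterized classifier is configured from examples paired with their correct labels. A single-layer softmax classifier stands in here for the Artificial Neural Network mentioned above. The example utterances, the label names, and all function names are hypothetical and chosen only for illustration.

```python
import math

# Hypothetical training examples: (utterance, correct label).
EXAMPLES = [
    ("play some jazz music", "PlayMusic"),
    ("play the next song", "PlayMusic"),
    ("what is the weather today", "GetWeather"),
    ("will it rain tomorrow", "GetWeather"),
    ("set a timer for ten minutes", "SetTimer"),
    ("start a timer", "SetTimer"),
]
LABELS = ["PlayMusic", "GetWeather", "SetTimer"]


def build_vocab(utterances):
    """Assign each distinct word one position in the fixed-length vector."""
    vocab = {}
    for text in utterances:
        for word in text.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab


def vectorize(text, vocab):
    """Word-count vector; words outside the vocabulary are dropped."""
    vec = [0.0] * len(vocab)
    for word in text.lower().split():
        if word in vocab:
            vec[vocab[word]] += 1.0
    return vec


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def score(x, weights, bias):
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def train(examples, labels, vocab, epochs=200, lr=0.5):
    """Optimize the numerical weights by stochastic gradient descent
    on the cross-entropy loss ("learning" or "training")."""
    weights = [[0.0] * len(vocab) for _ in labels]
    bias = [0.0] * len(labels)
    data = [(vectorize(t, vocab), labels.index(y)) for t, y in examples]
    for _ in range(epochs):
        for x, target in data:
            probs = softmax(score(x, weights, bias))
            for k in range(len(labels)):
                grad = probs[k] - (1.0 if k == target else 0.0)
                bias[k] -= lr * grad
                for j, xi in enumerate(x):
                    if xi:
                        weights[k][j] -= lr * grad * xi
    return weights, bias


def classify(text, vocab, labels, weights, bias):
    scores = score(vectorize(text, vocab), weights, bias)
    return labels[scores.index(max(scores))]


VOCAB = build_vocab(text for text, _ in EXAMPLES)
WEIGHTS, BIAS = train(EXAMPLES, LABELS, VOCAB)
print(classify("play music", VOCAB, LABELS, WEIGHTS, BIAS))
```

In this sketch the set of labels is fixed when the weights are created, so the classifier has no output for a label that was absent during configuration.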
When a new label is to be introduced into the system, one approach is to augment the examples of the vector inputs and the corresponding correct labels with examples for the new label, and then to repeat the configuration procedure. However, such an approach is not possible without the additional examples, and may not be effective with only a limited number of examples. Therefore, there is a need for a procedure that permits generalization of a configuration determined from an initial set of labels in order to introduce a new label with few, if any, training examples.
One application of classification of a linguistic input is in a spoken language understanding system in which a user may utter commands related to different application domains (referred to herein as “skills”) or related to different actions or commands within a domain (referred to herein as “intents”). Skill classification and/or intent classification are examples where the set of labels may need to be expanded after an initial configuration of a classifier without necessarily having suitable training examples available to reconfigure the system.