The present invention deals with natural language understanding. More specifically, the present invention deals with annotating training data for training a natural language understanding system.
Natural language understanding is a process by which a computer user provides an input to a computer in a natural language (such as through a textual input, a speech input, or some other interaction with the computer). The computer processes that input and generates an understanding of the intentions that the user has expressed.
In order to train conventional natural language understanding systems, large amounts of annotated training data are required. Without adequate training data, the systems are inadequately trained and performance suffers.
However, conventional systems rely on manual annotation to generate annotated training data. This approach suffers from a number of major drawbacks. Manual annotation can be expensive, time consuming, monotonous, and prone to error. In addition, even correcting annotations can be difficult: if the annotations are nearly correct, the remaining errors are quite difficult to spot.