The present invention relates generally to the field of natural language processing, and in particular to developing training data for training a natural language classifier.
Chatbots, talkbots, instant messaging bots, artificial conversational entities, and the like (“chatbots”) are computer software programs designed to simulate natural language communication, conversation, and dialogue with humans and end-users. An important function of a chatbot may include understanding, interpreting, and determining an expressed intent of user input from an end-user. The user input expressing the intent may include, for example, free-form text and/or utterances. The chatbot may support and facilitate natural language communications with the end-user by correctly determining the expressed intent of the user input, so as to generate meaningful, helpful, or otherwise desired output in response to the user input. For example, a chatbot capable of sufficiently determining variously expressed intents may be applied in supporting and facilitating natural language communications with a wide range of end-users so as to respond to queries, answer questions, and provide information in response to requests, such as “I want to change the password of my system,” or “I did a mistake registering my phone number, I need to correct it.” Such intents may be expressed in many different ways by the end-users.
The chatbot may determine an expressed intent by implementing a natural language classifier (NLC) to disambiguate, understand, and interpret the expressed intent, such as by way of text classification. The NLC may be implemented in classifying, categorizing, or grouping the expressed intent with respect to a corresponding class or set of corresponding expressions. The classification may be performed with a degree of confidence represented by a confidence score. The degree to which the chatbot may correctly determine the expressed intent may depend on the accuracy of the classification by the NLC. The confidence score may reflect the accuracy or level of certainty with which the intent is classified, which may in turn depend on the level and granularity with which the NLC understands and interprets the expressed intent (“natural language comprehension”).
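By way of illustration only, the classification of an expressed intent together with a confidence score may be sketched as follows. The sketch assumes a small bag-of-words naive Bayes model, which is a stand-in chosen for brevity and is not asserted to be the NLC described herein. The class labels, the training expressions, and the function names `tokenize`, `train`, and `classify` are hypothetical.

```python
import math
from collections import Counter

# Hypothetical training data: class label -> example expressions of that intent.
TRAINING = {
    "change_password": [
        "I want to change the password of my system",
        "how do I reset my password",
    ],
    "update_phone": [
        "I need to correct my phone number",
        "I registered the wrong phone number",
    ],
}


def tokenize(text):
    """Lowercase the text and strip basic punctuation from each word."""
    words = (w.strip(".,!?").lower() for w in text.split())
    return [w for w in words if w]


def train(training):
    """Count word occurrences per class and collect the shared vocabulary."""
    counts = {
        label: Counter(t for example in examples for t in tokenize(example))
        for label, examples in training.items()
    }
    vocab = {word for counter in counts.values() for word in counter}
    return counts, vocab


def classify(text, counts, vocab):
    """Return (label, confidence) for the most probable class.

    Assumes a uniform class prior and Laplace (add-one) smoothing.
    Words outside the vocabulary are ignored.
    """
    tokens = [t for t in tokenize(text) if t in vocab]
    log_scores = {}
    for label, counter in counts.items():
        total = sum(counter.values())
        log_scores[label] = sum(
            math.log((counter[t] + 1) / (total + len(vocab))) for t in tokens
        )
    # Normalize the log scores into probabilities that sum to 1.
    top = max(log_scores.values())
    weights = {label: math.exp(s - top) for label, s in log_scores.items()}
    norm = sum(weights.values())
    confidences = {label: w / norm for label, w in weights.items()}
    best = max(confidences, key=confidences.get)
    return best, confidences[best]


counts, vocab = train(TRAINING)
label, confidence = classify(
    "I did a mistake registering my phone number, I need to correct it.",
    counts,
    vocab,
)
print(label, round(confidence, 3))
```

In this sketch the confidence score is the normalized probability of the winning class, so an input sharing few words with any class yields a score near an even split between the classes, reflecting low certainty in the classification.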