Conversational interfaces are known in the art. For example, certain prior art mobile devices offer a conversational interface that allows the user to request information using a spoken, natural language command. In another area of prior art, customer service telephone systems often will allow a customer to request information from a server verbally over the phone or textually using a chat window or other device, again using natural language commands. These types of conversational interfaces involve a word recognition feature, where the words that were spoken or written by a person are determined, and an intent classification feature, where the meaning of the words and the intent of the person are determined. For instance, if a user says or writes “Tell me the weather,” the receiving system needs to recognize which words were uttered, and then it needs to determine that the user was asking for information about the day's weather. After determining intent, the prior art systems obtain the requested information and provide it to the user, sometimes using synthesized speech.
These prior art conversational interfaces often rely on supervised machine learning models to perform the natural language understanding operations that determine intent. These models classify a user's intent (i.e., what the user wants the system to do) and extract entities (e.g., proper nouns) that serve as the parameters of the action the user wishes to perform. These models rely heavily on capturing the vocabulary of the target domain to produce accurate predictions, and they typically require a library containing the entire vocabulary that might conceivably be uttered by a user.
The prior art lacks any conversational interfaces for use in cyber security environments. One reason for this is that closed domains, such as cyber security, involve technical jargon and a nearly infinite number of proper nouns to capture (e.g., file names, MD5 hashes, IP addresses). For example, in a typical prior art cyber security environment, a user might type, “search process data for b58e841296be1e7a8c682622339e0cc4” to search process data for an MD5 hash. A prior art intent classifier, if used in this context, would have difficulty predicting the correct label to use for “b58e841296be1e7a8c682622339e0cc4” because that term would not be in its vocabulary. Capturing such highly diverse vocabularies is a central challenge in building performant classifiers. Attempting to capture this nomenclature in a single language model leads to extremely large models that do not generalize well outside the training environment. The resulting model fails to produce the performance (e.g., accuracy) required in a production setting and is often abandoned for a regex or direct matching solution.
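The vocabulary problem described above can be illustrated with a minimal sketch in Python. The sketch is not taken from any particular prior art system. The vocabulary `VOCAB`, the placeholder token `UNK`, and the helper functions `encode` and `find_md5` are hypothetical names chosen for illustration, and the whitespace tokenization is a simplifying assumption. The sketch shows a fixed-vocabulary encoder collapsing the MD5 hash into an unknown token, alongside the kind of regex fallback that such systems are often abandoned for.

```python
import re

# Hypothetical fixed vocabulary that a trained model might have seen.
VOCAB = {"search", "process", "data", "for", "tell", "me", "the", "weather"}
UNK = "<unk>"  # placeholder assigned to any out-of-vocabulary token

# An MD5 digest is conventionally written as 32 hexadecimal characters.
MD5_PATTERN = re.compile(r"\b[0-9a-fA-F]{32}\b")


def encode(utterance):
    """Keep in-vocabulary tokens and replace every other token with UNK."""
    return [tok if tok in VOCAB else UNK for tok in utterance.lower().split()]


def find_md5(utterance):
    """Regex fallback: return every 32-character hex string in the utterance."""
    return MD5_PATTERN.findall(utterance)


utterance = "search process data for b58e841296be1e7a8c682622339e0cc4"
tokens = encode(utterance)    # the hash is reduced to UNK, so its identity is lost
hashes = find_md5(utterance)  # the regex recovers the hash but determines no intent
print(tokens)
print(hashes)
```

In this sketch the encoder retains the four command words but discards the hash itself, while the regex recovers the hash without any notion of what the user wants done with it. Neither half alone provides both the intent and its parameter.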
What is needed is an improved conversational interface engine that is able to accurately determine a user's intent in a closed-domain environment where the user's utterance could contain one or more instances of a near-infinite number of different terms.