Many systems allow users to enter queries and/or questions (hereafter referred to collectively as questions) for a variety of purposes, such as search, providing instructions to a digital assistant, interacting with a chat bot, and so forth. As the user types in the question, the system can monitor the keystrokes and provide a list of suggestions on how the question can be completed. When the user sees the question they are trying to type, or one that they like, they can scroll down and press ‘enter’ or otherwise indicate that it is the question they would like to submit. The goal of autocomplete systems is to eliminate work on the part of the user (i.e., to save the user from having to type in the complete question).
Modern autosuggest or autocomplete systems are built for a task-specific style of language, such as that of web search, which differs from natural language, and typically rely on having seen millions of examples of previous queries. These memorized queries are retrieved using a matching process as the user types, and the next word or phrase is predicted based on probabilities derived from the memorized queries.
Modern autocomplete systems typically use a trie data structure for extreme efficiency because the suggestions must be returned within a few milliseconds to avoid disrupting the user's typing pattern. (Suggestions should not appear substantially slower than the user's keystroke speed.) A trie is a densely packed tree data structure in which each node contains entries. For query autocomplete purposes, the path to a node is encoded as the series of characters of the query prefix. This approach relies on memorization of past queries that can be stored in the trie, and thus cannot make suggestions when a query prefix is not already stored in the trie.
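By way of illustration only, the prefix lookup described above can be sketched in Python as follows. The names `TrieNode`, `QueryTrie`, `max_entries`, and the sample queries and counts are hypothetical and chosen for this sketch; it assumes that each node keeps a short list of the most frequent stored queries that share the prefix leading to that node, which is one possible arrangement rather than the arrangement of any particular system.

```python
from dataclasses import dataclass, field


@dataclass
class TrieNode:
    # Child nodes keyed by the next character of the query prefix.
    children: dict = field(default_factory=dict)
    # (count, query) pairs for stored queries that start with this node's prefix.
    entries: list = field(default_factory=list)


class QueryTrie:
    """Hypothetical trie that memorizes past queries with their counts."""

    def __init__(self, max_entries=3):
        self.root = TrieNode()
        self.max_entries = max_entries  # assumed cap on entries per node

    def insert(self, query, count):
        node = self.root
        for ch in query:
            node = node.children.setdefault(ch, TrieNode())
            node.entries.append((count, query))
            # Keep only the most frequent queries at each node.
            node.entries.sort(key=lambda e: (-e[0], e[1]))
            del node.entries[self.max_entries:]

    def suggest(self, prefix):
        node = self.root
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []  # prefix was never stored, so nothing can be suggested
        return [query for _, query in node.entries]


trie = QueryTrie()
for query, count in [("weather today", 90), ("weather tomorrow", 60), ("web search", 40)]:
    trie.insert(query, count)

print(trie.suggest("we"))          # ['weather today', 'weather tomorrow', 'web search']
print(trie.suggest("weather to"))  # ['weather today', 'weather tomorrow']
print(trie.suggest("wind"))        # []
```

The last call illustrates the limitation noted above: the prefix "wind" was never stored, so the lookup returns no suggestions. Each lookup walks one node per typed character, which is what keeps the response time low.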
A variant of this approach supports simple rotations of the terms in the prefix, matching rotated forms that do have entries in the trie. This extends the coverage beyond exact query prefix matching, but still relies on known queries being stored in the trie. Due to performance considerations, it is not possible to consider all possible permutations of the prefix.
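A minimal Python sketch of such a rotation-based lookup is given below for illustration. It assumes that "simple rotations" means cyclic rotations of the whitespace-separated terms of the prefix, which yields n candidates for n terms instead of n! permutations. The names `build_trie`, `completions`, `rotations`, `suggest_with_rotations`, and the stored queries are hypothetical, and the sketch treats every typed term as complete.

```python
from collections import deque

END = ""  # key marking a complete stored query; cannot collide with a character


def build_trie(queries):
    """Build a nested-dict trie from the known queries."""
    root = {}
    for query in queries:
        node = root
        for ch in query:
            node = node.setdefault(ch, {})
        node[END] = query
    return root


def collect(node, out):
    """Gather every stored query below this node in character order."""
    if END in node:
        out.append(node[END])
    for ch in sorted(k for k in node if k != END):
        collect(node[ch], out)


def completions(root, prefix):
    node = root
    for ch in prefix:
        node = node.get(ch)
        if node is None:
            return []  # this exact prefix is not stored
    found = []
    collect(node, found)
    return found


def rotations(terms):
    """Yield the cyclic rotations of the terms, starting with the original order."""
    ring = deque(terms)
    for _ in range(len(terms)):
        yield " ".join(ring)
        ring.rotate(-1)


def suggest_with_rotations(root, prefix):
    # Try the typed order first, then each rotation, and stop at the first hit.
    for candidate in rotations(prefix.split()):
        hits = completions(root, candidate)
        if hits:
            return hits
    return []


root = build_trie(["new york weather", "new york hotels", "weather radar"])

print(suggest_with_rotations(root, "weather new york"))  # ['new york weather']
print(suggest_with_rotations(root, "york new"))          # ['new york hotels', 'new york weather']
print(suggest_with_rotations(root, "york new weather"))  # []
```

The third call shows the remaining gap: "york new weather" is a permutation of a stored query but not a rotation of it, so no suggestion is produced, and the lookup still depends entirely on queries already stored in the trie.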
It is within this context that the present embodiments arise.