For many years, users interacted with computing devices using just keyboards and displays: users input information, such as commands, to the devices via keyboards, and the devices output information to users visually via the displays, largely in the form of text and limited graphics. The advent of the computer mouse enabled users to provide a different type of input to computers. Namely, the mouse allowed users to quickly and intuitively change the position of a pointer within a graphical user interface (GUI) and to ‘click’ buttons of the mouse, once or in rapid succession, to provide additional input relative to the pointer position. To accommodate widespread adoption of the mouse, GUIs also changed. For instance, GUIs changed to include objects of varying size that can be displayed at different positions and relative to which a click or clicks initiate some action, e.g., to launch an application, make a selection within an application, cause display of another GUI component such as a menu, close a GUI component such as a window, and so on.
Like computing technologies generally, user interfaces continue to evolve. There has been much work, for example, on developing “natural user interfaces” (NUIs). Broadly speaking, NUIs are systems that enable user-computer interactions through intuitive actions related to natural, everyday human behavior. Some examples of NUIs include touch interfaces, which let users interact with controls and applications more intuitively than cursor-based interfaces because touch interfaces are more direct; gesture recognition systems, which track motions of users and translate those motions into instructions; gaze-tracking interfaces, which allow users to guide a system through eye movements; and voice user interfaces (VUIs), which allow users to interact with a system through spoken commands. With regard to VUIs in particular, many conventionally configured speech recognition techniques that are used to implement VUIs fail to accurately recognize the words spoken by users. This inaccuracy can lead to confusion regarding the words actually spoken (e.g., in cases where a user is shown text indicative of a speaking user's words) and, ultimately, to user frustration with VUIs. Due to such frustration, users may simply avoid interacting with systems via VUIs.