User interfaces for electronic and other devices are evolving to include speech-based inputs in a natural language such as English. A user may voice a command to control the operation of a device such as a smartphone, tablet computer, personal computer, appliance, television, robot, and the like. Natural language processing, which often relies on statistical machine learning techniques, may be used to interpret and act upon speech inputs. Speech recognition may convert the speech input to text. The text may then be analyzed for meaning to determine the command to be performed.
Processing speech inputs in a natural language may be difficult because speech commands may be ambiguous and require clarification. More than one speech input may be used or even required to complete a specific command. Thus, sequential speech inputs may be related to one specific command or to different commands.
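As an illustration only, the relationship between sequential inputs and commands may be sketched as follows. The sketch assumes that speech recognition has already produced text, and it substitutes simple keyword matching for natural language analysis. The command table, the contact list, and all names in it (`COMMANDS`, `CONTACTS`, `Dialog`, `handle`) are hypothetical and are not taken from any particular system.

```python
import re
from dataclasses import dataclass, field
from typing import Optional

# Hypothetical command table: a trigger word and the details each command needs.
COMMANDS = {
    "set_alarm": {"trigger": "alarm", "slots": ["time"]},
    "call": {"trigger": "call", "slots": ["contact"]},
}
CONTACTS = {"alice", "bob"}  # hypothetical address book
TIME_PATTERN = re.compile(r"\b(\d{1,2}(?::\d{2})?\s?(?:am|pm))\b")


def detect_command(text: str) -> Optional[str]:
    """Return the name of the command whose trigger word appears, if any."""
    words = re.findall(r"[a-z]+", text)
    for name, spec in COMMANDS.items():
        if spec["trigger"] in words:
            return name
    return None


def extract_slots(text: str) -> dict:
    """Pull out any details (time, contact) mentioned in the text."""
    slots = {}
    match = TIME_PATTERN.search(text)
    if match:
        slots["time"] = match.group(1)
    for word in re.findall(r"[a-z]+", text):
        if word in CONTACTS:
            slots["contact"] = word
    return slots


@dataclass
class Dialog:
    pending: Optional[str] = None          # command still awaiting details
    slots: dict = field(default_factory=dict)

    def handle(self, utterance: str) -> str:
        text = utterance.lower()
        command = detect_command(text)
        if command is not None:
            # The input names a command, so treat it as a different command.
            self.pending, self.slots = command, {}
        elif self.pending is None:
            return "unrecognized"
        # Otherwise the input is treated as clarifying the pending command.
        needed = COMMANDS[self.pending]["slots"]
        found = extract_slots(text)
        self.slots.update({k: v for k, v in found.items() if k in needed})
        missing = [s for s in needed if s not in self.slots]
        if missing:
            return f"need {missing[0]}"
        args = ", ".join(f"{s}={self.slots[s]}" for s in needed)
        result = f"{self.pending}({args})"
        self.pending, self.slots = None, {}
        return result


dialog = Dialog()
print(dialog.handle("Set an alarm"))  # ambiguous: a time is still needed
print(dialog.handle("7:30 am"))       # second input completes the same command
print(dialog.handle("Call Bob"))      # third input is a different command
```

In this sketch, the first two inputs relate to one command and the third relates to a different command. A practical system would likely need far more robust analysis than keyword matching to make that distinction.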
When speaking to conventional speech recognition systems, users often feel the need to modify their natural way of speaking so that the machine may understand their intention. This can be cumbersome and annoying, which may cause users to abandon such a system.