Intelligent automated assistants (or virtual assistants) provide an intuitive interface between users and electronic devices. These assistants can allow users to interact with devices or systems using natural language in spoken and/or text forms. For example, a user can access the services of an electronic device by providing a spoken user input in natural language form to a virtual assistant associated with the electronic device. The virtual assistant can perform natural language processing on the spoken user input to infer the user's intent and operationalize the user's intent into tasks. The tasks can then be performed by executing one or more functions of the electronic device, and, in some examples, a relevant output can be returned to the user in natural language form.
While mobile telephones (e.g., smartphones), tablet computers, and the like have benefited from virtual assistant control, many other user devices lack such convenient control mechanisms. For example, user interactions with media control devices (e.g., televisions, television set-top boxes, cable boxes, gaming devices, streaming media devices, digital video recorders, etc.) can be complicated and difficult to learn. Moreover, with the growing number of media sources available through such devices (e.g., over-the-air TV, subscription TV service, streaming video services, cable on-demand video services, web-based video services, etc.), it can be cumbersome or even overwhelming for some users to find desired media content to consume. In addition, coarse time-shifting and cue controls can make it difficult for users to obtain desired content, such as specific moments in a television program. Obtaining timely information associated with live media content can also be challenging. As a result, many media control devices can provide an inferior user experience that is frustrating for many users.