Often, using a remote control to operate a set top box (STB) that controls a media device or system is difficult or inconvenient. For example, a parent holding an infant may have difficulty reaching for and using the remote control. As another example, a person eating dinner while watching a television (TV) may have difficulty reaching for and using the remote control. Further, in some instances, a person may have lost their remote control and would have to make changes to their media system using the manual controls on the STB and/or the media device or system.
In some situations, it may be physically impossible for a person to operate a remote control. For example, a person with severe physical disabilities may not have sufficient control of their hands and/or fingers to manually operate the remote control. As another example, a person in a hospital recovering from surgery may not be able to reach and/or operate the remote control.
One possible solution to the above-described problem of not being able to operate a remote control is the use of speech or voice recognition technology. However, media devices typically present sounds to the user. For example, the user may be listening to music presented on their radio or stereo. As another example, the media device may be presenting both video images and sounds, such as when a user is viewing a movie. Accordingly, the sounds emitted from the media device must be distinguished from verbal commands of the user. In many situations, the difficulty of distinguishing between sounds emitted from the media device and the verbal commands of the user renders such speech or voice recognition systems inefficient or even inoperable.