A typical user has a multitude of communication devices available for accessing the Internet to satisfy various needs for information or content. These communication devices include, but are not limited to, desktops, laptops, smartphones and tablets. Each of these devices is associated with a user interface that enables the user to input information into the communication device and to receive information from it.
For example, a laptop may be associated with a keyboard and a trackpad to enable the user to input information into the laptop, and a screen to enable the user to obtain information from the laptop. By the same token, a typical tablet may have a unified input/output interface, such as a touchscreen. The touchscreen typically provides the user with a virtual keyboard to enable the user to input information into the tablet by touching a respective area on the touchscreen. The touchscreen is also used to output information to the user. Some smartphones also use the touchscreen as a means for input and output of information. Some smartphones have both a physical keyboard and a touchscreen to provide the user with a choice of means for entering information into the smartphone.
There can be numerous circumstances in which it is not convenient for the user to use the typical input/output means. For example, when the user is driving and wishes to interact with his smartphone, using the touchscreen may not be the most convenient (or even the safest) mode of interaction. As such, several technologies for voice entry and output of information have been proposed. For example, it is known to implement a solution in which the user can interact with the electronic device using voice (i.e. spoken) commands. The electronic device then performs speech-to-text recognition and executes the user command.
For example, the user may say “Shuffle all songs” to his iPhone™ smartphone (using the Siri™ application). Responsive to the receipt of such a command, the smartphone is operable to play music stored therein in a random order (i.e. to randomly select songs or “shuffle” the available songs). It is also known for a given electronic device to output information via a synthesized voice interface. For example, it is known for the given electronic device to execute a text-to-speech conversion to “read out” the content of an e-mail (or another electronic document) to the user.
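By way of illustration only, the step in which an already-recognized command is executed can be sketched as follows. This is a minimal sketch and not the implementation of any particular product: the speech-to-text stage is assumed to have already produced a text transcript, and the names COMMANDS, shuffle_all_songs and execute_command are hypothetical.

```python
import random

def shuffle_all_songs(library, rng=None):
    """Return the songs of `library` in a random order, as a new list."""
    if rng is None:
        rng = random.Random()
    playlist = list(library)      # copy, so the stored library is left untouched
    rng.shuffle(playlist)
    return playlist

# Hypothetical command table: normalized transcript -> handler function.
COMMANDS = {"shuffle all songs": shuffle_all_songs}

def execute_command(transcript, library, rng=None):
    """Dispatch recognized text to a handler; return None if the command is unknown."""
    # Normalize case, surrounding punctuation and internal whitespace.
    key = " ".join(transcript.lower().strip(" .!?").split())
    handler = COMMANDS.get(key)
    if handler is None:
        return None
    return handler(library, rng)
```

In this sketch an unrecognized transcript simply yields None, leaving it to the caller to decide how to respond, for instance by prompting the user to repeat the command.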
Some solutions enable the user to fully interact with the electronic device using voice. For example, the Siri™ application for the iPhone™ smartphone enables the user to interact with several applications of the smartphone using voice. Upon the user activating the Siri application, the smartphone enters a “listening” mode, in which the application awaits a voice-based command from the user. Some of the known interactions include the following examples.
A given user may have entered a voice command: “Set an appointment for 11:30 am”. Responsive to such a command, the Siri application responds in a synthetic voice: “I will set up an appointment for tomorrow at 11:30 am. Shall I schedule it?” The user can then confirm the appointment, and the application will create an entry in the Calendar application for 11:30 am.
Another user may have entered a voice command: “When is my next meeting?” Responsive to this voice command, the application provides a voice response: “Your next meeting is at 14:30 today”.
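The two calendar interactions above can be approximated, purely as an illustration, by the following sketch. It assumes that the spoken command has already been converted to text, that the requested time is stated in a 12-hour "h:mm am/pm" form, and that the calendar is a plain list of datetime values. The names TIME_RE, parse_appointment and next_meeting are hypothetical and do not reflect how any actual application is implemented.

```python
import re
from datetime import datetime, timedelta

# Matches a 12-hour time such as "11:30 am" inside recognized text.
TIME_RE = re.compile(r"\b(\d{1,2}):(\d{2})\s*(am|pm)\b", re.IGNORECASE)

def parse_appointment(transcript, now):
    """Return the next datetime matching the time in `transcript`, or None."""
    match = TIME_RE.search(transcript)
    if match is None:
        return None
    hour, minute = int(match.group(1)), int(match.group(2))
    if not (1 <= hour <= 12 and minute < 60):
        return None
    # Convert to a 24-hour clock: 12 am -> 0, 12 pm -> 12.
    hour = hour % 12 + (12 if match.group(3).lower() == "pm" else 0)
    when = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if when <= now:
        when += timedelta(days=1)  # time already passed today, so use tomorrow
    return when

def next_meeting(calendar, now):
    """Return the earliest calendar entry strictly after `now`, or None."""
    upcoming = [entry for entry in calendar if entry > now]
    return min(upcoming) if upcoming else None
```

Under these assumptions, a request for 11:30 am made in the afternoon resolves to 11:30 am on the following day, which is consistent with the "tomorrow at 11:30 am" confirmation in the first example.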
Voice-recognition-based interfaces present a multitude of problems that the industry has tried to solve, some with more success than others. For example, managing ambient noise is one of the concerns when performing voice recognition. Another concern is dealing with each user's accent, especially without the proper “machine training” routines that are traditionally used on desktop applications but not on mobile applications.