The present invention relates to a user interface for a device which provides text-to-speech synthesis.
The synthesis of human speech using electronic devices is a well-developed and widely published technology, and various commercial products are available. Typically, speech synthesis programs convert written input to spoken output by automatically generating synthetic speech; speech synthesis is therefore often referred to as “text-to-speech” (TTS) conversion.
There are several problems in speech synthesis which, as yet, have not been satisfactorily resolved. One problem is the difficulty in comprehension of the synthetic speech by a user. This problem may be exacerbated in mobile electronic devices such as mobile telephones or pagers which may have limited processing resources.
It would be desirable to improve the level of comprehension a user has of the speech output from such speech synthesizer systems.
According to one aspect of the present invention, there is provided an electronic device comprising a speech synthesizer including a loudspeaker, arranged to convert an input dependent upon punctuated text, to an audio output representative of a human vocally reproducing the text; a user input device for inputting instructions to navigate through text, between positions defined by punctuation identifiers of the text, to a desired position; and a controller arranged to control navigation to the desired position and provide the speech synthesizer with an input corresponding to a portion of the text from the desired position, in response to input navigation instructions.
Such a device provides the user with a means for navigating through text, thereby selecting desired portions to be output audibly by the speech synthesizer. Further, since the navigation is between punctuation identifiers, the portions of text are split logically, enabling the user to put individual words into context more easily. Thus, the intelligibility of the audio output to the user is improved.
The punctuation identifiers may be punctuation marks provided in the text, and/or other markers. The electronic device may use punctuation identifiers which identify the beginning of sentences, such as a full stop (period), exclamation mark, question mark, capital letter, or consecutive spaces. Alternatively, the punctuation identifiers may be marks such as a comma, colon, semi-colon, or dash, which are also used to separate words in text into logical units. Also, the input text can include special characters for this purpose. The creator of the text may, for example, use special characters to mark words which may be difficult, and thus need to be replayed, where the creator foresees intelligibility problems.
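By way of illustration only, the positions defined by punctuation identifiers might be located as in the following Python sketch. The function name `segment_starts` and the two regular expressions are assumptions made for this example; they cover only punctuation marks followed by white space, not capital letters, consecutive spaces, or special characters.

```python
import re

# Hypothetical sentence-level identifiers: '.', '!' or '?' followed by white space.
SENTENCE_BREAK = re.compile(r"[.!?]+\s+")
# Hypothetical clause-level identifiers: ',', ';' or ':' followed by white space.
CLAUSE_BREAK = re.compile(r"[,;:]\s+")


def segment_starts(text, pattern=SENTENCE_BREAK):
    """Return the character offsets at which a new portion of text begins."""
    starts = [0] if text else []
    for match in pattern.finditer(text):
        # A break at the very end of the text does not open a new portion.
        if match.end() < len(text):
            starts.append(match.end())
    return starts
```

Passing `CLAUSE_BREAK` instead of the default would split the same text into shorter logical units.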
The electronic device may comprise a display for presenting a text portion, to which the user can refer to confirm the user's understanding of the audio output.
The device may be arranged to navigate backwards through the text, thereby providing a function for repeating a portion of text. The device may respond to a repeat or backwards command input by a user, by the controller navigating backwards to a position defined by a predetermined punctuation identifier so as to repeat the portion of text from that position.
The predetermined punctuation identifier may be the first punctuation identifier in the backwards sequence or alternatively a second or further punctuation identifier in the backwards sequence. However, preferably the navigation depends on how quickly the repeat command is made after the audio output corresponding to the first punctuation identifier in the backwards sequence. According to such an embodiment, the device may determine this based on the length of text and/or the length of time for audible reproduction of the text between the current position and the position defined by the first punctuation identifier in the backwards sequence. If the length is below a threshold (such as five words, for example, or two seconds), the controller is arranged to navigate backwards to a position defined by the second punctuation identifier in the backward sequence.
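The threshold behaviour described above might be sketched as follows, assuming the text length is measured in words and the start offsets of the text portions are already known. The function name `repeat_target` and its parameters are hypothetical; the five-word default is taken from the example threshold given above.

```python
from bisect import bisect_right


def repeat_target(text, starts, current, min_words=5):
    """Choose the offset from which to repeat after a backwards command.

    starts  -- sorted offsets of the positions defined by punctuation identifiers
    current -- offset reached by the audio output when the command arrived
    """
    # Index of the first punctuation identifier in the backwards sequence.
    i = bisect_right(starts, current) - 1
    if i < 0:
        return 0
    words_heard = len(text[starts[i]:current].split())
    # The command came soon after that identifier, so go back one more.
    if words_heard < min_words and i > 0:
        i -= 1
    return starts[i]
```

A time-based threshold could be handled in the same way by comparing the elapsed playback time with a limit such as two seconds.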
The speech synthesizer may repeat the text more slowly than a default speed. This has the advantage of further improving the comprehensibility of the repeated synthesized speech. If the device comprises a display, the default speed may be that of the display of text on the display. Alternatively, the default speed may be the normal speed of the output by the speech synthesizer.
Alternatively, or in addition to the backward navigation, the device may be arranged to navigate forwards through the text. In this way, it can jump forwards past a portion of the text. The device responds to a forward or skip command input by a user, by the controller navigating forwards to a position defined by a predetermined punctuation identifier, so as to skip the portion of text between the current position and that position. In other words, it jumps to provide an audio output from the position defined by that predetermined punctuation identifier.
The predetermined punctuation identifier may be the first punctuation identifier in the forward sequence or alternatively a second, or a further, punctuation identifier in the forward sequence. However, preferably the navigation depends on how soon the audio output corresponding to the next punctuation identifier would occur in the absence of the skip command. According to such an embodiment, the device may determine this based on the length of text and/or the length of time for audible reproduction of the text between the current position and the position defined by the first punctuation identifier in the forward sequence. If the length is below a threshold, the controller is arranged to navigate forwards to a position defined by a second punctuation identifier in the forward sequence.
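A corresponding sketch for the forward direction is given below, again assuming a word-count threshold. The function name `skip_target` is hypothetical, and the five-word default is an assumption borrowed from the backward example, since no value is specified for forward navigation.

```python
from bisect import bisect_right


def skip_target(text, starts, current, min_words=5):
    """Choose the offset to jump to after a forward (skip) command.

    Returns None when no punctuation identifier lies ahead of current.
    """
    # Index of the first punctuation identifier in the forward sequence.
    j = bisect_right(starts, current)
    if j >= len(starts):
        return None
    words_left = len(text[current:starts[j]].split())
    # The next identifier would be reached shortly anyway, so skip one further.
    if words_left < min_words and j + 1 < len(starts):
        j += 1
    return starts[j]
```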
There are a number of ways in which a user can input instructions. In one embodiment, the user may input instructions via a user input comprising a key means. The key means may be a user-actuable device such as a key, a touch screen of the display, a joystick or the like. The key means may comprise a dedicated instruction device. If the device provides for forward and backward navigation, then it may comprise separate dedicated navigation instruction devices. That is, one for forward navigation, and one for backward navigation.
The control means may determine the number of device actuations and determine the position of the punctuation identifier associated with that number of actuations. For example, pressing the dedicated key associated with the backward navigation instruction twice could cause the device to navigate to the position of the second punctuation identifier back.
Alternatively, the position of the punctuation identifier may be determined based on the length of time for which the dedicated key is depressed.
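As a rough sketch of both options for a dedicated backward key, the number of presses, or an equivalent count derived from the hold time, might select the punctuation identifier as follows. The function names and the 0.5 second step per identifier are assumptions made for illustration only.

```python
from bisect import bisect_right


def target_after_presses(starts, current, presses):
    """Offset of the punctuation identifier `presses` back from current."""
    i = bisect_right(starts, current) - 1  # first identifier back
    # Each further press moves one identifier further back, stopping at the start.
    return starts[max(i - (presses - 1), 0)]


def presses_from_hold(seconds, step=0.5):
    """Convert a key hold time into an equivalent number of presses."""
    # Hypothetical rule: one extra identifier for every full `step` seconds held.
    return int(seconds // step) + 1
```

For example, `target_after_presses(starts, current, presses_from_hold(0.6))` would treat a key held for 0.6 seconds like two presses.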
Alternatively, the key means may comprise a multi-function key. One function of this key is selecting a navigation instruction. The navigation instruction itself may be provided by the user inputting it, or via a menu option. In either case, the multi-function key is used to select the navigation instruction.
Instead of, or in addition to the key means, the user input device may comprise a voice recognition device. Such a voice recognition device typically provides navigation instructions by way of a voice command.
The electronic device may be a document reader, a portable communications device, a handheld communications device, or the like.
According to another aspect of the present invention there is provided a portable radio communications device comprising a speech synthesizer including a loudspeaker, arranged to convert an input dependent upon punctuated text, to an audio output representative of a human vocally reproducing the text; a user input device for inputting instructions to navigate through text, between positions defined by punctuation identifiers of the text, to a desired position; and a controller arranged to control navigation to the desired position and provide the speech synthesizer with an input corresponding to a portion of the text from the desired position, in response to input navigation instructions.
The device may further comprise means for mounting in a vehicle.
According to a further aspect of the invention, there is provided a document reader comprising a speech synthesizer including a loudspeaker, arranged to convert an input dependent upon punctuated text, to an audio output representative of a human vocally reproducing the text; a user input device for inputting instructions to navigate through text, between positions defined by punctuation identifiers of the text, to a desired position; and a controller arranged to control navigation to the desired position and provide the speech synthesizer with an input corresponding to a portion of the text from the desired position, in response to input navigation instructions.
These devices may be provided in a car. If so, and if the device comprises key means, these are preferably provided on the steering wheel of the car.
According to yet another aspect of the present invention there is provided a method of navigating through text to a desired position for audio output by a speech synthesizer, the method comprising detecting instructions input by a user to navigate through text, between positions defined by punctuation identifiers of the text, to a desired position; controlling navigation to the desired position; and providing the speech synthesizer with an input corresponding to a portion of the text from the desired position.
According to a still further aspect of the present invention there is provided a method for providing speech synthesis of a desired portion of text, the method comprising determining a desired start position from a selection defined by punctuation identifiers, from an instruction input by a user; moving to the desired start position; outputting speech synthesized text from that position.
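The method steps above (determining a start position from the user's instruction, moving to it, and outputting speech from that position) might be sketched as follows. The class `TextNavigator`, the instruction strings, and the `speak` callback standing in for the speech synthesizer are all hypothetical names introduced for this example.

```python
import re
from bisect import bisect_right


class TextNavigator:
    """Minimal controller sketch: navigates between sentence starts."""

    def __init__(self, text, speak):
        self.text = text
        self.speak = speak  # stand-in for the speech synthesizer input
        # Positions defined by '.', '!' or '?' followed by white space.
        self.starts = [0] + [
            m.end() for m in re.finditer(r"[.!?]+\s+", text) if m.end() < len(text)
        ]
        self.position = 0

    def handle(self, instruction):
        """Move to the desired position and pass the text from there to speak."""
        i = bisect_right(self.starts, self.position) - 1
        if instruction == "back":
            i = max(i - 1, 0)
        elif instruction == "forward":
            i = min(i + 1, len(self.starts) - 1)
        elif instruction != "repeat":
            raise ValueError("unknown instruction: " + instruction)
        self.position = self.starts[i]
        self.speak(self.text[self.position:])
        return self.position
```

In use, `speak` could be any callable that accepts a string, such as `print` for testing or a wrapper around a real synthesizer.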