1. Field of the Invention
This invention relates to broadcast systems using text-to-speech (TTS) conversion.
2. Description of the Prior Art
The invention is applicable to broadcast transmission and to various types of broadcast signal receiver, such as a television receiver or a mobile telephone handset. A problem will be described below in the context of television receivers merely in order to explain the technical background of the invention.
Television receivers have been proposed which make use of TTS conversion to assist blind or partially-sighted users. Two examples are disclosed in GB-A-2 405 018 and GB-A-2 395 388. In these examples, TTS techniques are used to reproduce data such as electronic programme guide (EPG) data and teletext data in an audible form.
EPG data in this context means programme listings provided in advance by the broadcaster, to allow a user to select a programme for viewing and/or recording, and data defining a current and a next programme being broadcast on a particular channel. Teletext data refers to textual data provided by the broadcaster as part of an information service. Examples of teletext data might include pages of news text, weather information, cinema listings and the like. All of these data have features in common: they are normally made available to the user by displaying the text on the television screen, and in practical terms they have an unlimited lexicon (vocabulary; set of available words). It is this feature of an unlimited lexicon that can cause difficulties for a TTS system.
TTS techniques rely either on replaying pre-recorded voices relating to the words to be converted into speech by the TTS device, or on building full words from sub-elements of pronunciation known as phonemes. Phonemes are the basic units of speech sound, and represent the smallest phonetic units in a language that are capable of expressing a difference in meaning. TTS systems use sets of rules to generate successions of phonemes from the spellings of words to be converted into speech. In languages such as English, which contain many irregular pronunciations, these rules can be complex, especially when similar spellings have different pronunciations (for example: the set of characters “ough” in the English words “through”, “though”, “cough”, “rough”, “plough”, “ought”, “borough”, “lough” etc., all of which have different pronunciations of those four characters). But despite these complications, TTS systems based on phonemes or on pre-recorded voices are generally arranged to cope with the complexities of words that are known in advance to the system designers.
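By way of illustration only, the rule-based generation of phonemes described above can be sketched as a lookup in an exception lexicon followed by a fallback to spelling rules. The sketch below is not taken from any particular TTS product: the names EXCEPTIONS, MULTI, SINGLE and to_phonemes are hypothetical, the rule set is deliberately tiny, and the phoneme labels are informal symbols chosen for the example.

```python
# Hypothetical exception lexicon: words known in advance to the designers,
# stored with an explicit phoneme sequence (informal labels, for illustration).
EXCEPTIONS = {
    "through": ["TH", "R", "UW"],
    "though": ["DH", "OW"],
    "cough": ["K", "AO", "F"],
    "rough": ["R", "AH", "F"],
}

# Toy multi-letter spelling rules, tried before single letters.
# The "ough" rule is only a default guess and is wrong for many words.
MULTI = [
    ("ough", ["AH", "F"]),
    ("th", ["TH"]),
    ("sh", ["SH"]),
]

# Toy single-letter fallback (assumed mapping, far from complete English rules).
SINGLE = {
    "a": "AE", "b": "B", "c": "K", "d": "D", "e": "EH", "f": "F", "g": "G",
    "h": "HH", "i": "IH", "j": "JH", "k": "K", "l": "L", "m": "M", "n": "N",
    "o": "AA", "p": "P", "q": "K", "r": "R", "s": "S", "t": "T", "u": "AH",
    "v": "V", "w": "W", "x": "K", "y": "Y", "z": "Z",
}


def to_phonemes(word):
    """Return (phoneme list, source), where source is 'lexicon' or 'rules'."""
    key = word.lower()
    if key in EXCEPTIONS:
        return list(EXCEPTIONS[key]), "lexicon"

    phonemes = []
    i = 0
    while i < len(key):
        for pattern, phones in MULTI:
            if key.startswith(pattern, i):
                phonemes.extend(phones)
                i += len(pattern)
                break
        else:
            # No multi-letter rule matched: fall back to a single letter,
            # silently skipping characters that have no mapping.
            phone = SINGLE.get(key[i])
            if phone is not None:
                phonemes.append(phone)
            i += 1
    return phonemes, "rules"


if __name__ == "__main__":
    for w in ["though", "plough", "Spts"]:
        print(w, to_phonemes(w))
```

In this sketch a word held in the lexicon, such as “though”, is converted correctly, whereas a word outside the lexicon, such as “plough”, receives the default “ough” guess, and an abbreviation such as “Spts” is reduced to a bare run of consonants.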
However, it is practically impossible to predict in advance what words will appear in EPG data, teletext data and the like. For example, a broadcaster may introduce an abbreviation (for example “Spts” for a “sports” channel). In another example, a name of a programme presenter or a personality in the news may move into common use but might not normally have been included in the lexicon of a TTS system—for example “George Papandreou”, “Lembit Opik”, “Albus Dumbledore”.
The Adobe® Captivate 4 TTS system provides a facility to customise TTS pronunciations, in which the user rewrites a difficult-to-pronounce word in a more phonetic form which the TTS system can recognise and pronounce. But in the context of TTS conversion of EPG or teletext data, this arrangement would be of little use to a phoneme-based TTS system. Firstly, the EPG or teletext data is transient; the user might access it once only, and so the user would not choose to spend time designing and entering a replacement phonetic spelling to assist the TTS system. Secondly, the user might not even know how a particular word (for example an abbreviation such as “Spts”) should be pronounced. Thirdly, in a system aimed at the partially sighted or blind user, it would be an undue burden to expect the user to retype replacement phonetic spellings.
The arrangement of Adobe Captivate 4 is not relevant to a TTS system based on pre-recorded pronunciations.
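Purely to illustrate the general idea of user-supplied respellings discussed above, the following sketch replaces listed words with a more pronounceable spelling before the text is passed to a TTS engine. It is a generic example and does not represent the interface or internal operation of Adobe Captivate 4; the function name apply_respellings and the table contents are hypothetical.

```python
import re


def apply_respellings(text, respellings):
    """Replace whole words found in `respellings` (keys in lower case).

    Words that are not listed are passed through unchanged.
    """
    def swap(match):
        word = match.group(0)
        return respellings.get(word.lower(), word)

    # Match runs of ASCII letters only; a real system would need broader rules.
    return re.sub(r"[A-Za-z]+", swap, text)


if __name__ == "__main__":
    # Hypothetical user-entered table: the abbreviation from the text above.
    table = {"spts": "sports"}
    print(apply_respellings("Spts News at 10", table))
```

The sketch makes the drawbacks noted above concrete: every entry in the table has to be typed in by someone who already knows both the troublesome word and its intended pronunciation.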