1. Field of the Invention
This invention relates to a method and apparatus for preparing a document to be read by a text-to-speech reader. In particular, the invention relates to classifying the text elements in a document according to the voice types of a text-to-speech reader.
2. Description of the Related Art
In a number of different areas, such as voice access to the Internet, ‘reading’ textual information for the blind, and creating audio versions of newspapers, there is a significant problem in ensuring that appropriate attention can be drawn to the sections in a given document and the information they contain. One important attentional cue under such circumstances is a change of voice, for instance from male to female voice. In auditory terms, this has the effect of highlighting that something has changed in the informational content.
Machine-readable documents are a mixture of mark-up tags, paragraph markers, page breaks, lists and the text itself. The text may further use tags or punctuation marks to provide fine-grained structure and emphasis, for instance quotation marks and brackets, or a change of character weight to bold or italic. Furthermore, VoiceXML tags in a document describe how a spoken version should render the structural and informational content.
One example of such voice-type switching would be a VoiceXML home page with multiple windows and sections. Each window or section, or each line or section of a dialogue, may be explicitly identified as belonging to a specific voice.
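By way of illustration only, the explicit identification described above might be sketched as follows. The sketch assumes the SSML-style `<voice gender="...">` element that VoiceXML 2.0 permits inside a prompt; the function name `tag_sections` and the sample page content are hypothetical and do not come from any particular system.

```python
from xml.sax.saxutils import escape


def tag_sections(sections):
    """Wrap each (gender, text) pair in an SSML-style <voice> element.

    `sections` is a hypothetical list of (gender, text) tuples that a
    document designer has assigned by hand.
    """
    parts = []
    for gender, text in sections:
        # escape() protects characters such as '&' and '<' in the text.
        parts.append('<voice gender="{}">{}</voice>'.format(gender, escape(text)))
    return "<prompt>" + "".join(parts) + "</prompt>"


# Hypothetical page: a heading read in one voice, body text in another.
page = [
    ("female", "Headlines"),
    ("male", "Markets rose & fell today."),
]
markup = tag_sections(page)
print(markup)
```

Every voice assignment in `page` has to be supplied by hand in this sketch, one entry per section.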
A problem with VoiceXML pages is that the VoiceXML tags need to be inserted into a document by the document designer.
Previous methods have grouped content together to drive voice-type selection on the basis of document structure alone. In this way, tables, for example, can be read out intelligently. However, such systems do not supplement this structuring with thematic information, either to complete the groupings or to better select appropriate voice characteristics for output.