1. Field of the Invention
The present invention relates to a text information display apparatus having a speech synthesis function that converts displayed text items into speech, to a speech synthesis method for the apparatus, and to a speech synthesis program.
2. Description of the Related Art
In recent years, mobile phones that speak aloud the names of functions and the like set by key operations have been proposed as mobile terminals (see, for example, Japanese Patent Publication (A) No. 11-252216). Such a mobile phone has a plurality of key operation units, a controller that sets, from among a plurality of functions provided in the phone, the function corresponding to one or more key operations of the key operation units, and a speech synthesizer that outputs by speech the name of the function set in response to the key operations.
Further, as a system employing the speech output function, an e-mail system has been proposed that enables a sender, when sending text by e-mail, to select the speech quality to be used for converting the text to speech at the receiving side (see, for example, Japanese Patent Publication (A) No. 2004-185055).
In a mobile terminal having the above text-to-speech conversion function, the function is realized by notifying the text to be converted to the text-to-speech engine (the controller and speech synthesizer).
However, an Internet browser or other browser installed in the terminal notifies the mobile terminal side of display information for displaying the text, but does not notify it of the actual text to be converted to speech. The display information is notified with the text divided into small sections, so it cannot be passed to the text-to-speech engine as it is. Further, the text is not always notified in sequence from the top of the display; therefore, if the text is converted to speech in the sequence of notification, a coherent sentence is not obtained. Further, depending on the display style, even text on the same row may be notified with deviated coordinate values, and therefore cannot be treated as text on the same row.
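The difficulty described above can be illustrated with a minimal sketch in Python. It is an illustration only, under stated assumptions: the `Fragment` record, the sample coordinates, and the `row_tolerance` threshold are all hypothetical and do not correspond to any actual browser interface or to any particular apparatus. The sketch contrasts joining the notified fragments in the sequence of notification with joining them after sorting by coordinates while tolerating small vertical deviations within a row.

```python
from dataclasses import dataclass


@dataclass
class Fragment:
    """One drawn piece of text, as a browser might report it (hypothetical shape)."""
    x: int
    y: int
    text: str


def in_notification_order(fragments):
    """Join fragments in the order they were notified."""
    return "".join(f.text for f in fragments)


def in_display_order(fragments, row_tolerance=3):
    """Join fragments top-to-bottom, then left-to-right within each row.

    A fragment whose y value differs from the first fragment of the current
    row by at most row_tolerance is treated as lying on that row.
    The threshold value is an assumption chosen for this example.
    """
    rows = []  # each row is a list of fragments; row[0] anchors the row's y
    for frag in sorted(fragments, key=lambda f: f.y):
        if rows and abs(frag.y - rows[-1][0].y) <= row_tolerance:
            rows[-1].append(frag)
        else:
            rows.append([frag])
    return " ".join(
        "".join(f.text for f in sorted(row, key=lambda f: f.x)) for row in rows
    )


# Fragments listed in an assumed notification sequence, not in display order.
fragments = [
    Fragment(x=60, y=22, text="weather"),     # same visual row as "Today's ", y off by 2
    Fragment(x=0, y=40, text="Sunny, "),
    Fragment(x=0, y=20, text="Today's "),
    Fragment(x=56, y=41, text="18 degrees"),
]

print(in_notification_order(fragments))  # weatherSunny, Today's 18 degrees
print(in_display_order(fragments))       # Today's weather Sunny, 18 degrees
```

With a tolerance of zero, the fragments at y=20 and y=22 would be treated as separate rows, which corresponds to the case in which text on the same row cannot be treated as text on the same row.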
Further, in much content, the user presses a link in order to change screens, so in practice many links are arranged in the content. Accordingly, it is necessary to make the user recognize a link by text-to-speech conversion and, at the same time, to notify the user by text-to-speech conversion that the link has been correctly pressed. In the related art, however, a linked portion cannot be clearly notified by speech, so it is difficult for the user to recognize a screen transition made from the link.
Further, it is known to modify the browser side and add a text-to-speech interface to realize text-to-speech conversion, but even in this case, general sites (HTML etc.) cannot be displayed. Only specific sites can actually be handled.