With the globalization of culture and the economy in recent years, expectations have risen for machine translation apparatuses that support communication between people who speak different languages. At present, speech translation application software that runs on mobile terminals (for example, smartphones) and Web services that provide speech translation functions are in operation.
These speech translation systems are classified into the following two groups according to the users' communication style. The first is a speech-exchange-type system, in which a source language user speaks and a target language user hears the translation result. The second is a display-combined-use-type system, in which users read the recognized texts and the translated texts on a display to check whether their speech has been processed correctly, and then continue the dialogue.
Unfortunately, with available technology it is impossible to perform speech recognition and machine translation without error. Some feedback function is therefore needed. The feedback function shows users the recognition results and translation results, which are not necessarily exact, so that the users can restate their speech more clearly, guess the intention of the other user's speech, or ask questions.
Therefore, when users can see a display, the display-combined-use-type speech translation system is more reliable for them than the speech-exchange-type system.
The display-combined-use-type speech translation systems are further classified into the following two groups according to the users' browsing style, that is, what size of display the users see and with whom they see it.
The first is a display-share-type system, in which the users look at the same display of one terminal device together while conversing. The second is an individual-screen-type system, in which each user looks at the display of his or her own terminal while conversing.
The problem with the display-share-type speech translation system is that, when one user's terminal device is shared with the other user, it is difficult for the other user to operate the terminal device.
Consider a case where a staff member of a store and a foreign visitor to the store converse using a display-share-type simultaneous speech translation system (for example, on a tablet computer). The staff member is experienced in operating the tablet computer, but the first-time visitor is not. It is therefore difficult for the visitor to operate the tablet computer.
Similar problems exist not only in operating the display but also in inputting audio to a microphone. For example, the precision of speech recognition is influenced by the volume of the user's speech, the distance between the microphone and the user's mouth, and the way the user holds the microphone. Therefore, if the user is not experienced in using a microphone, the precision of speech recognition is likely to deteriorate.
In the above case, the store staff can input their speech to the tablet computer, whereas it is difficult for the foreign visitors to input theirs. Therefore, if the visitors can use a terminal device that they usually use (for example, a smartphone), the precision of speech recognition in the system can be expected to improve.
As explained above, conventional display-share-type systems and individual-screen-type systems are not able to solve the above-mentioned shortcomings.
In order to solve the above-mentioned shortcomings, the speech translation system is required to consider (a) the difference in display size among terminal devices, (b) whether the display can be shared, that is, whether the users see the same display together, and (c) adaptation to the user's experience with the speech input unit of the terminal device.
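As a purely illustrative aid, the three considerations (a) to (c) can be pictured as attributes recorded for each terminal device taking part in a dialogue. The following minimal Python sketch is an assumption made for explanation only: the names `TerminalProfile` and `describe_factors`, the field names, and the use of inches for display size are hypothetical and do not appear in the description above, and the sketch does not represent any particular implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TerminalProfile:
    """Hypothetical record of one terminal device used in a dialogue."""
    name: str
    display_inches: float      # (a) display size of the terminal (unit assumed)
    display_shareable: bool    # (b) can both users see this display together?
    user_is_experienced: bool  # (c) is the user accustomed to this terminal's speech input?


def describe_factors(terminals):
    """Summarize considerations (a) to (c) for a combination of terminals."""
    sizes = [t.display_inches for t in terminals]
    return {
        "display_size_gap": max(sizes) - min(sizes),
        "any_shareable_display": any(t.display_shareable for t in terminals),
        "all_users_experienced": all(t.user_is_experienced for t in terminals),
    }


# Example combination: the store's tablet computer and the visitor's own smartphone.
tablet = TerminalProfile("tablet", 10.0, True, True)
smartphone = TerminalProfile("smartphone", 4.0, False, True)
summary = describe_factors([tablet, smartphone])
```

In this example each user speaks into a terminal he or she is accustomed to, which corresponds to the situation in which improved speech recognition precision is expected.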
Particularly in recent years, personal information terminal devices in various forms (for example, smartphones and tablet computers) have been spreading quickly. A solution to the above-mentioned shortcomings that accommodates various combinations of terminals is therefore strongly desired.