When a meeting is held with foreign participants, one known system performs speech recognition on the words spoken by a participant, converts the recognition result into text, and displays the text on a screen as subtitles. Another known system outputs the parts that the speaker emphasizes in a visually recognizable form, and still another known system displays words that are considered hard for the user to recognize.