Systems have been developed that, in meetings or the like, perform voice recognition on the speech of speakers, convert the speech into text, and display the resulting text. In such a system, pieces of input content of the speakers are displayed in time-series order, for example. Conventional technologies are described in Japanese Laid-open Patent Publication No. 2006-50500, for example.
However, the conventional system may provide poor usability. For example, when pieces of input content are displayed in time-series order and a plurality of speakers speak, the pieces of input content of each speaker are displayed in disconnected fragments, which may make the input content difficult to understand and may impair usability. When a speaker B speaks midway through the speech of a speaker A, for example, the input content of the speaker B is displayed in the middle of the input content of the speaker A, which makes the content spoken by the speaker A difficult to follow.
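The interleaving described above can be illustrated with a minimal sketch. It is not taken from the cited publication; the `Utterance` record, the `render_time_series` function, the timestamps, and the sample sentences are all hypothetical and serve only to show how a purely time-ordered display splits the speech of the speaker A around an interjection by the speaker B.

```python
from dataclasses import dataclass


@dataclass
class Utterance:
    """One recognized piece of input content (hypothetical structure)."""
    start: float   # start time in seconds from the beginning of the meeting
    speaker: str   # speaker label, e.g. "A" or "B"
    text: str      # text produced by voice recognition


def render_time_series(utterances):
    """Return display lines ordered only by start time, ignoring the speaker."""
    ordered = sorted(utterances, key=lambda u: u.start)
    return [f"{u.speaker}: {u.text}" for u in ordered]


# Speaker B interjects midway through a sentence spoken by speaker A.
recognized = [
    Utterance(0.0, "A", "The release schedule"),
    Utterance(2.0, "B", "Sorry, one question."),
    Utterance(3.5, "A", "will move to next month."),
]

for line in render_time_series(recognized):
    print(line)
```

In this sketch the two fragments of the sentence of the speaker A end up on either side of the line of the speaker B, which is the disconnected display that the passage above identifies as a usability problem.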