A conventional technology is known that converts speech uttered by a speaker into text through a speech recognition process and outputs the converted text as a caption.
In connection with this technology, to correct recognition errors made in the speech recognition process, a technology is also known that outputs a caption in which the character string of a recognition error portion has been replaced with the correct character string. In this technology, a corrector manually selects the recognition error portion in the text converted from the speech and manually inputs the correct character string for that portion, for example from a keyboard.
However, in the conventional technology described above, correcting a recognition error requires the corrector both to manually select the recognition error portion and to manually correct the character string in that portion. Correcting recognition errors has therefore been troublesome.