A speech recognition system outputs one or more recognition results, depending on its application and recognition performance. In general, a speech recognition system with excellent recognition performance outputs only the single result that has the highest likelihood with respect to the input utterance. In contrast, a speech recognition system with poorer recognition performance provides a list of several candidate outputs so that the user can select the correct answer.
The few candidates that are actually output, selected from all the candidates that could be output, are referred to as the N-best list, and the number of candidates output is determined by the specification and application of the system.
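As an illustration of N-best selection, the following minimal Python sketch picks the N highest-scoring candidates from a scored candidate set. The function name `n_best`, the candidate words, and the log-likelihood scores are all hypothetical and are not taken from any particular recognizer; the sketch assumes only that a higher score means a better match to the input utterance.

```python
import heapq
from typing import List, Tuple

# A hypothesis is a (word, score) pair; a higher score is assumed to mean
# a better match to the input utterance (e.g. a log-likelihood).
Hypothesis = Tuple[str, float]


def n_best(hypotheses: List[Hypothesis], n: int) -> List[Hypothesis]:
    """Return the n highest-scoring hypotheses, best first."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return heapq.nlargest(n, hypotheses, key=lambda h: h[1])


# Hypothetical candidates with made-up scores, for illustration only.
candidates = [
    ("word_a", -12.5),
    ("word_b", -10.1),
    ("word_c", -15.0),
    ("word_d", -11.3),
]

top1 = n_best(candidates, 1)  # single-output system
top3 = n_best(candidates, 3)  # system that shows a 3-best list to the user
```

With N = 1 the sketch behaves like a system that outputs a single result; a larger N yields the kind of list from which a user would pick the correct answer.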
In an existing speech recognition system that provides N-best output lists, the user views the output results and determines whether each result is the correct answer. That is, the existing system does not provide an utterance verification technique but leaves verification of the utterance to the user.
When the entries of the N-best output list have no relation to one another, the list is not provided as a vocabulary set similar to the utterance data of the user. When the entries are related to one another, on the other hand, the list is provided as a vocabulary set similar to the utterance data of the user.
The following Example 1 shows 10-best recognition results for a user's utterance “poongmin mok-yok-tang”.