Computers are increasingly used to perform linguistic transformations such as translation, transliteration, transcription, speech recognition, and speech synthesis. However, algorithmic limitations often prevent an engine from generating an exact transformation in every situation. Instead, an engine may generate a set of possible interpretations with associated probability values. One common technique for improving the accuracy of an automated linguistic transformation is to use a single transformation engine to generate a list of the most likely results (an N-best list). These results are presented to a user, who is then prompted to select the correct result.
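As a minimal illustration of the N-best idea described above, the following sketch ranks a set of candidate interpretations by probability and keeps the top N. The function name `n_best`, the `Hypothesis` type, and the sample candidates and probability values are all hypothetical and chosen only for illustration; they are not taken from any particular engine.

```python
import heapq
from typing import List, Tuple

# A hypothesis pairs a candidate interpretation with its probability.
# (Illustrative type; real engines expose richer structures.)
Hypothesis = Tuple[str, float]


def n_best(hypotheses: List[Hypothesis], n: int) -> List[Hypothesis]:
    """Return the n most probable hypotheses, most likely first."""
    return heapq.nlargest(n, hypotheses, key=lambda h: h[1])


# Made-up candidates for a single spoken input.
candidates = [
    ("recognize speech", 0.46),
    ("wreck a nice beach", 0.31),
    ("recognise peach", 0.15),
    ("wreck an ice beach", 0.08),
]

# The list that would be shown to the user for selection.
for rank, (text, prob) in enumerate(n_best(candidates, 3), start=1):
    print(f"{rank}. {text} (p={prob:.2f})")
```

If N exceeds the number of available hypotheses, the sketch simply returns all of them in ranked order.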
This technique is commonly employed in the context of speech-to-speech translation. For example, a user may select the most accurate transcription of an input sentence from the N-best list generated by an automatic speech recognition (ASR) engine, and the selected transcription is then processed by a machine translation (MT) engine into a target language. However, while this N-best ASR methodology is now widely adopted to improve word recognition accuracy, it usually improves sentence recognition accuracy only marginally, and sentence recognition accuracy is much more important for enhancing speech-to-speech translation accuracy.
Another problem with current N-best approaches is the dilemma of determining how many N-best results to show to the user. Although displaying more results increases the chance that the correct result is among those shown, it also increases the time the user needs to parse the results and determine the best choice. This is particularly challenging when the correct sentence is not in the N-best list at all. In that case the user may have only seconds to pick, from a list that may contain dozens or even hundreds of entries, the incorrect sentence that is closest to the correct one. Furthermore, the N-best results are typically generated based solely on posterior probabilities; user selections are not utilized to improve performance and accuracy.
Thus, there is a need for a linguistic transformation technique that generates N-best lists with significantly improved accuracy, that keeps each list as short as possible while still including the correct result, and that adapts significantly when user feedback is provided. There is also a need for an interactive user interface that renders result linguistic representations, such as an N-best list, easier for a user to parse and process.