Technical Field
The present disclosure may generally relate to a speech processing apparatus, a speech processing method, and a computer-readable medium.
Description of the Related Art
In some aspects, speech processing apparatuses may be known that extract, from a speech signal, an acoustic feature representing individuality for identifying the speaker who produced the speech and an acoustic feature representing the language conveyed by the speech. In other aspects, speaker recognition apparatuses that estimate a speaker from the speech signal using these acoustic features, and language recognition apparatuses that estimate a language from the speech signal using these acoustic features, may be known.
In a speaker recognition apparatus that uses a speech processing apparatus of this type, the speech processing apparatus may evaluate a degree of similarity between an acoustic feature extracted from a speech signal and a speaker model expressing the speaker dependency of the tendency of appearance of the acoustic feature, and may select a speaker based on the evaluation. For example, the speaker recognition apparatus may select the speaker identified by the speaker model evaluated as having the highest degree of similarity. In some instances, if a speech signal input to the speaker recognition apparatus lacks some types of sound or contains noise, distortion may occur in the acoustic feature of the speech signal. A difference may thus arise between that acoustic feature and the acoustic feature belonging to the speaker model, which may result in a decrease in the accuracy of speaker recognition.
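As a non-limiting illustration, the selection of the speaker whose model has the highest degree of similarity may be sketched as follows. The sketch assumes that each acoustic feature and each speaker model is a fixed-length vector and that cosine similarity serves as the degree of similarity. Neither assumption is stated in the description above, and the names `cosine_similarity` and `select_speaker` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector carries no direction, so treat it as dissimilar.
        return 0.0
    return dot / (norm_a * norm_b)


def select_speaker(feature, speaker_models):
    """Score the feature against every speaker model and pick the best match.

    speaker_models maps a speaker name to a model vector (hypothetical layout).
    Returns the selected speaker name and the full score table.
    """
    scores = {
        name: cosine_similarity(feature, model)
        for name, model in speaker_models.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Illustrative values only; these are not measured data.
speaker_models = {
    "speaker_a": [1.0, 0.0, 0.5],
    "speaker_b": [0.0, 1.0, 0.5],
}
feature = [0.9, 0.1, 0.4]
best, scores = select_speaker(feature, speaker_models)
```

In this sketch, a feature distorted by noise or missing sounds would shift the scores of all speaker models, which is one way the decrease in accuracy described above may arise.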
There may be a technique in which a determination criterion for speaker recognition is adjusted based on a characteristic of the speech signal input to a speaker recognition apparatus, thereby suppressing a decrease in the accuracy of the speaker recognition.
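As a non-limiting illustration, one way such an adjustment of the determination criterion could look is sketched below. The sketch assumes that the characteristic of the speech signal is an estimated signal-to-noise ratio in decibels, that the determination criterion is an acceptance threshold on the similarity score, and that the threshold is relaxed linearly as the signal becomes noisier. These choices, the function names, and all default values are hypothetical and are not taken from the technique referred to above.

```python
def adjusted_threshold(base_threshold, snr_db, clean_snr_db=30.0, max_relaxation=0.2):
    """Relax the acceptance threshold for noisy input (assumed linear rule).

    At or above clean_snr_db the base threshold is used unchanged; at 0 dB or
    below the threshold is lowered by max_relaxation.
    """
    shortfall = (clean_snr_db - snr_db) / clean_snr_db
    shortfall = min(max(shortfall, 0.0), 1.0)  # clamp to the range [0, 1]
    return base_threshold - max_relaxation * shortfall


def accept_speaker(score, snr_db, base_threshold=0.8):
    """Accept the candidate speaker if the score meets the adjusted threshold."""
    return score >= adjusted_threshold(base_threshold, snr_db)


# Illustrative values only: the same score is rejected for clean input
# but accepted for noisy input, because the criterion has been relaxed.
clean_decision = accept_speaker(0.7, snr_db=30.0)
noisy_decision = accept_speaker(0.7, snr_db=0.0)
```

A relaxed threshold trades fewer false rejections on degraded input for a higher risk of false acceptances, so the amount of relaxation would need to be chosen with that balance in mind.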