Speaker recognition consists of two major tasks: speaker identification and speaker verification. Both tasks have attracted significant and growing interest in speech technology. In speaker identification, the goal is to find the speaker in a data set who is closest to an unknown speaker. In speaker verification, an unknown speaker asserts an identity, and the task is to verify whether this assertion is true, i.e., whether the unknown speaker is the claimed speaker in the data set. This essentially comes down to comparing two speech samples and deciding whether they were spoken by the same speaker.
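The difference between the two tasks can be illustrated with a minimal sketch. It assumes, purely for illustration, that each speech sample has already been reduced to a fixed-length feature vector and that closeness is measured by cosine similarity; neither assumption is taken from the systems discussed here. The function names, the example vectors, and the 0.8 threshold are likewise hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def identify(unknown, enrolled):
    """Identification: return the enrolled speaker closest to the unknown sample."""
    return max(enrolled, key=lambda name: cosine_similarity(unknown, enrolled[name]))

def verify(unknown, claimed_name, enrolled, threshold=0.8):
    """Verification: accept the claimed identity only if the unknown sample is
    similar enough to that speaker's enrolled sample (threshold is an assumption)."""
    return cosine_similarity(unknown, enrolled[claimed_name]) >= threshold

# Hypothetical feature vectors standing in for processed speech samples.
enrolled = {
    "alice": [0.9, 0.1, 0.0],
    "bob": [0.1, 0.8, 0.3],
}
unknown = [0.85, 0.2, 0.05]

closest = identify(unknown, enrolled)            # one-to-many comparison
claim_alice = verify(unknown, "alice", enrolled)  # one-to-one comparison
claim_bob = verify(unknown, "bob", enrolled)
```

Identification compares the unknown sample against every enrolled speaker and returns the best match, whereas verification compares it against a single claimed speaker and returns an accept or reject decision.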
Recently, considerable progress has been made in speaker recognition using deep learning systems. In most of these systems, speaker recognition is performed by constructing a single neural network that encodes data from multiple speakers. The output of the neural network is either the probability of correct authentication or a direct classification of a specific user. The accuracy of the system depends on the size of the training data set for each user and on the number of users in the data set (more data improves the performance of the system). These systems, however, are very large and must be trained on all speaker data, and must therefore be retrained whenever a new speaker is added. As a result, these prior art systems are relatively slow and difficult to update with new speakers. There is therefore a need for a new system that can be trained, implemented, and altered quickly and easily.