1. Technical Field
The present invention relates generally to data recognition systems and, more particularly, to a system and method for automatically assuring the quality of user enrollment in a recognition system.
2. Description of Related Art
In general, user enrollment is a process in which training data is collected from a user (e.g., acoustic utterances of spoken words in a speech recognition system or handwriting data in a handwritten text recognition system) for the purpose of training the recognition system to recognize user-specific input data (e.g., spoken words or handwriting). The enrollment process typically comprises two key steps: (1) collecting training data from a user seeking enrollment in the recognition system; and (2) processing the input training data to create a user-dependent prototype (e.g., a user-specific statistical model used for decoding user-specific data). To decode input data from non-enrolled users, a recognition system applies a user-independent prototype (e.g., a user-independent statistical model trained on data from a random population of individuals).
Typically, user enrollment provides improved recognition performance for user-specific data. For instance, higher decoding accuracies and faster recognition speeds may be obtained when decoding user-specific data with a corresponding user-dependent prototype as compared to the decoding results and processing speeds obtained when decoding the user-specific data with a user-independent prototype. In some cases, however, enrollment can degrade recognition performance. For example, less than optimal decoding accuracy may be obtained in a speech recognition system if the user speaks markedly differently during enrollment from the way the user typically speaks during normal use of the recognition system, or if user enrollment is performed in an environment which is markedly different from the environment in which the recognition system is normally used. In these situations, increased recognition accuracy may be obtained by decoding the user's spoken words with the recognition system's speaker-independent prototype (as compared to using the speaker-dependent prototype).
There are other situations in which enrollment may result in degraded system performance. For instance, degraded system performance may be realized when “unsupervised enrollment” is performed. Unsupervised enrollment refers to the process of collecting whatever training data the user desires to provide. For instance, with unsupervised enrollment in speech recognition, the system will record input acoustic utterances of a user's random dictation. If the recognition system does not check (or if the user is not afforded the opportunity to determine) the accuracy of the speaker-dependent model, however, unsupervised enrollment may result in degraded performance in situations where the acoustic data collected for a specific training period is poor. On the other hand, in a “supervised enrollment”, the user will recite from a predetermined text (scripted text). Since the recognition system has a priori knowledge of the recited text, supervised enrollment ensures the closest possible match between what the user spoke and what the system assumed the user spoke.
Degraded recognition performance may also result for various reasons when the speech recognition system is embedded in a small device such as a personal digital assistant (PDA) as opposed to a desktop system. For example, memory restrictions of the PDA typically limit the amount of enrollment data that can be stored and utilized for training user-dependent prototypes. In addition, many small PDA devices have either a small display or no display at all, which makes user interaction with the PDA during enrollment difficult. These limitations may increase the chances that an enrollment will produce degraded rather than enhanced performance.
Conventional recognition systems are not configured to automatically assess (or otherwise allow the user to determine) the quality of a user enrollment. Unfortunately, if the enrollment process yields a user-dependent prototype that performs worse than the currently employed prototype, the user will subsequently be faced with decreased recognition accuracy. Therefore, there is a need for a recognition system that will automatically assure the quality of user enrollment before replacing a current prototype with a newly trained user-dependent prototype.
The present application is directed to a system and method for providing automatic quality assurance of a user enrollment in a recognition system. Advantageously, the present invention checks the quality of a new enrollment (i.e., a newly trained user-dependent prototype) before it is accepted in place of a current enrollment. This quality check is performed by decoding stored test data (collected from the user via known scripted text) using the new enrollment, comparing the decoding results of the new enrollment to the known script used to generate the test data to obtain an accuracy score for the new enrollment, and then comparing the accuracy score for the new enrollment with an accuracy score of a previous qualified enrollment (or, where there is no previous qualified enrollment, with the accuracy score of the speaker-independent model). If the decoding results of the new enrollment are acceptable, the new enrollment will be used for recognition; otherwise it will be rejected and discarded.
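By way of illustration only, the scoring part of the quality check described above might be sketched as follows. The text does not state how the accuracy score is computed; this sketch assumes a word-level accuracy derived from the edit distance between the known script and the decoded output. The names `word_accuracy` and `baseline_score` are hypothetical and do not appear in the specification.

```python
from typing import Optional, Sequence


def word_accuracy(reference: Sequence[str], hypothesis: Sequence[str]) -> float:
    """Score decoded words against the known script.

    Assumed metric (not stated in the text): 1 - edit_distance / len(reference),
    floored at 0.0, where the edit distance counts word substitutions,
    insertions and deletions.
    """
    if not reference:
        raise ValueError("reference script must not be empty")
    # prev[j] holds the edit distance between the reference words consumed
    # so far and the first j hypothesis words.
    prev = list(range(len(hypothesis) + 1))
    for i, ref_word in enumerate(reference, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hypothesis, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # reference word deleted
                            curr[j - 1] + 1,      # extra hypothesis word inserted
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return max(0.0, 1.0 - prev[-1] / len(reference))


def baseline_score(previous_qualified: Optional[float],
                   speaker_independent: float) -> float:
    """Pick the score the new enrollment is compared against.

    Uses the previous qualified enrollment's score when one exists,
    otherwise the speaker-independent model's score.
    """
    if previous_qualified is None:
        return speaker_independent
    return previous_qualified
```

For example, a script of four words decoded with one substituted word would receive a score of 0.75 under this assumed metric.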
In one aspect of the present invention, a method for assuring the quality of user enrollment in a recognition system comprises the steps of:
training a new user-dependent prototype;
computing an accuracy score for the new user-dependent prototype;
determining if the new user-dependent prototype is acceptable by comparing the computed accuracy score for the new user-dependent prototype with a previously computed accuracy score for a current user-dependent prototype; and
applying the new user-dependent prototype for recognition if the new user-dependent prototype is deemed acceptable.
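The steps recited above can be sketched, under stated assumptions, as a single gating function. The `Enrollment` record, the `train` and `score` callables, and the rule that a new prototype is acceptable when its score is at least the current score are all illustrative assumptions; the text says only that the two scores are compared.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Enrollment:
    """A prototype together with its accuracy score (hypothetical record)."""
    prototype: Any
    accuracy: float


def assure_enrollment(current: Enrollment,
                      train: Callable[[], Any],
                      score: Callable[[Any], float]) -> Enrollment:
    """Train, score, compare, and apply a new prototype if acceptable."""
    new_prototype = train()               # train a new user-dependent prototype
    new_accuracy = score(new_prototype)   # compute its accuracy score
    # Assumed acceptance rule: the new score must not fall below the
    # previously computed score of the current prototype.
    if new_accuracy >= current.accuracy:
        return Enrollment(new_prototype, new_accuracy)
    return current                        # rejected: keep the current prototype
```

Returning the unchanged `current` record on rejection means the caller never has to handle a partially applied enrollment.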
In another aspect of the present invention, the recognition system may be configured to store one or more previous enrollments so, at the user's discretion, it is possible to return to previous enrollments.
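One possible way to keep previous enrollments available for reversion is sketched below. The class `EnrollmentHistory`, its `max_previous` limit, and the choice to discard the oldest stored enrollment first are assumptions made for illustration; the text does not specify how many enrollments are stored or how they are managed.

```python
from typing import Any, List


class EnrollmentHistory:
    """Holds the current enrollment plus a bounded list of earlier ones."""

    def __init__(self, initial: Any, max_previous: int = 2) -> None:
        if max_previous < 0:
            raise ValueError("max_previous must be non-negative")
        self._current = initial
        self._previous: List[Any] = []   # oldest first, newest last
        self._max_previous = max_previous

    @property
    def current(self) -> Any:
        return self._current

    def accept(self, new_enrollment: Any) -> None:
        """Make new_enrollment current and remember the one it replaces."""
        self._previous.append(self._current)
        if len(self._previous) > self._max_previous:
            del self._previous[0]        # assumed policy: drop the oldest
        self._current = new_enrollment

    def revert(self) -> bool:
        """Return to the most recent previous enrollment, if any is stored."""
        if not self._previous:
            return False
        self._current = self._previous.pop()
        return True
```

A small bound on stored enrollments may be appropriate on memory-restricted devices such as the PDA discussed above, although the limit chosen here is arbitrary.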
These and other aspects, features and advantages of the present invention will be described and become apparent from the following detailed description of embodiments, which is to be read in connection with the accompanying drawings.