In an existing technology for detecting errors in training data, a feature is first generated from initial training data, and a training model is generated using machine learning. Training data candidates are generated by automatically attaching tags to a raw corpus using the training model. A reliability of each training data candidate is then calculated, and candidates are selected on the basis of that reliability and provided to a user. When the user corrects errors in the training data candidates through a graphical user interface and adds the corrected candidates to the training data, a new training model is generated from the newly generated training data. This new training model is used together with the existing training model to estimate an answer by voting. By repeating the above-described process, the accuracy of automatic tagging gradually increases and the training data are enhanced.
As described above, the existing technology provides a method of building additional training data on top of the initial training data, but it cannot detect errors in the initial training data itself.
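The iterative process described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the existing technology: the per-token count tagger stands in for feature generation and machine learning, the reliability-weighted vote is one possible voting scheme, the least reliable candidates are assumed to be the ones shown to the user, and the `correct` callback simulates the user's corrections through the graphical user interface. The names `train`, `predict`, `vote`, and `bootstrap` are hypothetical.

```python
from collections import Counter, defaultdict


def train(examples):
    """Fit a toy tagger: per-token tag counts (stand-in for features + ML)."""
    counts = defaultdict(Counter)
    for token, tag in examples:
        counts[token][tag] += 1
    return counts


def predict(model, token, default="O"):
    """Return (tag, reliability); reliability is the winning tag's share."""
    tags = model.get(token)
    if not tags:
        return default, 0.0  # unseen token: default tag, zero reliability
    tag, n = tags.most_common(1)[0]
    return tag, n / sum(tags.values())


def vote(models, token):
    """Estimate an answer by a reliability-weighted vote over all models."""
    ballots = Counter()
    for model in models:
        tag, reliability = predict(model, token)
        ballots[tag] += reliability
    tag, score = ballots.most_common(1)[0]
    return tag, score / len(models)


def bootstrap(initial, raw_corpus, correct, rounds=2, batch=1):
    """Grow the training data by repeated tag, select, correct, retrain."""
    training = list(initial)           # initial labels are copied as-is
    models = [train(training)]
    remaining = list(raw_corpus)
    for _ in range(rounds):
        if not remaining:
            break
        # Automatically tag the raw corpus to obtain training data candidates.
        candidates = [(i, tok, *vote(models, tok))
                      for i, tok in enumerate(remaining)]
        # Assumption: the least reliable candidates are shown to the user.
        candidates.sort(key=lambda c: c[3])
        selected = candidates[:batch]
        for _, tok, tag, _ in selected:
            training.append((tok, correct(tok, tag)))  # user-corrected tag
        picked = {i for i, _, _, _ in selected}
        remaining = [t for i, t in enumerate(remaining) if i not in picked]
        # Retrain on the enlarged data; the new model joins the vote.
        models.append(train(training))
    return training, models


# Hypothetical data: ("Seoul", "PER") is a labeling error in the initial data.
initial = [("Seoul", "PER"), ("Kim", "PER"), ("Paris", "LOC")]
gold = {"Busan": "LOC", "Lee": "PER", "Kim": "PER"}
training, models = bootstrap(initial, ["Busan", "Kim", "Lee"],
                             correct=lambda tok, tag: gold[tok])
print(training)
```

In this sketch the loop only ever appends corrected candidates to `training`; the erroneous initial pair `("Seoul", "PER")` is never revisited and is still present after the final round, which mirrors the limitation noted above.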