Voice processing such as voice recognition or voice synthesis requires collecting a large number of voices for purposes such as learning or evaluation. One way to collect voices is to construct a system that gathers them from a large number of operators over the Internet and rewards the operators for their work. For example, JP-A 2003-186489 discloses a voice collection system that enables an utterer to perform recording by himself or herself by displaying to the utterer the character strings to be uttered and direction information. Such a system can collect a large number of voices at low cost in both time and money.
In such a system, an operator performs the recording work by reading presented text aloud at the operator's own discretion. Thus, an operator who misreads the text may transmit the voice without reading the text aloud again, and a poor-quality voice that does not coincide with the text may be collected by the system. Using collected voices that include a large number of such poor-quality voices caused by reading mistakes degrades the accuracy of voice processing.