In recent years, call centers have placed importance on improving the quality of operators' telephone calls. The call quality of an operator is evaluated by, for example, a supervisor who manages the operators. For example, the supervisor selects the voice files of individual speeches one at a time from a graph (speech time period graph) indicating the speech time periods of the customer and the operator, and plays the selected voice files to recognize the contents of each speech and evaluate the call quality of the operator. However, when a voice recognition process has not been performed on the voice files, it is difficult to recognize, from the speech time period graph alone, the relationship between the contents of the operator's speech and the plurality of voice files, and it is therefore difficult to predict which portion of the operator's speech contains a problem.
As described above, unless the voice file of a speech is played, the supervisor is unable to easily recognize the problematic portion of the operator's speech. With regard to selecting and playing the voice files of speeches, for example, the following conventional technologies are known.
There is conventionally known a technology of playing the voice files of the question and the answer in coordination with each other (see patent document 1).
Furthermore, there is conventionally known a technology of playing a plurality of voice files without recognizing the contents of the voice files in advance (see patent document 2).
Patent document 1: Japanese Laid-Open Patent Publication No. 2006-171579
Patent document 2: Japanese Laid-Open Patent Publication No. 2007-58767
However, in the technology of patent document 1, which plays the voice files of the question and the answer in coordination with each other, the contents of the voice files are recognized, the structure of the question and answer is created, and only then are the selection and the request to play received. Therefore, a technology for recognizing the contents of the voice files must be prepared in advance, which makes the device expensive.
Furthermore, in the technology of patent document 2, which plays a plurality of voice files without recognizing the contents of the voice files in advance, a voice file is created for each speech, and the plurality of voice files included in a specified time period are played. Therefore, the problematic portion of the operator's speech is not easily recognized.
As described above, the supervisor has been unable to easily find the problematic portion of a call in which the operator attends to a customer. To supervise the operator appropriately, the supervisor needs to recognize the problematic portion of the operator's speech. However, unless the voice file of the speech is played, the supervisor does not know the specific contents of the call.
Furthermore, even when the voice files of the speeches are played, the supervisor has to listen to the call sequentially from the beginning, because the supervisor does not know where in the call the problem lies. This method is time-consuming and inefficient. Thus, there have been cases where the supervisor is unable to sufficiently confirm the contents of the call and is consequently unable to give appropriate supervision.