After many years of technical development, text-related speech assessment has entered a practical stage. Text-related speech assessment means that a user reads a given text aloud, and a speech assessment system stores the user's pronunciation data and assesses that data to give an assessment score.
In an existing speech assessment system, audio recording is generally controlled manually by the user. That is, recording starts when the user clicks a preset start-recording button and ends when the user clicks a preset end-recording button. This form of control requires the user to click manually many times, so the operation is cumbersome and the user experience suffers.
Therefore, a method of automatic audio record control has appeared in the prior art. In this method, the speech assessment system automatically detects whether the user's recording is in a pronunciation state or a mute state, and determines that the recording has ended when the duration of the user's mute state exceeds a preset time threshold. However, if the time threshold is set too short, a normal pause in the user's pronunciation may be judged to be the endpoint of the recording, and the user's speech is truncated. For this reason, the prior art generally sets the time threshold to a larger value, for example 2 seconds or longer. As a result, after finishing the pronunciation, the user has to wait a long time before the speech assessment system identifies the endpoint and ends the recording. The efficiency with which the system identifies the endpoint of the recording is thereby reduced, the efficiency of the speech assessment decreases, and the user experience suffers.
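The trade-off described above can be illustrated with a minimal sketch of a fixed mute-duration threshold. It is not taken from any particular speech assessment system: the function name `find_endpoint`, the 100 ms frame length, and the per-frame speech/mute flags are all assumptions made for illustration, and the voice activity detection that would produce those flags is taken as given.

```python
from typing import Optional, Sequence


def find_endpoint(
    is_speech: Sequence[bool], frame_ms: int, mute_threshold_ms: int
) -> Optional[int]:
    """Return the index of the frame at which recording would end, or None.

    Hypothetical helper: is_speech holds one flag per audio frame
    (True = pronunciation, False = mute), assumed to come from some
    voice activity detector that is not shown here.
    """
    mute_ms = 0
    for i, speech in enumerate(is_speech):
        if speech:
            mute_ms = 0  # any pronunciation resets the mute timer
        else:
            mute_ms += frame_ms
            if mute_ms > mute_threshold_ms:
                return i  # mute has lasted longer than the threshold
    return None


# Assumed utterance in 100 ms frames: 1 s of speech, a 0.5 s natural pause,
# 1 s of speech, then 3 s of trailing silence.
frames = [True] * 10 + [False] * 5 + [True] * 10 + [False] * 30

short = find_endpoint(frames, frame_ms=100, mute_threshold_ms=300)
long_ = find_endpoint(frames, frame_ms=100, mute_threshold_ms=2000)

print(short)  # 13: recording ends inside the pause, before speech resumes at frame 15
print(long_)  # 45: recording ends 2.1 s after the last speech frame (index 24)
```

With the short threshold the natural pause is mistaken for the endpoint and the second half of the utterance is lost; with the 2-second threshold the utterance is kept whole, but the user waits more than 2 seconds after finishing.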