This section provides background information related to the present disclosure which is not necessarily prior art.
As speech service technology matures, many applications provide speech service functionality. A speech service system typically includes a terminal and a server.
In speech recognition, for example, a terminal sends a server a speech request that carries the speech data to be recognized. After recognizing the speech data, the server returns to the terminal a speech response carrying the recognition result. To shorten the server's response time for a speech request, streaming is a desirable mode of speech transmission. With streaming, the transmission and recognition of a speech stream are not completed by a single speech request. Instead, the entire speech stream is divided into a number of speech data segments according to certain rules, and while the user is still talking, the terminal begins sending speech requests carrying the speech data segments to the server one by one for speech recognition. The talk time thus overlaps with the time for transmitting the speech between the terminal and the server: the server begins to perform speech recognition as soon as the user begins to talk, so the response time of the server for the speech requests is significantly shortened.
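The segmentation described above can be illustrated with a minimal sketch. It is not taken from the disclosure: the fixed-size splitting rule, the request fields `seq`, `is_last` and `data`, and the function names are all hypothetical, and the delay model assumes that transfer and recognition time scale linearly with the amount of speech data and that the server keeps pace with the speaker.

```python
from typing import Dict, Iterator, List


def split_into_segments(speech: bytes, segment_size: int) -> Iterator[bytes]:
    """Yield consecutive segments of at most segment_size bytes.

    A fixed byte size is only one possible splitting rule (an assumption).
    """
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    for start in range(0, len(speech), segment_size):
        yield speech[start:start + segment_size]


def build_requests(speech: bytes, segment_size: int) -> List[Dict]:
    """Wrap each segment in a hypothetical speech request, in sending order."""
    segments = list(split_into_segments(speech, segment_size))
    last = len(segments) - 1
    return [
        {"seq": i, "is_last": i == last, "data": segment}
        for i, segment in enumerate(segments)
    ]


def delay_single_request(transfer_time: float, recognize_time: float) -> float:
    """Delay after the user stops talking when the whole utterance is sent at once."""
    return transfer_time + recognize_time


def delay_streaming(transfer_time: float, recognize_time: float,
                    n_segments: int) -> float:
    """Delay after the user stops talking when only the last segment remains.

    Assumes equal-sized segments and that earlier segments were already
    transferred and recognized while the user was still talking.
    """
    return (transfer_time + recognize_time) / n_segments


if __name__ == "__main__":
    for request in build_requests(b"abcdefghij", 4):
        print(request)
    print(delay_single_request(2.0, 1.0))
    print(delay_streaming(2.0, 1.0, 4))
```

Under these assumptions, the delay perceived after the user stops talking shrinks roughly in proportion to the number of segments, because only the final segment still has to be transferred and recognized at that point.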