In distributed speech recognition (DSR) systems, the user may control an application on the basis of spoken control messages supplied to an automatic speech recognition (ASR) means or engine. The spoken control messages are converted by the ASR engine into text commands which are sent to the application running in a corresponding network application server (NAS) or to a subscriber terminal like a mobile station (MS) from which the spoken control messages have been received.
The basic function of a distributed speech recognition system in the context of mobile applications is to enable a mobile station to provide automatic speech recognition features with the help of a high-power ASR engine or ASR server provided in the network. Accordingly, the basic function of the mobile station is to transmit an input speech command to this network ASR engine, which performs the recognition tasks and returns the results. The result can be a recognized word or command in text format. The mobile station can then use the text to perform the necessary functions.
Another function of such a system is to provide the mobile station with access to other application servers, e.g. Internet WWW (World Wide Web), email, voice mail and the like, via speech commands. A user with such a mobile station is thus able to connect to these application servers and issue speech commands. To achieve this, the mobile station transmits a speech signal (audio) to the ASR engine. The ASR engine performs speech recognition so as to obtain corresponding text commands. These text commands are returned to the mobile station. The mobile station then uses these text commands to control a corresponding network application server (NAS), which can be any server in a data network like the Internet that provides services such as WWW, email readers, voice mail and so on.
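The message flow just described, in which the mobile station relays the recognized text to the application server itself, can be sketched roughly as follows. This is only an illustrative model, not an implementation taken from the system described: the class names, the lookup-table "recognizer" and the command strings are all hypothetical stand-ins.

```python
from dataclasses import dataclass


@dataclass
class AsrEngine:
    """Hypothetical stand-in for the network ASR engine."""
    vocabulary: dict  # assumed lookup table: audio bytes -> recognized text

    def recognize(self, audio: bytes) -> str:
        # A real engine would decode speech; here we only look it up.
        return self.vocabulary.get(audio, "")


@dataclass
class NetworkApplicationServer:
    """Hypothetical stand-in for a NAS such as an email or voice mail service."""
    handlers: dict  # assumed mapping: text command -> zero-argument callable

    def execute(self, command: str) -> str:
        handler = self.handlers.get(command)
        return handler() if handler else "unknown command"


@dataclass
class MobileStation:
    asr: AsrEngine
    nas: NetworkApplicationServer

    def handle_speech(self, audio: bytes) -> str:
        # Step 1: MS sends audio to the ASR engine and receives a text command.
        text = self.asr.recognize(audio)
        # Step 2: MS itself uses the text command to control the NAS.
        return self.nas.execute(text)


ms = MobileStation(
    asr=AsrEngine({b"<audio-1>": "read email"}),
    nas=NetworkApplicationServer({"read email": lambda: "2 new messages"}),
)
reply = ms.handle_speech(b"<audio-1>")
```

In this arrangement the mobile station sits between the two network elements: the ASR engine never contacts the application server directly.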
Since the ASR engine usually runs on a platform that can also run other applications or perform other tasks, it is possible to transfer other functions to the ASR engine, such as processing the obtained text command to ascertain the required operation and contacting the relevant server. The ASR engine then transmits the information retrieved from the contacted network application server back to the mobile station. In this situation, the mobile station receives a speech input and sends it to a network ASR engine, which performs speech recognition, executes the necessary functions based on the speech commands and sends the retrieved information or results to the mobile station.
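A rough sketch of this second arrangement, in which the ASR engine itself interprets the text command and contacts the relevant server, might look as follows. The "service operation" command format, the class and function names, and the lookup-table recognizer are assumptions made purely for illustration.

```python
from typing import Callable, Dict


class AsrServer:
    """Hypothetical ASR engine that also dispatches commands to servers."""

    def __init__(self,
                 vocabulary: Dict[bytes, str],
                 servers: Dict[str, Callable[[str], str]]):
        self.vocabulary = vocabulary  # assumed lookup: audio -> text command
        self.servers = servers        # assumed mapping: service name -> server

    def recognize(self, audio: bytes) -> str:
        return self.vocabulary.get(audio, "")

    def handle_audio(self, audio: bytes) -> str:
        text = self.recognize(audio)
        # Assumed command format "<service> <operation>", e.g. "email read".
        service, _, operation = text.partition(" ")
        server = self.servers.get(service)
        if server is None:
            return "no server for command: " + repr(text)
        # The ASR engine contacts the server and relays the result to the MS.
        return server(operation)


def email_server(operation: str) -> str:
    """Hypothetical network application server for email."""
    if operation == "read":
        return "2 new messages"
    return "unsupported: " + operation


asr = AsrServer({b"<audio-1>": "email read"}, {"email": email_server})
result = asr.handle_audio(b"<audio-1>")  # what the mobile station receives
```

Here the mobile station only sends audio and receives the final result; ascertaining the operation and selecting the server both happen on the network side.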
In the following, examples of the above cases are described: