In a server-client speech recognition system, where to place the dictionary for speech recognition is an important design consideration. Considering that the engine performing speech recognition is provided in the server, it is reasonable to provide the dictionary for speech recognition in the server as well, where it is easily accessible from the engine. This is because a network line connecting a client terminal device (hereinafter referred to as a “client”) and the server generally has a lower data transfer speed and higher communication costs than a data bus, which is the data transmission path inside the server.
On the other hand, it is sometimes desirable to change the vocabulary for speech recognition for each client, for example to cover words that are used uniquely by that client. In such a case, it is convenient for management to store a dictionary for speech recognition including the words uniquely used by the client on the client side. Accordingly, in a server-client speech recognition system, speech recognition processing is generally performed using both a dictionary for speech recognition provided in the server and a dictionary for speech recognition provided in the client. An example of a system that performs speech recognition processing using both dictionaries has been proposed (see Patent Document 1).
The speech recognition system shown in FIG. 8 includes a client 100 having a speech recognition engine 104 and a recognition dictionary 103, and a server 110 having a speech recognition engine 114 and a recognition dictionary 113. This speech recognition system generally operates as follows. When speech is input from a speech input section 102, the client 100 refers to the recognition dictionary 103 controlled by a dictionary control section 106 and performs speech recognition processing with the speech recognition engine 104. When the speech recognition processing succeeds and a speech recognition result is obtained, the speech recognition result is output via a result integration section 107.
In contrast, when the speech recognition processing fails and the speech recognition result is rejected, the client 100 transmits the input speech data to the server 110 by a speech transmission section 105. The server 110 receives the speech data by a speech reception section 112, refers to the recognition dictionary 113 controlled by a dictionary control section 115, and performs speech recognition processing with the speech recognition engine 114. The obtained speech recognition result is transmitted to the client 100 by a result transmission section 116, and is output via the result integration section 107.
In summary, if a speech recognition result is obtained by the client itself, that result is used as the output of the speech recognition system, and if a speech recognition result cannot be obtained, the server performs speech recognition processing and its speech recognition result is used as the output of the speech recognition system.
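The control flow summarized above can be illustrated with a minimal sketch. It is not taken from Patent Document 1: the names `make_dictionary_recognizer` and `recognize_with_fallback` are hypothetical, each engine is reduced to an exact lookup of the speech data in its dictionary, and a return value of `None` stands in for a rejected recognition result.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical engine type: maps speech data to recognized text, or None when rejected.
Recognizer = Callable[[bytes], Optional[str]]


def make_dictionary_recognizer(dictionary: Dict[bytes, str]) -> Recognizer:
    """Toy stand-in for a speech recognition engine bound to one dictionary."""
    def recognize(speech: bytes) -> Optional[str]:
        # A real engine would score acoustic and language models;
        # here a missing key simply models a rejected result.
        return dictionary.get(speech)
    return recognize


def recognize_with_fallback(
    speech: bytes,
    client_engine: Recognizer,
    server_engine: Recognizer,
) -> Tuple[Optional[str], str]:
    """Try the client engine first; use the server only if the client rejects."""
    result = client_engine(speech)
    if result is not None:
        # Recognized on the client: no speech data crosses the network.
        return result, "client"
    # Rejected on the client: the speech data is sent to the server,
    # and the server's result becomes the system output.
    return server_engine(speech), "server"


# Illustrative dictionaries (assumed data, keyed by placeholder speech data).
client_engine = make_dictionary_recognizer({b"utt-1": "call home"})
server_engine = make_dictionary_recognizer({b"utt-2": "weather forecast"})

for utterance in (b"utt-1", b"utt-2"):
    text, source = recognize_with_fallback(utterance, client_engine, server_engine)
    print(utterance, text, source)
```

In this sketch the server is consulted only after the client has rejected the input, so the network line is used only for speech that the client dictionary cannot handle.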
Another example of a system that performs speech recognition processing using a dictionary for speech recognition provided in a server and a dictionary for speech recognition provided in a client has also been proposed (see Patent Document 2). The speech recognition system shown in FIG. 9 includes a client 200 having a storage section 204 storing a user dictionary 204A, speech recognition data 204B, and dictionary management information 204C, and a server 210 having a recognition dictionary 215 and a speech recognition section 214. The client 200 and the server 210 are adapted to communicate with each other via a communication section 202 on the client 200 side and a communication section 211 on the server 210 side.
This speech recognition system generally operates as follows. Prior to speech recognition processing, the client 200 transmits the user dictionary 204A to the server 210 by the communication section 202. Then, the client 200 transmits the speech data input from a speech input section 201 to the server 210 by the communication section 202. The server 210 performs speech recognition processing with the speech recognition section 214 using the user dictionary 204A received by the communication section 211 and the recognition dictionary 215 managed by a dictionary management section 212.

Patent Document 1: Japanese Patent Laid-Open Publication No. 2003-295893
Patent Document 2: Japanese Patent No. 3581648
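
The FIG. 9 sequence described above (the user dictionary is uploaded first, then the speech data is sent, and the server recognizes using both dictionaries) can be sketched as follows. This is an illustration under stated assumptions, not the implementation of Patent Document 2: the class `RecognitionServer` and the function `client_session` are hypothetical, recognition is reduced to an exact dictionary lookup, and consulting the user dictionary before the recognition dictionary is an assumed priority that the description does not specify.

```python
from typing import Dict, Optional


class RecognitionServer:
    """Toy stand-in for the server: holds its own dictionary plus an uploaded one."""

    def __init__(self, recognition_dictionary: Dict[bytes, str]) -> None:
        self.recognition_dictionary = dict(recognition_dictionary)
        self.user_dictionary: Dict[bytes, str] = {}

    def receive_user_dictionary(self, user_dictionary: Dict[bytes, str]) -> None:
        # Step 1: the client uploads its user dictionary before recognition starts.
        self.user_dictionary = dict(user_dictionary)

    def recognize(self, speech: bytes) -> Optional[str]:
        # Step 3: recognition uses both dictionaries.
        # Assumption: user words take priority over server words.
        if speech in self.user_dictionary:
            return self.user_dictionary[speech]
        return self.recognition_dictionary.get(speech)


def client_session(
    server: RecognitionServer,
    user_dictionary: Dict[bytes, str],
    speech: bytes,
) -> Optional[str]:
    """Upload the user dictionary, then send the speech data for recognition."""
    server.receive_user_dictionary(user_dictionary)  # Step 1: dictionary upload
    return server.recognize(speech)                  # Steps 2-3: send speech, recognize


# Illustrative data (assumed, keyed by placeholder speech data).
server = RecognitionServer({b"utt-2": "weather forecast"})
print(client_session(server, {b"utt-9": "client-specific word"}, b"utt-9"))
```

Note that in this arrangement every session transfers the user dictionary over the network line before any speech is recognized, which is the part of the sequence most affected by the transfer speed and communication cost discussed earlier.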