Speech recognition is a technique for recognizing user speech and executing desired tasks as an alternative to manual operation or character input. Previously, client-based speech recognition, which executes the recognition process using the resources of a local device, was mainly used. Recently, owing to increases in network bandwidth and the advent of distributed processing techniques, it has become common to use server-based speech recognition for recognizing user speech. Server-based speech recognition involves sending user speech input from a microphone embedded in a local device (or features extracted from the user speech) to an external server connected via a network, and executing a part of the recognition process using the resources of the external server.
Client-based and server-based speech recognition have contrasting characteristics. Client-based speech recognition, which does not communicate with an external server, responds quickly but has difficulty handling a large recognition vocabulary because of the limited resources of the local device. Server-based speech recognition, on the other hand, is connected to an external server with high computing power and can therefore handle a large recognition vocabulary, but it responds more slowly because of the communication with the external server.
For this reason, it is preferable to switch between client-based and server-based speech recognition depending on the purpose of the speech recognition. Conventionally, the user needs to switch between client-based and server-based speech recognition by pressing a button on a remote controller. This forces the user to be consciously aware of the switching between the two different types of speech recognition.
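The conventional manual switching described above can be illustrated with a minimal sketch. All names here (`Mode`, `ManualSwitchRecognizer`, `recognize_locally`, `recognize_on_server`) are hypothetical and introduced only for illustration; the two recognizer functions are placeholders standing in for an actual local engine and an actual network request, neither of which is specified in this description.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    CLIENT = "client"   # recognition on the local device
    SERVER = "server"   # recognition using an external server


def recognize_locally(audio: bytes) -> str:
    # Placeholder (assumption): a real local engine would decode the audio
    # against a small vocabulary using only the device's own resources.
    return f"client:{len(audio)} bytes"


def recognize_on_server(audio: bytes) -> str:
    # Placeholder (assumption): a real implementation would send the audio,
    # or features extracted from it, over the network and wait for a result.
    return f"server:{len(audio)} bytes"


@dataclass
class ManualSwitchRecognizer:
    """Conventional scheme: the user selects the recognition mode explicitly."""
    mode: Mode = Mode.CLIENT

    def on_button_press(self) -> Mode:
        # Each press of the (hypothetical) remote-controller button toggles
        # between client-based and server-based recognition.
        self.mode = Mode.SERVER if self.mode is Mode.CLIENT else Mode.CLIENT
        return self.mode

    def recognize(self, audio: bytes) -> str:
        # The recognizer used depends solely on the mode the user selected.
        if self.mode is Mode.CLIENT:
            return recognize_locally(audio)
        return recognize_on_server(audio)
```

In this sketch the recognition path changes only when `on_button_press` is called, which reflects the burden noted above: the user must know which mode is active and switch it explicitly before speaking.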