Conventionally, information processing systems have enabled users to exchange information by voice over a communication line. FIG. 11 is a flowchart showing the voice recognition and information processing operation of a conventional voice-input information processing system. In FIG. 11, a voice waveform is inputted on the user terminal side in step S1. The inputted voice waveform data is transmitted to the center system side through a communication line in step S2. Waveform analysis is then conducted on the center system side in step S3. Phoneme recognition is performed in step S4, word recognition in step S5, and sentence recognition in step S6. An application program is then executed in step S7 in accordance with the voice-inputted sentence obtained as a result of this language processing.
In the conventional voice-input information processing system described above, voice waveform data is transmitted to the center system side through a communication line. This transmission distorts the user's voice and thereby makes voice recognition on the center system side difficult. Further, when unspecified-speaker voice recognition is used to support a large number of users, a group of speakers for whom recognition capability is low arises with a certain probability.
In order to solve the above-stated problems, there is a voice-input information processing system (e.g., Japanese Patent Laid-Open Publication HEI No. 8-6589) provided with a specified-speaker voice recognition function or a speaker-adapted voice recognition function on the user terminal side, in which lexical grammar data necessary for recognition is transmitted from the center system side to the user terminal side through a communication line so that voice recognition can be performed on the user terminal side. FIG. 12 is a flowchart showing the voice recognition and information processing operation of such a voice-input information processing system.
In step S11, lexical grammar data communication is carried out between the user terminal side and the center system side, by which the lexical grammar data necessary for recognition is transmitted from the center system side to the user terminal side. A voice waveform is inputted on the user terminal side in step S12. Waveform analysis is conducted in step S13. Phoneme recognition for speaker adaptation is performed in step S14, word recognition in step S15, and sentence recognition together with transmission of the recognition result to the center system side in step S16. In step S17, an application program is executed on the center system side in accordance with the voice-inputted sentence obtained on the user terminal side.
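The two conventional flows can be compared by tabulating where each step runs. The following is only an illustrative sketch: the `Step` class, the `side` labels, and the `steps_on` helper are hypothetical names introduced here, and the placement of steps S13 to S16 on the user terminal side is inferred from the description of FIG. 12 rather than stated step by step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    label: str   # step label as used in the flowcharts
    side: str    # "terminal", "center", or "line" (hypothetical labels)
    action: str


# FIG. 11: the waveform crosses the line; recognition runs on the center side.
FIG_11 = [
    Step("S1", "terminal", "input voice waveform"),
    Step("S2", "line", "transmit voice waveform data to center system"),
    Step("S3", "center", "waveform analysis"),
    Step("S4", "center", "phoneme recognition"),
    Step("S5", "center", "word recognition"),
    Step("S6", "center", "sentence recognition"),
    Step("S7", "center", "execute application program"),
]

# FIG. 12: lexical grammar data crosses the line; recognition is assumed
# to run on the terminal side for S13 to S16 (inferred, see lead-in).
FIG_12 = [
    Step("S11", "line", "transmit lexical grammar data to user terminal"),
    Step("S12", "terminal", "input voice waveform"),
    Step("S13", "terminal", "waveform analysis"),
    Step("S14", "terminal", "phoneme recognition for speaker adaptation"),
    Step("S15", "terminal", "word recognition"),
    Step("S16", "terminal", "sentence recognition; send result to center"),
    Step("S17", "center", "execute application program"),
]


def steps_on(flow, side):
    """Return the labels of the steps in `flow` that run on `side`."""
    return [step.label for step in flow if step.side == side]


if __name__ == "__main__":
    for name, flow in (("FIG. 11", FIG_11), ("FIG. 12", FIG_12)):
        print(name, "terminal:", steps_on(flow, "terminal"))
        print(name, "center:  ", steps_on(flow, "center"))
```

Listing the steps this way makes the trade-off visible: FIG. 11 sends the voice waveform over the line, whereas FIG. 12 sends lexical grammar data in one direction and a recognition result in the other.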
However, the above-stated conventional voice-input information processing system provided with a voice recognition function on the user terminal side suffers from the following problem. The system is capable of achieving high voice recognition capability. However, every time the application software is changed, the lexical grammar data corresponding to the new application must be transmitted from the center system side to the user terminal side through a communication line. Where the transmission speed of the communication line is low relative to the data quantity of the lexical grammar data, this causes an annoying waiting time for information transfer each time the application is changed.
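The waiting time in question is roughly the data quantity divided by the transmission speed. The sketch below illustrates that arithmetic only; the function name and the example figures (120,000 bytes of lexical grammar data, a 9,600 bit/s line) are hypothetical and are not taken from the system described, and protocol overhead and latency are ignored.

```python
def transfer_wait_seconds(data_bytes: int, line_bits_per_second: float) -> float:
    """Estimate the time to send `data_bytes` over a line of the given speed.

    Ignores protocol overhead, retransmission, and latency (assumption).
    """
    if data_bytes < 0:
        raise ValueError("data_bytes must be non-negative")
    if line_bits_per_second <= 0:
        raise ValueError("line_bits_per_second must be positive")
    return data_bytes * 8 / line_bits_per_second


if __name__ == "__main__":
    # Hypothetical figures chosen only to show the scale of the delay.
    wait = transfer_wait_seconds(120_000, 9_600)
    print(f"estimated wait per application change: {wait:.0f} s")
```

Under these assumed figures the user would wait on the order of a hundred seconds at every application change, which is the kind of delay the paragraph above describes.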
Further, when the vocabulary exceeds several thousand words, the processing speed of the processor must be increased to achieve real-time processing, which raises a problem in terms of power consumption if the user terminal is a mobile device such as a cell phone or a PDA (personal digital assistant).