Voice control technology arose so that people can use the various services provided by terminal equipment quickly and conveniently, without pressing keys, in particular scenarios. The user only needs to speak an instruction near a microphone of the terminal equipment, and the terminal equipment performs the corresponding processing according to the instruction. Taking voice dialing as an example, this technology allows people who cannot press keys because their hands are occupied (for example, when driving a vehicle), or who do not have full use of their upper limbs, to dial the phone: it recognizes the information required for dialing from the user's voice and dials according to the recognized information. The user only needs to input a voice instruction, for example “dial Zhang San's mobile phone”, to the microphone of the terminal equipment (which may be a fixed terminal or a mobile terminal), and the terminal equipment establishes a call between the user and the called person, thereby significantly simplifying the user's operation. In addition to voice dialing, voice control technology is also widely used in various products such as robots and garages with voice-controlled switches.
The basic principle of the voice control technology will be introduced below by taking voice dialing as an example.
The terminal equipment first generates a syntax packet according to the contact information contained in the Contacts, for example name, address, and contact method, and this syntax packet contains voice data of that contact information. The terminal equipment then receives, via an audio signal receiving interface such as a microphone, a voice signal input by the user, and performs recognition according to the received voice signal and the generated syntax packet: for each word in the received voice signal, it judges whether the voice data of that word is stored in the syntax packet, and if so, the word is considered to be recognized from the received voice signal. When the proportion of recognized words among all words contained in the received voice signal exceeds a predetermined threshold, it is determined that the received voice signal is successfully recognized, and the corresponding subsequent processing is executed. For example, suppose the terminal equipment specifies that recognition succeeds when 60% of the words are successfully recognized, and the voice input by the user is “dial Zhang San's mobile phone”, counted as seven words. Recognition is then considered successful if the terminal equipment can recognize the syllables of more than four of these words, that is, at least five (7 × 60% = 4.2), and the subsequent dialing flow is executed; otherwise recognition is considered to have failed, and processing ends.
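The threshold judgment described above can be illustrated with a minimal sketch. It rests on several assumptions that are not in the original text: words are modeled as plain text tokens rather than voice data, the syntax packet is modeled as a set of such tokens, and the names `recognized_ratio`, `is_recognized`, and the placeholder tokens `w1` to `w7` are hypothetical.

```python
from typing import List, Set


def recognized_ratio(words: List[str], syntax_packet: Set[str]) -> float:
    """Proportion of words in the utterance that are found in the syntax packet."""
    if not words:
        return 0.0
    hits = sum(1 for word in words if word in syntax_packet)
    return hits / len(words)


def is_recognized(words: List[str], syntax_packet: Set[str],
                  threshold: float = 0.6) -> bool:
    """Recognition succeeds only when the proportion exceeds the threshold."""
    return recognized_ratio(words, syntax_packet) > threshold


# Hypothetical seven-word instruction, mirroring the 7 x 60% = 4.2 example.
utterance = ["w1", "w2", "w3", "w4", "w5", "w6", "w7"]

# Syntax packets that cover five and four of the seven words respectively.
packet_five = {"w1", "w2", "w3", "w4", "w5"}
packet_four = {"w1", "w2", "w3", "w4"}

five_of_seven = is_recognized(utterance, packet_five)  # 5/7 exceeds 0.6
four_of_seven = is_recognized(utterance, packet_four)  # 4/7 does not exceed 0.6
```

With five of the seven words matched, the proportion (about 0.71) exceeds the 60% threshold and the dialing flow would proceed; with four matched (about 0.57), recognition fails and processing ends.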
So that the corresponding dialing processing can be performed efficiently according to the recognized information once the voice is successfully recognized, the threshold used to judge whether voice recognition is successful, namely the required proportion of recognized words among all words contained in the received voice signal, is generally set relatively high in advance. In practice, however, many factors can prevent the proportion of recognized words from reaching the predetermined threshold, so that voice recognition fails and processing ends. For example, the user may unconsciously input a long passage in which only a few words are associated with the dialing action; this generally causes recognition to fail and processing to end, because the proportion of recognizable words cannot reach the predetermined threshold. As another example, the terminal equipment may be able to recognize only a few words because of the user's accent, which likewise causes processing to end because the proportion of recognizable words cannot reach the predetermined threshold. Therefore, the success rate of the existing voice control technology is very low.
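The two failure cases can be made concrete with a small arithmetic sketch. The word counts below are invented purely for illustration, the 60% threshold is borrowed from the earlier example, and the function name `recognition_succeeds` is hypothetical.

```python
def recognition_succeeds(recognized: int, total: int, threshold: float) -> bool:
    """Apply the proportion rule: recognized / total must exceed the threshold."""
    return total > 0 and recognized / total > threshold


THRESHOLD = 0.6  # assumed value, taken from the 60% example

# Short instruction: five of seven words recognized (about 0.71).
short_command_ok = recognition_succeeds(5, 7, THRESHOLD)

# Long passage: the same five dialing-related words are recognized,
# but they are buried in a hypothetical 40-word utterance (0.125).
long_passage_ok = recognition_succeeds(5, 40, THRESHOLD)

# Accent: only two of the seven words are recognized (about 0.29).
accented_ok = recognition_succeeds(2, 7, THRESHOLD)
```

Under these assumed counts only the short instruction passes. The long passage fails even though every dialing-related word was recognized, which is the weakness the paragraph above describes.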