Speech recognition technology has been available for over twenty years. In recent years it has improved dramatically, particularly in speech dictation or “speech to text” systems such as those marketed by International Business Machines Corporation (IBM) and Dragon Systems. An example of such a speech recognition and “speech to text” software product is the IBM ViaVoice program, which runs on a standard personal computer, for example under the Windows operating system or other operating systems.
On the other hand, the more universal application of speech recognition is as an input to computers and other electronic control systems, for example to input commands or to control a process or machine. For example, a user may navigate through a computer's graphical user interface by speaking the commands which are customarily found in the system's menu text, icons, labels, buttons, etc.
The latter aspect is of particular importance for controlling portable devices, such as mobile phones, personal digital assistants or palm top computers.
A further important field of application is the automotive field. For example, a car radio can be equipped with speech recognition such that the driver can select a radio station by means of speech control. As a further example in the field of automotive control, commands for switching the lights on, activating the window opener or similar functions can be input into the automotive control system by means of natural voice.
With the advent of the internet and multimedia applications and the integration of entertainment and communication into car electronics, this field of speech recognition is becoming more important.
U.S. Pat. No. 5,602,963 shows a handheld electronic personal organizer which performs voice recognition on words spoken by a user to input data into the organizer and which records voice messages from the user. The spoken words and the voice messages are input via a microphone. The voice messages are compressed before being converted into digital signals for storage. The stored digital voice messages are reconverted into analog signals and then expanded for reproduction using a speaker. The organizer is capable of a number of different functions, including voice training, memo record, reminder, manual reminder, timer setting, message review, waiting message, calendar, phone group select, number retrieval, add phone number, security, and “no” logic. During such various functions, data is principally entered by voice and occasionally through use of a limited keypad, and voice recordings are made and played back as appropriate.
U.S. Pat. No. 5,706,399 discloses a speech controlled vehicle alarm system. The system allows control of alarm functions to be accomplished using specific spoken commands. A microphone converts speech into time-variant voltage levels which are amplified, sent to an analog-to-digital converter and digitised. The digitised data is then processed by a speech recognition subsystem.
The speech recognition subsystem separates extraneous speech from words and provides corresponding output signals when control words are recognized. The output signals are employed by the alarm system to operate door locking and unlocking controls, to operate a loud audible siren and/or horn, to operate vehicle light controls, to provide engine cut-off control, to provide engine starting control or to operate a response indicator incorporated in the main alarm processing unit. The response indicator provides verbal responses to confirm spoken commands.
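By way of illustration only, the signal path described above (digitisation of the microphone voltage, followed by the mapping of recognized control words to output signals) may be sketched as follows. The function names, the 8-bit resolution, the reference voltage and the table of control words are hypothetical and are not taken from the cited patent; the recognition step itself is assumed to have already produced a list of words.

```python
# Illustrative sketch only; all names and values below are assumptions.

def quantize(voltage, v_ref=1.0, bits=8):
    """Map a voltage in [-v_ref, v_ref] to a signed integer code.

    Voltages outside the range are clipped, as an ADC input stage would do.
    """
    levels = 2 ** (bits - 1) - 1          # e.g. 127 for 8 bits
    clipped = max(-v_ref, min(v_ref, voltage))
    return round(clipped / v_ref * levels)


# Hypothetical table of control words and the output signal each one raises.
CONTROL_ACTIONS = {
    "lock": "door_lock",
    "unlock": "door_unlock",
    "siren": "siren_on",
    "lights": "lights_on",
}


def control_outputs(recognized_words):
    """Keep only control words; extraneous speech produces no output signal."""
    return [CONTROL_ACTIONS[w] for w in recognized_words if w in CONTROL_ACTIONS]


if __name__ == "__main__":
    print([quantize(v) for v in (0.0, 1.0, -2.0)])
    print(control_outputs(["please", "lock", "the", "car"]))
```

In this sketch the words "please", "the" and "car" are treated as extraneous speech and produce no output signal, whereas "lock" yields a single door locking signal.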
U.S. Pat. No. 5,745,874 shows a pre-processor for automatic speech recognition. The pre-processor is based upon auditory modelling and includes a tapped delay line and a neural network in the form of a multilayer perceptron. The tapped delay line receives an analog speech signal and provides multiple time delayed samples thereof in parallel as inputs for the neural network. The single analog output of the neural network is suitable for interfacing with a signal processor for further processing of the speech information using spectral signal analysis so as to provide a speech representation with desirable characteristics of an auditory based spectral analysis model while simultaneously maintaining a standard analog signal interface.
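By way of illustration only, the arrangement of a tapped delay line feeding a multilayer perceptron with a single output may be sketched in discrete time as follows. The cited pre-processor operates on an analog signal; the sampled formulation, the number of taps, the tanh activation and the weight values used here are assumptions made for the sketch and are not taken from the cited patent.

```python
# Illustrative sketch only; a discrete-time stand-in for an analog arrangement.
import math
from collections import deque


def tapped_delay_line(samples, num_taps):
    """For each input sample, yield the newest num_taps samples (newest first)."""
    taps = deque([0.0] * num_taps, maxlen=num_taps)
    for s in samples:
        taps.appendleft(s)   # the oldest sample drops off the far end
        yield list(taps)


def mlp_forward(taps, w_hidden, b_hidden, w_out, b_out):
    """One hidden layer with tanh units, followed by a single linear output."""
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, taps)) + b)
        for row, b in zip(w_hidden, b_hidden)
    ]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


def preprocess(samples, w_hidden, b_hidden, w_out, b_out):
    """Produce one output value per input sample from the delayed taps."""
    num_taps = len(w_hidden[0])
    return [
        mlp_forward(t, w_hidden, b_hidden, w_out, b_out)
        for t in tapped_delay_line(samples, num_taps)
    ]


if __name__ == "__main__":
    # Hypothetical, untrained weights: 3 taps, 2 hidden units.
    w_hidden = [[0.5, -0.2, 0.1], [0.3, 0.3, -0.4]]
    print(preprocess([0.0, 1.0, 0.5, -0.5], w_hidden, [0.0, 0.0], [1.0, -1.0], 0.0))
```

Because the sketch produces one output value for each input sample, its output can be passed to a subsequent processing stage in the same manner as the unprocessed signal, which mirrors the standard signal interface mentioned above.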
U.S. Pat. No. 5,960,394 discloses a method of speech command recognition for converting spoken utterances into either text or commands. The system runs on a platform capable of running a plurality of applications. Text and commands are sent from a word recognition application to one or more user applications. In addition, information pertaining to the state of the user applications is sent back to the word recognition application. Word recognition probabilities are modified based on the information received from the user applications.
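By way of illustration only, the modification of word recognition probabilities in dependence on application state may be sketched as follows. The multiplicative boost followed by renormalization is merely one possible scheme chosen for the sketch; the function name, the boost factor and the example words are hypothetical and are not taken from the cited patent.

```python
# Illustrative sketch only; the weighting scheme is an assumption.

def rescore(hypotheses, active_commands, boost=2.0):
    """Raise the probability of words that are valid in the current application state.

    hypotheses      -- dict mapping candidate word to recognition probability
    active_commands -- set of words reported as currently valid by the user application
    """
    weighted = {
        word: p * (boost if word in active_commands else 1.0)
        for word, p in hypotheses.items()
    }
    total = sum(weighted.values())
    if total == 0:
        return dict(hypotheses)          # nothing to renormalize
    return {word: p / total for word, p in weighted.items()}


if __name__ == "__main__":
    # The recognizer slightly prefers "shave", but the application reports
    # that only "save" is a valid command in its current state.
    scores = rescore({"save": 0.4, "shave": 0.6}, {"save"})
    print(max(scores, key=scores.get))
```

In this example the acoustically less likely word "save" becomes the preferred hypothesis once the application state is taken into account.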
U.S. Pat. No. 6,192,343 shows a speech command input recognition system which interprets speech queries such as help queries and presents a list of relevant proposed commands sorted in order based upon relevance of the commands. The system also provides for adding terms to previous speech terms.
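By way of illustration only, the presentation of proposed commands sorted by relevance to a spoken query may be sketched as follows. The cited patent is not relied upon for the relevance measure; the simple count of shared terms, the function name and the example command descriptions are assumptions made for the sketch.

```python
# Illustrative sketch only; the relevance measure is an assumption.

def rank_commands(query, commands):
    """Return command names sorted by the number of terms shared with the query.

    commands -- dict mapping command name to a short textual description
    Commands sharing no term with the query are omitted; ties are broken by name.
    """
    query_terms = set(query.lower().split())
    scored = []
    for name, description in commands.items():
        overlap = len(query_terms & set(description.lower().split()))
        if overlap:
            scored.append((overlap, name))
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _, name in scored]


if __name__ == "__main__":
    commands = {
        "Print": "print the current document",
        "Save": "save the current document to disk",
        "Open": "open a file",
    }
    print(rank_commands("how do I print the document", commands))
```

Adding terms to a previous query can be modelled in this sketch by appending the new terms to the earlier query string and ranking again.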
A common shortcoming of prior art speech processing systems is the processing power and memory access bandwidth they require. The processing power provided by standard microprocessors, such as the Intel Pentium processor family, is sufficient for speech processing applications.
However, the parallel execution of other application programs on the personal computer can be slowed down considerably when a speech recognition system is used, as the speech recognition system requires a substantial amount of the available processing capacity. At the same time, this may imply that the speech recognition system does not perform to the satisfaction of the user.
On the other hand, there is a variety of applications where the use of such high performance standard microprocessors is not desirable, for a number of reasons. Firstly, the price of adding a processor for the speech recognition can be unacceptably high. Secondly, in the case of portable electronic devices, the power consumption of an additional high performance processor can drastically reduce battery lifetime.
It is therefore an object of the present invention to provide an improved method and processor system for the processing of audio signals.