The market for enhanced phone services such as voice dialing is projected to grow rapidly. Hands-free, voice-activated dialing has become particularly important for cellular phone callers.
Some mobile phone manufacturers have embedded speech-recognition technology into their phones, allowing users to program phone numbers that are matched with a spoken name. While in a programming mode, a user may create voice dial entries, for example, by speaking and recording a name or series of digits one to three times.
Some telecommunication providers offer network-based, voice-activated dialers and administer large address books of names and numbers. Because the information is on the network, users may update that information via the web and through their contact management software.
With a typical voice-activated dialer, phone users speak a phone number or contact name into the microphone of the phone to place a call. Speech-recognition software applications, which are embedded in the phone or provided on a phone service network, use automatic speech recognition (ASR) technology, also known as voice recognition. ASR systems enable a digital signal processor or central processing unit (CPU) to recognize the human voice, and they employ a speech-recognition vocabulary consisting of a set of utterances that a digital recognizer can identify. A very basic vocabulary might consist, for example, of the words yes and no and the digits zero through nine.
In recent years, dialing services have been introduced in which phone callers dial the pound sign and a three-digit number on a mobile phone, say the name of the person they want to reach, such as “Mom”, and are connected to that person by the speech-recognition software. Via simple voice commands, callers can access and dial phone numbers that they have programmed, by either voice or keypad input, into a locally stored phone number directory.
Speech-enabled applications offer enhanced services designed to increase productivity, efficiency and responsiveness. Some telephone services require callers to dial more digits, and thus, voice dialing is becoming a time-saving feature. New area codes are being added to many metropolitan areas, forcing callers to dial eleven digits for local calls. International calling, which continues to become more common, requires long dialing strings with access codes, country codes and, in many cases, newly lengthened city codes.
In addition to the field of mobile phone services, voice-recognition technologies have been applied to other fields such as database searches, interactive television, and computer interfaces.
Speech-enabled technologies are being explored in the field of database searches. Davallou discloses a phonetic self-improving search engine and a related method for searching databases that employ synthetic phonetic lexicons in “Phonetic Self-Improving Search Engine”, U.S. patent application No. 2002/0156776 published Oct. 24, 2002. After an initial query in a primary database fails, an error database containing records of previously failed searches is queried with the search string to obtain a positive result. If still no record is found, the search string is parsed into one or more pronounceable units. Phonetically equivalent formulas are applied to one or more of the pronounceable units to create one or more new search strings, which are then submitted as queries to the error database and the primary database. A phonetic database holding the phonetic equivalent formulas and their respective pronounceable units helps find possible matches with database records.
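The fallback sequence summarized above can be illustrated with a minimal sketch. It is not the Davallou implementation: the equivalence table, the function names (split_units, phonetic_variants, search) and the use of plain dictionaries for the primary and error databases are all assumptions made for illustration only.

```python
from itertools import product

# Hypothetical equivalence table: pronounceable unit -> interchangeable spellings.
PHONETIC_EQUIVALENTS = {
    "ph": ["ph", "f"],
    "f": ["f", "ph"],
    "ck": ["ck", "k"],
    "k": ["k", "ck", "c"],
    "c": ["c", "k"],
}


def split_units(query):
    """Split a string into known two-letter units where possible, else single letters."""
    units = []
    i = 0
    while i < len(query):
        pair = query[i:i + 2]
        if len(pair) == 2 and pair in PHONETIC_EQUIVALENTS:
            units.append(pair)
            i += 2
        else:
            units.append(query[i])
            i += 1
    return units


def phonetic_variants(query):
    """Yield every spelling obtained by swapping units for their listed equivalents."""
    choices = [PHONETIC_EQUIVALENTS.get(unit, [unit]) for unit in split_units(query)]
    for combination in product(*choices):
        yield "".join(combination)


def search(query, primary, error_db):
    """Look up query in primary, then in error_db, then via phonetic variants."""
    query = query.lower()
    if query in primary:                      # step 1: primary database
        return primary[query]
    if query in error_db:                     # step 2: previously failed searches
        return primary.get(error_db[query])
    for variant in phonetic_variants(query):  # step 3: re-query with variants
        key = error_db.get(variant, variant)
        if key in primary:
            error_db[query] = key             # remember the fix for next time
            return primary[key]
    return None
```

In this sketch a lookup such as search("fil", {"phil": "555-0100"}, {}) misses the primary and error dictionaries, succeeds on the variant "phil", and records the mapping so that a repeated query is resolved from the error dictionary directly.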
Voice-recognition software has been used with interactive televisions. An interactive, speech-enabled television system that enables a user to select channels by spoken request is disclosed in “Automatic Search of Audio Channels by Matching Viewer-Spoken Words Against Closed-Caption/Audio Content for Interactive Television” by Boman, et al., U.S. Pat. No. 6,480,819 issued Nov. 12, 2002. The system includes a semantic analyzer that is able to discriminate between speech intended to describe program content and speech intended to supply meta-commands to the system. By extracting meaning as well as keywords and phrases from the spoken input, the system finds matching content even when the spoken words do not match the closed caption text exactly.
In the field of semantic computer interfaces, a method and a speech-enabled interactive computer system through which a user specifies program content is disclosed by Beauregard, et al. in “Semantic User Interface”, U.S. patent application No. 2002/0156774 published Oct. 24, 2002. Commands of the semantic interface, which may be natural language-based or user-defined, allow a user to launch applications and navigate within applications by using language rather than clicks from a pointing device such as a mouse. The system extracts both keyword and semantic content from the speech of the user, prompting the user to furnish additional information when the meaning is unclear.
A number of current voice-dialing services allow callers to speak a name or phone number clearly into a phone, and the system dials based on a match to a first name, last name, full name, business name or phone number. Unfortunately, voice-recognition software is not always able to decipher voice input, particularly in the case of phone numbers. When a user dials via voice, voice recognition systems typically look for a pattern match for each individual digit in the telephone number. One or more digits can therefore be misrecognized, resulting in a misdialed number or a need to restart the voice dialing even though the rest of the number is correct. Digits may be misrecognized for several reasons. Systems may not be able to distinguish between, for example, the words one and nine. They may have difficulty translating, for example, the words nine thousand into the four digits 9-0-0-0. Users may forget to give an area code. The voice input may be garbled, or background noises may be interpreted as numbers. Users may have different accents or ways of pronouncing specific numbers that a speech-recognition system has difficulty recognizing accurately.
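The "nine thousand" difficulty mentioned above can be made concrete with a short sketch of a normalizer that turns already-recognized words into a digit string. The word tables and the function name words_to_digits are hypothetical. The sketch assumes the recognizer has already produced correct words, and it does not address acoustic confusions such as one versus nine.

```python
# Hypothetical word tables for spoken phone numbers.
DIGIT_WORDS = {
    "zero": "0", "oh": "0", "one": "1", "two": "2", "three": "3", "four": "4",
    "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9",
}
# A multiplier spoken after a single digit expands to trailing zeros.
MULTIPLIER_WORDS = {"hundred": "00", "thousand": "000"}


def words_to_digits(utterance):
    """Convert recognized number words into a string of dialable digits."""
    digits = []
    previous_was_digit = False
    for word in utterance.lower().replace("-", " ").split():
        if word in DIGIT_WORDS:
            digits.append(DIGIT_WORDS[word])
            previous_was_digit = True
        elif word in MULTIPLIER_WORDS and previous_was_digit:
            digits.append(MULTIPLIER_WORDS[word])
            previous_was_digit = False
        else:
            # Unknown word, or a multiplier with no digit in front of it.
            raise ValueError(f"cannot interpret {word!r} as part of a phone number")
    return "".join(digits)


print(words_to_digits("five five five nine thousand"))  # 5559000
```

The sketch treats "thousand" purely as three trailing zeros, so a phrase such as "nine thousand one" would yield 90001 rather than 9001. Ambiguities of this kind are one reason a digit-by-digit recognizer can produce a wrong number from a correctly heard utterance.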
The above-mentioned difficulties with voice dialers for phones demonstrate the need for an improved method of operating a voice-activated cellular phone or other communication device that can better recognize phone numbers and match them with stored information when the voice input is incorrect, incomplete, or mistranslated. Accordingly, a beneficial voice-based dialing method and system would be more accurate, providing more correct matches and, when there is a voice-input error, offering alternative phone numbers that are logical to the phone user. It is an object of this invention, therefore, to provide a method of and system for operating a voice-activated communication device that places phone calls, having improvements that overcome the deficiencies and obstacles described above.