Wireless communication devices are used for a variety of different functions and to provide a communication platform for a user. One particular wireless communication device is a headset. Generally, headsets incorporate speakers that convey audio signals to the wearer, for the wearer to hear, and also incorporate microphones to capture speech from the wearer. Such audio and speech signals are generally converted to electrical signals and processed to be wirelessly transmitted or received.
Wireless headsets have become somewhat commonplace. Wireless headsets are generally wirelessly coupled with other devices such as cell phones, computers, stereos, and other devices that process audio signals. In use, a wireless headset may be coupled with other equipment utilizing various RF communication protocols, such as the IEEE 802.11 standard for wireless communication. Other wireless communication protocols have been more recently developed, such as the Bluetooth protocol.
Bluetooth is a low-cost, low-power, short-range radio technology designed specifically as a cable replacement to connect devices, such as headsets, mobile phone handsets, and computers or other terminal equipment together. One particular use of the Bluetooth protocol is to provide a communication protocol between a mobile phone handset and an earpiece or headpiece. The Bluetooth protocol is a well-known protocol understood by a person of ordinary skill in the art, and thus all of the particulars are not set forth herein.
While wireless headsets are utilized for wireless telephone communications, their use is also desirable for other voice or audio applications. For example, wireless headsets may play a particular role in speech recognition technology. U.S. patent application Ser. No. 10/671,140, entitled “Wireless Headset for Use in a Speech Recognition Environment,” and filed on Sep. 25, 2003, sets forth one possible use for a wireless headset, and that application is incorporated herein by reference in its entirety. Speech recognition applications demand a high-quality speech or audio signal, and thus a significantly robust communication protocol. While Bluetooth provides an effective means for transmission of voice for typical telephony applications, the current Bluetooth standard has limitations that make it significantly less effective for speech recognition applications and systems.
For example, the most frequently used standard representing voice or speech data in the telephony industry utilizes 8-bit data digitized at an 8,000 Hz sample rate. This communication standard has generally evolved from the early days of analog telephony when it was generally accepted that a frequency range of 250 Hz to 4,000 Hz was adequate for voice communication over a telephone. More recent digital voice protocol standards, including the Bluetooth protocol, have built upon this legacy. In order to achieve an upper bandwidth limit of 4,000 Hz, a sample rate of at least twice that, or 8,000 Hz, is required. To minimize link bandwidth, voice samples are encoded as 8 bits per sample and employ a non-linear transfer function to provide increased dynamic range on the order of 64-72 dB. The Bluetooth standard generally supports the most common telephony encoding schemes. At the physical layer, the Bluetooth protocol uses a “synchronous connection oriented” (SCO) link to transfer voice data. An SCO link sends data at fixed, periodic intervals. The data rate of an SCO link is fixed at 64,000 bits per second (64 Kbps). Voice packets transmitted over an SCO link do not employ flow control and are not retransmitted. Consequently, packets that are dropped during normal operation are not recovered, resulting in the loss of portions of the audio signal.
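The non-linear transfer function described above can be illustrated with mu-law companding, one of the common telephony encoding schemes. The following is a minimal sketch, assuming the continuous mu-law curve with mu = 255; the function names are hypothetical, and the sketch does not reproduce the exact 8-bit segment tables of any particular codec.

```python
import math

MU = 255.0  # assumed companding constant (the value commonly used with 8-bit mu-law)


def mu_law_compress(sample):
    """Map a linear sample in [-1.0, 1.0] onto the non-linear mu-law curve."""
    magnitude = math.log1p(MU * abs(sample)) / math.log1p(MU)
    return math.copysign(magnitude, sample)


def mu_law_expand(companded):
    """Invert mu_law_compress, recovering the linear sample."""
    magnitude = math.expm1(abs(companded) * math.log1p(MU)) / MU
    return math.copysign(magnitude, companded)


# A quiet sample (1% of full scale) is stretched across a much larger share of
# the coded range, so low-level speech keeps more resolution once the companded
# value is rounded to 8 bits.
quiet_sample = 0.01
companded_quiet = mu_law_compress(quiet_sample)
round_trip = mu_law_expand(companded_quiet)
```

Because small amplitudes occupy proportionally more of the coded range than large ones, an 8-bit code can cover a wider dynamic range than uniform 8-bit quantization would.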
For most human-to-human communication applications, such as telephony applications, the current Bluetooth voice sampling and encoding techniques using SCO links and voice packets are adequate. Generally, humans have the ability to subconsciously use reasoning, context, and other clues to mentally reconstruct the original speech over a more lossy communication medium. Furthermore, where necessary, additional mechanisms, such as the phonetic alphabet, can be employed to ensure the reliability of the information transferred (e.g., “Z” as in Zulu).
However, for human-to-machine communication, such as speech recognition systems, significantly better speech sampling and encoding performance is necessary. First, a more reliable data link is necessary, because dropped voice packets in the typical telephony Bluetooth protocol can significantly reduce the performance of a speech recognition system. For example, each dropped Bluetooth SCO packet can result in a loss of 3.75 milliseconds of speech. This can drastically increase the probability of a speech recognition error.
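The 3.75 millisecond figure can be checked with simple arithmetic. The sketch below assumes a voice packet carrying 30 bytes of 8-bit samples, a payload size chosen here because it is consistent with the stated figure at the 64 Kbps SCO rate; the names are illustrative only.

```python
SCO_BIT_RATE_BPS = 64_000   # fixed SCO link rate stated above
PAYLOAD_BYTES = 30          # assumed voice payload per packet


def speech_lost_ms(payload_bytes, bit_rate_bps):
    """Duration of speech, in milliseconds, carried by one packet."""
    payload_bits = payload_bytes * 8
    return payload_bits * 1000 / bit_rate_bps


lost_per_packet_ms = speech_lost_ms(PAYLOAD_BYTES, SCO_BIT_RATE_BPS)

# Several consecutive drops add up quickly: four lost packets remove
# four times the single-packet duration from the audio stream.
lost_for_four_packets_ms = 4 * lost_per_packet_ms
```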
Additionally, the information-bearing frequency range of speech is now understood to be in the range of 250 Hz to 6,000 Hz, with additional less critical content available up to 10,000 Hz. The intelligibility of consonants has been shown to diminish when the higher frequencies are filtered out of the speech signal. Therefore, it is important to preserve this high end of the spectrum.
However, increasing the sample rate of the audio signal to 12,000 Hz, while still maintaining 8-bit encoding, exceeds the capability of the Bluetooth SCO link, because such an encoding scheme would require a data rate of 96 Kbps, which is above the 64 Kbps Bluetooth SCO rate.
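The data-rate comparison above follows directly from the sample rate and sample size. A minimal sketch of the arithmetic, with hypothetical helper names:

```python
SCO_BIT_RATE_BPS = 64_000  # fixed SCO link rate


def min_sample_rate_hz(max_frequency_hz):
    """Minimum sample rate needed to represent content up to max_frequency_hz."""
    return 2 * max_frequency_hz


def required_bit_rate_bps(sample_rate_hz, bits_per_sample):
    """Raw bit rate of an uncompressed stream of fixed-size samples."""
    return sample_rate_hz * bits_per_sample


# Telephony case: 4,000 Hz upper limit, 8-bit samples.
telephony_bps = required_bit_rate_bps(min_sample_rate_hz(4_000), 8)

# Speech recognition case: 6,000 Hz upper limit, still 8-bit samples.
wideband_bps = required_bit_rate_bps(min_sample_rate_hz(6_000), 8)

fits_telephony = telephony_bps <= SCO_BIT_RATE_BPS
fits_wideband = wideband_bps <= SCO_BIT_RATE_BPS
```

The telephony case exactly fills the 64 Kbps link, while the 12,000 Hz case needs 96 Kbps and therefore does not fit.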
Speech samples digitized as 8-bit data also contain a high degree of quantization error, which has the effect of reducing the signal-to-noise ratio (SNR) of the data fed to the recognition system. Speech signals also exhibit a variable dynamic range across different phonemes and different frequencies. In the frequency ranges where dynamic range is decreased, the effect of quantization error is proportionally increased. A speech system with 8-bit resolution can have up to 20 dB additional quantization error in certain frequency ranges for the “unvoiced” components of the speech signal. Most speech systems reduce the effect of quantization error by increasing the sample size to a minimum of 12 bits per sample. Thus, the current Bluetooth voice protocol for telephony is not adequate for speech applications such as speech recognition applications.
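The benefit of a larger sample size can be estimated with the textbook model of an ideal uniform quantizer driven by a full-scale sine wave, for which the signal-to-quantization-noise ratio is approximately 6.02N + 1.76 dB for N bits. The sketch below applies that idealized model as an assumption; it is not a measurement of any particular speech system, and real speech with companded encoding will differ.

```python
import math


def ideal_sqnr_db(bits_per_sample):
    """Ideal uniform-quantizer SQNR for a full-scale sine wave, in dB."""
    levels = 2 ** bits_per_sample
    return 20 * math.log10(levels) + 10 * math.log10(1.5)


sqnr_8_bit_db = ideal_sqnr_db(8)
sqnr_12_bit_db = ideal_sqnr_db(12)

# Each extra bit contributes roughly 6 dB, so moving from 8 to 12 bits
# lowers the quantization noise floor by roughly 24 dB under this model.
improvement_db = sqnr_12_bit_db - sqnr_8_bit_db
```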
Therefore, there is a need for an improved wireless device for use in speech and voice applications. There is particularly a need for a wireless headset device that is suitable for use in speech recognition applications and systems. Still further, it would be desirable to incorporate a Bluetooth protocol in a wireless headset suitable for use with speech recognition systems.