The present invention relates generally to digital voice conferencing systems which provide digitally encoded voice communication to remotely located digital voice terminals, and more specifically to a system for facilitating multispeaker conferencing on vocoders which use narrowband channels (such as 4.8 kb/second). The present invention also provides a system that allows digital-to-digital conversion between vocoders which output digital data streams at different bit rates.
A vocoder (voice operated coder) is a device used to enable people to participate in private communication conferences over ordinary telephone lines by encoding their speech for transmission, and decoding speech for reception. The vocoder unit consists of an electronic speech analyzer which converts the speech waveform to several simultaneous electronic digital signals, and an electronic speech synthesizer which produces artificial sounds in accordance with the encoded electronic digital signals received.
The problem of conferencing over systems which employ parametric vocoders has long been of interest. In analog or wideband digital conferencing, overlapping speakers are handled by signal summation at a conferencing bridge. Such a scheme is not feasible for parametric vocoders for two reasons: 1) signal summation would require tandeming, synthesis and reanalysis of the speech waveform, a process which causes severe degradations in narrowband parametric vocoders; 2) narrowband vocoders cannot satisfactorily represent multiple speakers. One of the difficulties in combining two or more voice tracks is that the combined signal contains two fundamental frequencies, one for each voice signal. These are difficult to encode and separate.
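For illustration only, the signal summation performed at a wideband conferencing bridge can be sketched as follows. This is a minimal sketch, assuming 16-bit PCM frames of equal length and assuming that each listener receives the sum of all other participants (a "mix-minus" arrangement); the function name `bridge_mix` and the frame layout are hypothetical and are not taken from any cited reference.

```python
from typing import Dict, List

PCM_MIN, PCM_MAX = -32768, 32767  # assumed 16-bit PCM sample range


def bridge_mix(frames: Dict[str, List[int]]) -> Dict[str, List[int]]:
    """Sum the waveform frames of all participants at the bridge.

    frames maps a participant name to one frame of PCM samples.
    Each listener gets the sum of everyone else's samples, clipped
    to the assumed 16-bit range.
    """
    if not frames:
        return {}
    length = len(next(iter(frames.values())))
    # Sample-by-sample sum over every participant.
    total = [sum(f[i] for f in frames.values()) for i in range(length)]
    mixed = {}
    for listener, own in frames.items():
        # Remove the listener's own contribution, then clip.
        mixed[listener] = [
            max(PCM_MIN, min(PCM_MAX, total[i] - own[i]))
            for i in range(length)
        ]
    return mixed
```

The sketch operates directly on waveform samples, which is why it does not carry over to parametric vocoders: the bridge would first have to synthesize each waveform and then reanalyze the sum.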
One narrowband technique currently in use is based on the idea of signal selection: a speaker has the channel until he finishes or someone with a higher priority bumps him, and speakers vie for the open channel when it becomes available. The advantage of such a technique is that it avoids the degradations described above; however, such a technique is cumbersome since most conference control is handled by interruptions and overlapping speakers, and this scheme presents only one speaker to the listener. Some coders have some capability of representing multiple speakers; however, the speech quality is significantly degraded due to the tandem between coders. In other schemes two-speaker overlaps can be accomplished by permanently halving the available bandwidth allotted to each coder and deferring signal summation to the terminal. This scheme limits the overall quality of the conference by forcing the coder to work at half the available bandwidth. Since, for the majority of a conference, there will be only a single speaker, this technique causes a degradation in perceived quality.
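The signal selection rule described above can be sketched as a simple channel arbiter. This is a minimal sketch under stated assumptions: the class name `ChannelArbiter`, the integer priority values, and the rule that an equal-priority request does not bump the current holder are all hypothetical choices made for illustration, not details of any particular prior art system.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ChannelArbiter:
    """Grants the single conference channel to one speaker at a time."""

    holder: Optional[str] = None  # speaker currently holding the channel
    holder_priority: int = 0      # priority of that speaker

    def request(self, speaker: str, priority: int) -> bool:
        """Return True if the speaker holds the channel after the request."""
        if self.holder is None or self.holder == speaker:
            # Open channel (or the holder continuing): grant it.
            self.holder, self.holder_priority = speaker, priority
            return True
        if priority > self.holder_priority:
            # Assumed rule: only a strictly higher priority bumps the holder.
            self.holder, self.holder_priority = speaker, priority
            return True
        return False

    def release(self, speaker: str) -> None:
        """Free the channel when the current holder finishes speaking."""
        if self.holder == speaker:
            self.holder = None
            self.holder_priority = 0
```

Under this rule only the holder's speech is ever forwarded, which illustrates why the listener hears a single speaker at any time.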
Examples of vocoder system technology are discussed in the following U.S. Patents, the disclosures of which are incorporated herein by reference:
U.S. Pat. No. 4,856,068 issued to Quatieri, Jr. et al;
U.S. Pat. No. 4,885,790 issued to McAulay et al;
U.S. Pat. No. 4,270,025 issued to Alsup et al;
U.S. Pat. No. 4,271,502 issued to Goutmann et al;
U.S. Pat. No. 4,435,832 issued to Asada et al;
U.S. Pat. No. 4,441,201 issued to Henderson et al;
U.S. Pat. No. 4,618,982 issued to Horvath et al; and
U.S. Pat. No. 4,937,873 issued to McAulay et al.
All of the above-cited patents disclose digital vocoders and voice compression systems that can be improved by the present invention. Of particular interest are the Quatieri, Jr. et al and McAulay et al references which disclose vocoder systems with equipment used by the present invention.
The inherent problems encountered by all of the prior art vocoder systems are a result of the difficulty in realistically representing human speech in limited narrowband channels. As pointed out in the Goutmann et al reference, current digital voice terminals achieve bit rates ranging from 2.4 to 32 kilobits per second. One of the most common systems uses 4.8 kb/second. When a user of a system that uses only a 4.8 kb/second data stream is attempting to communicate with a user of a 2.4 kb/second data stream, a means of converting between the two bit rates becomes necessary. The present invention provides examples with specific kb/second ranges; however, it should be understood that this invention is not limited to these specific data bit rates. Although the example includes the use of modems and telephone lines, the invention is applicable over any media that transmit digitally encoded voice signals. These media systems include, but are not limited to: radio transmissions, satellite communication systems and laser communication systems.
In view of the foregoing discussion, it is apparent that there remains an ongoing need to enhance the ability of digital vocoders to handle multispeaker conferencing on narrowband channels, and to interface vocoders which have different bit rate data streams while preserving voice quality. The present invention is intended to help satisfy that need.