(1). Field of the Invention
The present invention relates to the field of PC-based conferencing systems. More particularly, the present invention relates to scalable audio compression and multi-point bridging.
(2). Background of the Invention
Historically, low bit rate digital audio transmission (below 16 Kbps) has required vocoder modeling algorithms that implement speech compression only. Vocoders model an individual vocal tract, so only a single voice can be modeled at one time, and the vocal tract model is not conducive to other audio signal types such as music or multiple speakers. At higher bit rates, PCM or ADPCM waveform sampling techniques can be used; these preserve the entire waveform, but at a cost of very high bit rates (32 Kbps for speech to 700 Kbps for CD/audio). Several algorithms exist in the literature that are aimed specifically at either low bit rate (low bandwidth) audio or CD/audio compression only. However, no approach has been suggested that scales from low bit rates for narrow band audio up to higher bit rates for high quality CD/audio while still keeping the total bit rate below that required by ADPCM.
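The waveform-coding bit rates quoted above can be reproduced with simple arithmetic. The following is a minimal sketch only; the function name and the sampling parameters (8 KHz sampling with 4-bit ADPCM for speech, 44.1 KHz sampling with 16-bit samples for one CD channel) are assumptions chosen for illustration and are not specified in this document.

```python
def waveform_bit_rate_kbps(sample_rate_hz, bits_per_sample, channels=1):
    """Bit rate of a fixed-rate waveform coder, in Kbps (1 Kbps = 1000 bit/s)."""
    return sample_rate_hz * bits_per_sample * channels / 1000


# Assumed parameters (illustrative only, not taken from the text above).
speech_pcm_kbps = waveform_bit_rate_kbps(8000, 8)     # 8-bit PCM speech
speech_adpcm_kbps = waveform_bit_rate_kbps(8000, 4)   # 4-bit ADPCM speech
cd_mono_kbps = waveform_bit_rate_kbps(44100, 16)      # one CD-quality channel

if __name__ == "__main__":
    print(speech_pcm_kbps, speech_adpcm_kbps, cd_mono_kbps)
```

Under these assumed parameters the ADPCM speech figure comes to 32 Kbps and a single CD-quality channel to roughly 700 Kbps, consistent with the range given above.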
With the advent of desktop computer video conferencing, there is now a need to provide digital audio bridging that supports a variety of audio capabilities, ranging from POTS ("plain old telephone system") audio (3.2 KHz) over standard modem lines (up to 28.8 Kbps) to CD/audio (20 KHz) over ISDN (Integrated Services Digital Network) lines with a 64 to 128 Kbps data rate. Typical audio bridge circuits today are analog only and cannot support modems or other digital data transmission. Existing digital bridges are very complex and can typically handle only POTS speech bandwidths (some can handle AM radio bandwidths at 7 KHz). None of these bridges can handle several audio bandwidths simultaneously, matching the different bandwidths and quality levels available to different users (their "classes of service"). As a result, all participants must use the "lowest common denominator", that is, the lowest shared class of service, regardless of the capability available to the individual users.
Existing digital bridges that use vocoders to digitize speech suffer from several costly problems. Since vocoders model individual vocal tracts, multiple voice signals entering an audio bridge must first be decoded back to PCM samples. A composite signal is then formed in the PCM domain and used to create the joint conversation fed back to each of the participants (minus their own signal). Each unique joint conversation must then be re-encoded before transmission back to each participant. This requires a separate decompression and compression (codec) unit for each participant, so the first cost is a costly equipment implementation. Furthermore, since each codec can model only a single vocal tract, the quality of the re-compression will be reduced, in some cases substantially, if there is background noise or more than one speaker. Thus the second cost is in quality loss.
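The PCM-domain mixing step described above, in which each participant receives the joint conversation minus their own signal, can be sketched as follows. This is an illustrative sketch only: the function and variable names are hypothetical, the vocoder decode and re-encode stages are omitted, and 16-bit signed PCM samples with equal-length frames are assumed.

```python
PCM_MIN, PCM_MAX = -32768, 32767  # assumed 16-bit signed PCM range


def mix_minus(decoded_frames):
    """Build one outgoing frame per participant from decoded PCM frames.

    decoded_frames maps a participant id to a list of PCM samples; all
    lists are assumed to have the same length. Each participant receives
    the sum of every other participant's samples, clipped to 16 bits.
    """
    if not decoded_frames:
        return {}
    frame_len = len(next(iter(decoded_frames.values())))
    # Composite signal: sample-by-sample sum over all participants.
    total = [sum(frame[i] for frame in decoded_frames.values())
             for i in range(frame_len)]
    outgoing = {}
    for participant, own in decoded_frames.items():
        # Subtract the participant's own signal from the composite.
        outgoing[participant] = [
            max(PCM_MIN, min(PCM_MAX, t - s)) for t, s in zip(total, own)
        ]
    return outgoing


if __name__ == "__main__":
    frames = {"a": [100, 200], "b": [10, 20], "c": [1, 2]}
    print(mix_minus(frames))
```

Because every participant receives a different mix, a conventional vocoder bridge would have to re-encode each of these outgoing frames separately, which is the per-participant codec cost noted above.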
It is therefore desirable to have a method and apparatus for establishing a combination of scalable audio compression algorithms, defining communication protocols, and selecting compression so as to implement a low cost digital audio bridge that permits each user to maintain their highest class of service.