1. Field of the Invention
The present invention relates generally to the field of conferencing, and more particularly to a method and apparatus for wideband conferencing.
2. Description of the Background Art
Speakerphones and telephones are telecommunications devices used for a variety of purposes, typically for telephonic communications between two or more endpoints and two or more parties. Remote teleconferencing serves a valuable purpose, and often enables increased productivity between, or within, organizations without raising associated costs incurred due to travel and time constraints.
Telecommunications devices (e.g., speakerphones, etc.) of the prior art are typically used to transmit analog, voice-based communications at frequencies below 3.3 kilohertz (kHz), and typically work over two different types of telecommunications networks. The first type of telecommunications network comprises the Plain Old Telephone System (POTS) and the Public Switched Telephone Network (PSTN). The second type of network relies upon network information technologies such as the Internet, a Local Area Network (LAN), a Wide Area Network (WAN) or a Virtual Private Network (VPN) to transmit voice signals as data packets. However, each of these types of networks suffers from limitations unique to that type of network.
Over a PSTN/POTS network, a telecommunications device routes calls between specialized computers known as switches. A call signal is sent through a Private Branch Exchange (PBX), which addresses and connects calls to a destination PBX and, ultimately, to a receiving telecommunications device.
Referring to FIG. 1, a POTS/PSTN conferencing system 100 is shown. An initiating telecommunications device 102 sends a calling signal to a PBX 104, which in turn routes the call to a switch 106. The switch 106, subsequently, routes the call to a receiving switch 108 over a PSTN 110. The call is then routed from the receiving switch 108 through a destination PBX 112 to a receiving telecommunications device 114. In analog mode, the telecommunications devices 102 and 114 may train and synchronize, adjusting for line conditions such as amplitude response, delay distortions, timing recovery, and echo characteristics. However, conventional phones and speakerphones do very little training and synchronizing, and the amount of training and synchronization may be as little as none, which still yields a working link (99.9% of phones do it this way) to other telecommunications devices. It should be noted that the PBX is optional. Without a PBX, the telecommunications device may be connected directly to the switch 106. This is the typical connection from a user's home.
A primary source of quality degradation with telecommunications devices occurs as a result of network infrastructure characteristics such as frequency handling and available bandwidth. The conferencing system 100 operating over a PSTN/POTS network is typically limited by both available frequency and a narrow range of bandwidth. These network characteristics limit the type, form, and amount of data shared between telecommunications devices 102 and 114. Conventional narrow bandwidth systems further limit audio quality, including audio bandwidth, audio noise level, and audio path gain.
Conventional telecommunications devices are typically designed to filter out frequencies above 3.3 kHz. However, the filtering of frequencies between 3.3 kHz and 7 kHz significantly reduces sound quality, clarity, and distinction. The fact that conventional telecommunications devices and networks filter frequencies above 3.3 kHz limits intelligibility of speech and other sounds, because much of the content that the ear depends on is carried in these higher frequencies. This results in connections that sound hollow, muddled, muffled, or distorted. The use of analog pathways also introduces significant variations in gain, so that one connection may be much quieter than another and therefore difficult to hear. The use of analog pathways also often introduces significant noise, resulting in connections that are difficult to understand. The combination of all these degradations, which is common on conventional phone lines, results in poor and variable call quality. These problems are exacerbated when participants in a conference are in sub-optimal environments, such as reverberant rooms, when there are multiple participants who may interrupt one another, or when participants do not all share a common native language or dialect, resulting in accented speech that can be difficult to understand in the best of conditions, and impossible over a phone connection. Another problem often associated with conventional telecommunications devices is the transmission of background noise and static over a PSTN call. Thus, current telecommunications devices often provide poor Quality of Service (QoS) and lack enhanced features and functionality.
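The effect of the 3.3 kHz cutoff described above can be illustrated with a minimal sketch. It assumes a hypothetical test signal made of a few sinusoidal components and an idealized brick-wall low-pass filter; the component frequencies and amplitudes are made-up illustrative values, not measurements of speech, and retained power is not a measure of intelligibility.

```python
# Illustrative sketch only: COMPONENTS is a hypothetical test signal,
# not measured speech data, and the filter is an idealized brick wall.
NARROWBAND_CUTOFF_HZ = 3300  # cutoff named in the text
WIDEBAND_CUTOFF_HZ = 7000    # upper edge of the band the text says is lost

# (frequency in Hz, amplitude) pairs, chosen arbitrarily for illustration
COMPONENTS = [(300, 1.0), (1000, 0.8), (2500, 0.5), (4000, 0.4), (6000, 0.3)]


def retained_components(components, cutoff_hz):
    """Ideal low-pass filter: keep only components at or below the cutoff."""
    return [(f, a) for f, a in components if f <= cutoff_hz]


def power(components):
    """Average power of a sum of sinusoids at distinct frequencies."""
    return sum(a * a / 2 for _, a in components)


def retained_fraction(components, cutoff_hz):
    """Fraction of the signal's power that survives the ideal filter."""
    return power(retained_components(components, cutoff_hz)) / power(components)


if __name__ == "__main__":
    for cutoff in (NARROWBAND_CUTOFF_HZ, WIDEBAND_CUTOFF_HZ):
        kept = [f for f, _ in retained_components(COMPONENTS, cutoff)]
        print(cutoff, "Hz cutoff keeps components at", kept, "Hz")
```

In this sketch the 4000 Hz and 6000 Hz components are discarded by the narrowband cutoff but survive a 7 kHz cutoff, which corresponds to the band between 3.3 kHz and 7 kHz discussed above.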
Alternatively, the second type of telecommunications network may also rely upon PSTN, but transmits signals over digital networks using packet switching. This treatment results in a technique known as Voice over IP (VoIP) or simply IP. VoIP devices forward sound and other data as packets of information over digital networks using standards including ITU H.323, MGCP, SIP, etc. Using an embedded modem and codec, a VoIP device encodes voice (and/or other sounds) as data packets that are switched between network-addressed servers. The network-addressed servers process, reassemble, and convert digital signals to analog signals at a receiving VoIP device.
Referring to FIG. 2, an exemplary IP conferencing system 200 of the prior art is shown. An IP device 202 encodes sound from analog signals into digital data packets using a codec 204. Using a modem 206 or similar device, the system 200 switches data packets through a home (or other type of) network 208 such as a LAN, WAN, etc. One example of a communications device is an RS-232-C modem. The system 200 then transmits the data packets through a firewall 210, if one is present, to an access server 212. The access server 212 enables the data packets to be switched onto an Internet backbone 214.
Next, the system 200 forwards the data packets through a destination access server 216 and a destination firewall 218, if one is present, to a receiving home (or other type of) network 220. The data packets are re-modulated via a modem 222 or similar device, and decoded into analog voice-based signals using a codec 224. The modem 222 and codec 224 are embedded in a receiving VoIP device 226, in one example.
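The packet path of FIG. 2 can be sketched in simplified form as follows. This is a minimal illustration, assuming hypothetical names (`Packet`, `packetize`, `reassemble`) and a simple sequence-number scheme; it does not implement H.323, MGCP, SIP, or any real transport protocol, and it ignores packet loss.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """Hypothetical data packet carrying one chunk of encoded audio."""
    seq: int        # sequence number assigned by the sending device
    payload: bytes  # chunk of codec output


def packetize(encoded: bytes, chunk_size: int) -> list[Packet]:
    """Sender side: split codec output into sequence-numbered packets."""
    return [
        Packet(seq=i, payload=encoded[offset:offset + chunk_size])
        for i, offset in enumerate(range(0, len(encoded), chunk_size))
    ]


def reassemble(packets) -> bytes:
    """Receiver side: restore sending order, then join the payloads.

    All packets must be present before the audio can be rebuilt, which is
    one reason re-assembly adds delay on a packet-switched network.
    """
    ordered = sorted(packets, key=lambda p: p.seq)
    return b"".join(p.payload for p in ordered)


if __name__ == "__main__":
    audio = bytes(range(50))          # stand-in for encoded audio
    sent = packetize(audio, 16)
    received = list(reversed(sent))   # simulate out-of-order arrival
    print(reassemble(received) == audio)
```

The sequence number is what allows the receiving device to restore the original order when packets take different routes through the network.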
Although the system 200 enables VoIP, there are significant problems associated with this type of conferencing over an IP network. One problem of conventional PSTN and IP-capable telecommunications devices is an inability to conduct effective simultaneous sound (e.g., voice, etc.) and data conferencing due to bandwidth limitations and lower frequency ranges. A further significant problem with conferencing over a VoIP network is that expanded services such as wideband audio or side-channel data cannot be shared with non-VoIP devices, which constitute the great majority of endpoints in the world.
Another limitation of VoIP telecommunications devices results from time delays incurred from data packet re-assembly. The delay in re-assembly results in broken and unnatural speech, greatly reducing the quality of the conference call. Still another limitation with IP conferencing is data vulnerability to external breaches of security. Although data encryption can be implemented using means such as public and private session keys, bandwidth restrictions impose a significant burden upon telecommunications devices and substantially affect QoS.
Multimedia conferencing represents a substantial improvement over voice conferencing. However, current telecommunications devices are incapable of overcoming existing network limitations. Furthermore, current telecommunications devices cannot effectively bridge or manage multiple calls. The limited frequency range of 3.3 kHz prevents multiple call functions, including bridging or data exchange, from being performed simultaneously. Current telecommunications devices can bridge and manage multiple calls, but their usability is degraded because the available bandwidth is much less than the bandwidth of human speech. Further, this issue becomes more critical as more people join the conference: for example, understandability declines, sources of noise multiply, and identifying the talker becomes more difficult. As a separate but significant issue in modern teleconferencing, there is very limited ability to communicate side-channel data (such as sending dial-additional-call commands, requesting cost-of-conference-so-far status information, and so forth) between the participants or to a bridging device. The most common technique is to use DTMF tones, which is slow and disruptive.
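The cost of DTMF signalling mentioned above can be estimated with a short sketch. The row and column frequencies are the standard DTMF keypad values; the tone length, inter-digit gap, sample rate, and function names are assumptions chosen for illustration and are not taken from this document.

```python
import math

# Standard DTMF keypad frequencies (Hz): one row tone plus one column tone.
ROW_HZ = (697, 770, 852, 941)
COL_HZ = (1209, 1336, 1477, 1633)
KEYPAD = ("123A", "456B", "789C", "*0#D")


def dtmf_frequencies(key):
    """Return the (row, column) frequency pair for a single keypad symbol."""
    if len(key) == 1:
        for r, row in enumerate(KEYPAD):
            c = row.find(key)
            if c != -1:
                return ROW_HZ[r], COL_HZ[c]
    raise ValueError(f"not a DTMF key: {key!r}")


def dtmf_samples(key, duration_s=0.1, sample_rate_hz=8000):
    """Synthesize one tone burst as the sum of two sinusoids.

    duration_s and sample_rate_hz are assumed values for illustration.
    """
    low, high = dtmf_frequencies(key)
    count = round(duration_s * sample_rate_hz)
    return [
        0.5 * (math.sin(2 * math.pi * low * n / sample_rate_hz)
               + math.sin(2 * math.pi * high * n / sample_rate_hz))
        for n in range(count)
    ]


def signalling_time_s(command, tone_s=0.1, gap_s=0.1):
    """Time to send a command, assuming a fixed tone and gap per digit."""
    return len(command) * (tone_s + gap_s)


if __name__ == "__main__":
    print(dtmf_frequencies("5"))
    print(signalling_time_s("1234567890"), "seconds for ten digits")
```

Every DTMF frequency lies below 3.3 kHz, so the tones occupy the same band as the voice signal and are audible to participants, and each symbol conveys only one of sixteen values per tone-and-gap interval.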
Yet another limitation of prior art narrowband conferencing systems is their ineffectiveness in handling multiple simultaneous speakers (or other sources of sound). A further problematic limitation of prior art narrowband speakerphones is call degradation due to the exclusionary filtering of signals above 3.3 kHz. Thus, conventional telecommunications devices have severe limitations related to QoS, data security, bridging, and advanced communications functions. Therefore, there is a need for a new and innovative method and apparatus for wideband audio conferencing using existing infrastructures to deliver enhanced services.