1. Field of the Invention
The present invention relates generally to the field of telecommunications and, more specifically, to a method and apparatus for providing unified conferencing services in an expandable telecommunications switching system.
2. Background Information
A competitive telecommunications system must be capable of providing a wide variety of telecommunications services. For example, subscribers may request services such as voice processing services, call waiting, caller identification and call forwarding. In the commercial context, one of the most desirable services is that of conferencing. Conferencing refers to the ability of three or more callers, each using a separate telephone set and often located at remote locations from each other, to participate in a single telephone call simultaneously. In addition, there is an ever-expanding need for conferencing services that can accommodate large conferences of, for example, ten to seventy or more participants. Moreover, the participants are often physically distributed worldwide. This means that the pulse code modulation (PCM) format pursuant to which the voice signals are encoded may differ between callers. In other words, some of the calls might be in the well known μ-law PCM format commonly used in the United States, and other calls participating in the conference may be encoded in A-law format, used typically in Europe.
A further service that is often desirable in conferencing is that of a broadcast output. There are many commercial applications in which a one-way (half-duplex) operation is desired. For example, there may be a monitoring operator, such as a supervisor, listening in on a subordinate's telephone call with a customer. Alternatively, educational courses can be provided over the telephone and participants simply listen and do not have the opportunity to speak during this type of conference. The broadcast output supplies this type of one-way connection.
Originally, separate conferencing systems had interfaced with a conventional computer-controlled digital switching matrix within a Private Branch Exchange (PBX) switch or a public switching system that provided a circuit switching function. More recently, it has been known to provide conferencing within a high-speed digital communications network that includes a plurality of switching nodes with each node including its own nodal switch. This type of system is described in commonly-owned U.S. Pat. No. 5,920,546 (Hebert et al.) for a METHOD AND APPARATUS FOR CONFERENCING IN AN EXPANDABLE TELECOMMUNICATIONS SYSTEM, which is presently incorporated herein by reference in its entirety.
In accordance with that system, at least one node in the system (e.g., a conferencing node) contains a digital signal processing ("DSP") circuit capable of performing a conferencing operation on the voice information of the conferees. More specifically, the DSP circuitry executes a conferencing function on the voice information by operating on it using, for example, a conferencing algorithm that typically includes summing together the channels of voice information from each conferee. As is typical in the industry, after summing all of the voice data, the conference processor subtracts each conferee's data from the summed total intended for that conferee. This is done in order to minimize echo effects and improve system stability. The DSP circuit executes this conferencing function on the voice information and then outputs a different instance of conferenced voice information for each conferee. Each instance of conferenced voice information is then transmitted to the corresponding conferee.
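The sum-then-subtract operation described above can be illustrated with a short sketch. It is not taken from the referenced patent; it assumes the voice data has already been expanded from PCM into linear integer samples, and the function name mix_minus is hypothetical.

```python
def mix_minus(samples):
    """Return one output per conferee for a single sample instant.

    samples: list of linear (already expanded) samples, one per conferee.
    Each output is the sum of all inputs minus that conferee's own input,
    so no conferee hears an echo of his or her own voice.
    """
    total = sum(samples)
    return [total - own for own in samples]


if __name__ == "__main__":
    # Three conferees contributing one sample each (illustrative values).
    print(mix_minus([100, -40, 25]))
```

Computing the total once and subtracting per conferee keeps the work proportional to the number of conferees rather than to its square.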
More specifically, the DSP circuitry first places the instances of conferenced voice information on an internal bus located in the conferencing node. A data transmitter in the nodal switch that is preferably linked with the bus then receives the instances of conferenced voice information. Next, the conferencing node may formulate a packet or packets containing the instances of conferenced voice information for transmission via its data transmitter over the network. Specifically, each instance of conferenced voice data may be packetized, addressed and transmitted according to instructions from the system to the programmable switching node interfaced with the corresponding conference participants. Each programmable switching node, upon receipt of the packet or packets, then captures the instance of conferenced voice information earmarked for that participant via its own data receiver and switches the information to the participant.
This process may be repeated on a high-speed basis. However, there is a limit to the number of conferees that can be included in the conference. It can be difficult to form a very large conference with typically-employed conferencing algorithm mathematics because the noise from every channel is summed, so that the accumulated noise increases with each additional participant. In addition, conference participants may tend to speak louder and louder to overcome this noise, and the limit of loudness tolerated by the audio components of the system may be reached, rendering the conference output signal incomprehensible.
For example, the above-mentioned '546 patent generally handles conferences of up to about seven conferees. Despite its utility in forming a conference that has a high quality voice signal output, the conferencing technique described above does not fully allow for the further capacity to handle larger conferences. In addition, there is a further need for a conferencing system that can accommodate larger conferences and for such a system that is capable of operating on input signals from participants whose PCM-encoded data is in different formats. This need is particularly great with large conferences because it is even more likely that a conference of thirty or more will include participants from various parts of the world and thus, will involve PCM-encoded data which is in different formats. There also remains a need for a system for large conferences that includes the capability of providing a broadcast output.
It is therefore an object of the present invention to provide a method and apparatus for providing conferencing services for a large number of participants (in full duplex or half-duplex operations) that produces a high quality output signal, while accommodating participants having voice information in different PCM-encoding formats. It is a further object of the present invention to provide a conferencing system that is compatible with a high-speed, expandable telecommunications system. It is a further object of the invention to provide a conferencing system that can implement multiple distinct conferences simultaneously.
Briefly, the invention comprises a method and apparatus for providing conferencing services in a telecommunications system. A preferred embodiment of the invention operates within a high-speed telecommunications system comprised of multiple switching nodes connected by an inter-nodal network. At least one node in the system contains digital signal processing (DSP) circuitry capable of performing conferencing functions on the voice information from conferees connected to the system. Typically, many nodes will contain multiple DSP integrated circuits ("chips") in a DSP module. A DSP chip contains a microprocessor as well as memory storage devices. The microprocessor is programmed in accordance with the invention to provide the conferencing services. When a conference is being established, available conferencing resources are identified within a DSP chip in a node in the system, which is then the conferencing node. Pursuant to instructions from the system host, the voice information at each node interfaced with a conferee is addressed and transmitted to the conferencing node. The details of the routing of this information to the conferencing node are set forth in the previously incorporated, commonly-owned U.S. Pat. No. 5,920,546. Several conferences can be maintained at a time and conferences are dynamically set up and torn down by the system. Additionally, conferees can be dynamically inserted and removed from each conference.
A data receiver at the conferencing node captures the voice information from the incoming PCM time slots. The captured voice information is stored in an array within a high-speed memory storage device of the DSP chip for further computations.
A record is kept of the channels participating in each of the conferences currently being maintained by each DSP chip. In accordance with the invention, the CPU of the DSP chip is programmed to search all channels in the array for a particular conference for the loudest channels, i.e., those with the greatest energy. Only those channels that contain the greatest energy within a certain time period are used by the DSP chip to form the conference.
A determination is previously made about the number of loudest channels to be selected. This number of channels is referred to herein as N. In a preferred embodiment, N is set equal to 3. Specifically, the three loudest channels are selected and the voice information from those particular three channels is summed to form an instance of the conferenced information.
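The selection of the N loudest channels can be sketched as follows. This is an illustrative sketch only: the function name select_loudest, the dictionary keyed by channel identifier, and the sample energy values are all assumptions made for the example.

```python
import heapq


def select_loudest(energies, n=3):
    """Return the identifiers of the n channels with the greatest energy.

    energies: dict mapping a channel identifier to its energy estimate
              (hypothetical representation chosen for this sketch).
    The result is ordered from loudest to softest.
    """
    return heapq.nlargest(n, energies, key=energies.get)


if __name__ == "__main__":
    energies = {"ch0": 5, "ch1": 50, "ch2": 1, "ch3": 20, "ch4": 30}
    print(select_loudest(energies))  # N = 3, as in the preferred embodiment
```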
In addition, a determination is previously made about the time duration over which to estimate the energy in all channels. This time duration is herein referred to as T_energy. In a preferred embodiment, T_energy is set equal to 15 milliseconds.
A determination is previously made about the time interval over which to survey the individual channels to find those that contain the greatest energy. This time interval is herein referred to as T_survey. T_survey is greater than or equal to T_energy. As will be understood by those skilled in the art, a larger T_survey reduces the computational load of the DSP chip per unit time, but can decrease the perceived quality of the conference. In a preferred embodiment, T_survey is approximately equal to T_energy.
To determine those channels that are loudest, the CPU of the DSP chip is programmed to sub-sample the captured voice information, and to estimate the energy in each channel using the sub-sampled data. As will be understood by those skilled in the art, sub-sampling the data reduces the number of computations that must be performed by the CPU of the DSP chip over time, and thereby permits the DSP chip to service more channels at a time. In a preferred embodiment, the data for each channel is sub-sampled by a factor of four, resulting in a sub-sample period of 0.5 milliseconds for each channel, given the standard PCM sample period of 125 microseconds. The energy of each channel is estimated using this sub-sampled data. In a preferred embodiment, the energy of each sub-sampled channel is computed via a mathematical summation over time utilizing a sum-of-squares equation. The computed energies are stored in an associated set of arrays within the high-speed memory storage device of the DSP chip.
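The sub-sampled sum-of-squares estimate can be sketched as shown below. The sketch assumes linear samples at the standard 8 kHz telephony rate (an assumption, since the sampling rate is not stated here), so that a T_energy window of 15 milliseconds holds 120 samples, of which 30 remain after sub-sampling by four. The function name estimate_energy is hypothetical.

```python
def estimate_energy(samples, factor=4):
    """Estimate channel energy as a sum of squares over sub-sampled data.

    samples: linear samples for one channel over one T_energy window.
    factor:  sub-sampling factor; only every factor-th sample is used,
             which cuts the number of multiplications by that factor.
    """
    return sum(s * s for s in samples[::factor])


if __name__ == "__main__":
    SAMPLE_RATE_HZ = 8000          # assumed standard PCM telephony rate
    T_ENERGY_S = 0.015             # 15 ms window from the preferred embodiment
    window_len = int(SAMPLE_RATE_HZ * T_ENERGY_S)
    window = [10] * window_len     # constant-amplitude test signal
    print(window_len, len(window[::4]), estimate_energy(window))
```

Because every channel is sub-sampled by the same factor, the estimates remain comparable with one another even though each is only a fraction of the full-rate energy.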
For a particular conference, the channels of samples representing the loudest N parties are summed together in real-time. The sum forms the basis for a broadcast output for that conference, and is stored in an array in the high-speed memory of the DSP chip for future computations. This process is repeated for each conference on that DSP chip.
The broadcast conference output is used as a half-duplex conference output as described above. Data is compressed into the required PCM formats after summation. In the preferred embodiment of the expandable telecommunications system, both mu-law and A-law outputs are preferably generated for all conferences at all times. Thus, half-duplex outputs are available in both mu-law and A-law formats. Additionally, the sum is transmitted in the correct encoding format, as PCM data, to the conference participants that are not currently the N loudest.
In the case of a conferee that is actually one of the N loudest, that conferee's own sample is subtracted from the total sum that was stored before compression, and the result is compressed and written out in the correct PCM format to that particular conferee.
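The formation of the broadcast output and the per-conferee outputs can be sketched as follows. The sketch works on linear samples for a single sample instant and deliberately omits the final compression into mu-law or A-law PCM; the function name conference_outputs and the dictionary representation are assumptions made for the example.

```python
def conference_outputs(samples, loudest):
    """Form the broadcast sum and each conferee's output for one instant.

    samples: dict mapping channel identifier to its linear sample.
    loudest: collection of channel identifiers currently in the N-loudest list.
    Returns (broadcast, outputs), where outputs maps every channel to the
    value that would then be compressed and sent to that conferee.
    """
    loudest = set(loudest)
    # The broadcast (half-duplex) output is the sum of the N loudest channels.
    broadcast = sum(samples[ch] for ch in loudest)
    outputs = {}
    for ch, own in samples.items():
        if ch in loudest:
            # A loudest conferee must not hear his or her own voice.
            outputs[ch] = broadcast - own
        else:
            # Everyone else receives the broadcast sum unchanged.
            outputs[ch] = broadcast
    return broadcast, outputs


if __name__ == "__main__":
    samples = {"a": 10, "b": -3, "c": 7, "d": 100, "e": 1}
    print(conference_outputs(samples, ["a", "c", "d"]))
```

Only N subtractions are needed per sample instant regardless of the conference size, since all conferees outside the N loudest share the same broadcast value.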
The list of the N loudest channels is updated at regular intervals (T_survey). Updating the list involves a search and replace function. If a new candidate is found to have been louder in the previous interval than one of the current N loudest, then the new candidate replaces that channel and becomes one of the N loudest.
In accordance with another aspect of the invention, a new energy estimate is checked against the softest of the N loudest energy estimates to avoid having to check against all of the N loudest during every interval. If the new energy estimate is greater than the softest of the then N loudest, then the new channel becomes one of the new N loudest.
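The comparison against the softest of the N loudest can be sketched as follows. The function name update_loudest and the dictionary representation are hypothetical, and the handling of a channel that is already in the list (its stored energy is simply refreshed) is an assumption not stated above.

```python
def update_loudest(loudest, channel, energy):
    """Offer a new energy estimate to the N-loudest list.

    loudest: dict mapping channel identifier to energy for the current
             N loudest channels; it is modified in place.
    Returns True if the candidate displaced the softest entry.
    """
    if channel in loudest:
        # Assumed behaviour: refresh the estimate of a current member.
        loudest[channel] = energy
        return False
    # Locate the softest member; with N = 3 this lookup is trivial, and an
    # implementation could cache it so each candidate needs one comparison.
    softest = min(loudest, key=loudest.get)
    if energy > loudest[softest]:
        del loudest[softest]
        loudest[channel] = energy
        return True
    return False


if __name__ == "__main__":
    loudest = {"a": 50, "b": 20, "c": 30}
    print(update_loudest(loudest, "d", 25), loudest)
    print(update_loudest(loudest, "e", 10), loudest)
```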
The energies of the channels are calculated at a selected rate. In addition, the N loudest sub-samples are updated at regular intervals.
Although the invention will be described herein with regard to an expandable telecommunication system, it should be understood that the invention may be used in connection with other conferencing systems that utilize DSP circuitry for signal processing.