A telephone station set is typically connected to a telephone network via two wires. When a call is placed from the telephone station set, the network typically converts the two wires to a four-wire connection through the telephone network and then back to two wires extending to a called telephone station. The conversion from two wires to four wires and vice versa is usually achieved using a so-called hybrid, which is typically a source of echo signals. To deal with echo signals, the four-wire network connection may include a signal processing function to effectively "cancel out" such echo signals. A telephone network may include other signal processing functions, such as the detection of Dual Tone MultiFrequency (DTMF) signals entered by a user.
Presently, a telephone network operates in a so-called Synchronous Transfer Mode (STM) to transport speech signals over a network connection in digital form. That is, the telephone network samples an analog voice signal that it receives at an input at an 8 kHz rate and transports the resulting 64 kbps signal synchronously over an associated network connection to an output. At the output, the digital signal is converted to an analog signal. Since an STM network operates in a synchronous mode, signal processing functions may be performed synchronously. For example, consider the signal processing function of echo cancellation. As mentioned above, a so-called hybrid is usually the source of an echo, as illustrated in FIG. 1. Assume that speech signals are transported from left to right via two-wire path 3 into hybrid 5. As a result of an impedance mismatch, hybrid 5 reflects a portion of the speech energy over two-wire path 4, thereby causing an echo of the speech signal to be transported in the opposite direction toward the talking party. However, echo canceler (EC) 2 estimates the waveform of the reflected signal carried over path 4 and presents the result to an EC 2 subtracter (sub), which then subtracts the estimated waveform from the actual waveform of the signal carried over path 4. If the estimate is close to the actual waveform, then the echo component of the signal is effectively canceled at the subtracter (sub) of EC 2.
This signal processing function is made possible in an STM network as a result of being able to present the echo signal to the EC 2 subtracter via path 4 at some time after the true speech signal has arrived at EC 2 via path 3. The time difference between the two signals is referred to herein as echo return path delay. An example of such a delay is shown in FIG. 2, in which digital signal 6 represents a sample of speech that is transported over path 3 and is formed from a number of bits, e.g., eight bits. Digital signal 7, on the other hand, represents an echo of signal 6 that has been reflected in the opposite direction over path 4. As shown, a delay of Δt exists between the two signals. In addition, the delay is essentially constant over successive pairs of similar signals as a result of the network operating in a synchronous mode. This constant delay, or more accurately the time-invariant characteristic of the echo path observed by the echo canceler, allows the ready implementation of the cancellation function.
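The estimate-and-subtract operation described above can be illustrated with a minimal sketch. It assumes a normalized LMS adaptive filter as the estimator, which is one common choice and is not stated in the text. The echo path is modeled, also as an assumption, as a single attenuated copy of the far-end speech at a constant delay. All names and parameter values are hypothetical.

```python
import random


def simulate_echo_cancellation(num_samples=4000, delay=5, echo_gain=0.5,
                               taps=8, mu=0.5, seed=0):
    """Adapt an FIR estimate of a fixed-delay echo path and subtract it.

    Assumed model: the echo is echo_gain * far_end[n - delay], i.e. the
    echo return path delay is constant, as in a synchronous network.
    Returns the final filter weights and the residual after subtraction.
    """
    rng = random.Random(seed)
    far_end = [rng.uniform(-1.0, 1.0) for _ in range(num_samples)]
    weights = [0.0] * taps
    history = [0.0] * taps  # history[k] holds far_end[n - k]
    residuals = []
    for n in range(num_samples):
        history = [far_end[n]] + history[:-1]
        # Signal returning on the echo path (no near-end speech in this sketch).
        echo = echo_gain * far_end[n - delay] if n >= delay else 0.0
        # Estimated echo waveform produced by the canceler.
        estimate = sum(w * x for w, x in zip(weights, history))
        # Subtracter output: actual waveform minus estimated waveform.
        error = echo - estimate
        # Normalized LMS weight update.
        energy = sum(x * x for x in history) + 1e-9
        weights = [w + mu * error * x / energy
                   for w, x in zip(weights, history)]
        residuals.append(error)
    return weights, residuals


weights, residuals = simulate_echo_cancellation()
print("weight at the echo delay:", round(weights[5], 4))
print("largest residual over the last 100 samples:",
      max(abs(r) for r in residuals[-100:]))
```

Because the delay does not change from sample to sample, the weight at the tap matching the delay settles near the echo gain and the residual shrinks toward zero. If the delay varied between samples, the filter would have to keep readapting.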
The relevant technology has recently seen the introduction of what is commonly referred to as an Asynchronous Transfer Mode (ATM) network, which is formed from a plurality of ATM switches and other ATM equipment. An ATM network may operate at, for example, a 2.4 GHz clock rate, to transport information in the form of a cell comprising 53 octets of 8 bits each. Five of the 53 octets form a cell header including a logical channel identifier. The remaining 48 octets form the cell payload. The relevant technology is now turning toward interfacing an STM network with an ATM network. Accordingly, an interface between an STM and an ATM network at one end of a connection would need to pack a segment of the voice signal comprising 48 octets generated in the STM network into an ATM cell for presentation to the ATM network. Such cells are generated periodically, in which the period is 6 milliseconds and is approximately equal to the amount of time needed to collect from the STM network 48 octets of a voice signal sampled at 125 microsecond intervals. The ATM network then transports the cell to the opposite end of the connection, which may also interface with an STM switch. At that point, the interface performs a complementary function.
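The packing arithmetic above can be sketched as follows. The five-octet header used here is a hypothetical placeholder that carries only a channel identifier; it is not the actual ATM header layout. The function names are likewise assumptions made for illustration.

```python
HEADER_OCTETS = 5
PAYLOAD_OCTETS = 48
CELL_OCTETS = HEADER_OCTETS + PAYLOAD_OCTETS  # 53 octets per cell
SAMPLE_PERIOD_US = 125  # one octet per sample at 8 kHz


def cell_fill_time_us():
    """Time needed to collect one payload of voice octets."""
    return PAYLOAD_OCTETS * SAMPLE_PERIOD_US


def pack_cells(octets, channel_id):
    """Pack whole 48-octet segments into 53-octet cells.

    Hypothetical header: channel_id as a 5-octet big-endian integer.
    Trailing octets that do not fill a payload are left unpacked.
    """
    header = channel_id.to_bytes(HEADER_OCTETS, "big")
    full = len(octets) - len(octets) % PAYLOAD_OCTETS
    return [header + octets[start:start + PAYLOAD_OCTETS]
            for start in range(0, full, PAYLOAD_OCTETS)]


def unpack_cells(cells):
    """Complementary function: strip headers and rejoin the payloads."""
    return b"".join(cell[HEADER_OCTETS:] for cell in cells)


voice = bytes(range(96))  # 96 octets, i.e. 12 ms of voice
cells = pack_cells(voice, channel_id=7)
print(cell_fill_time_us(), len(cells), len(cells[0]))
```

With these constants the fill time is 48 × 125 = 6000 microseconds, which matches the 6 millisecond cell period given in the text.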
Although successive cells corresponding to a voice connection enter the ATM network periodically every 6 milliseconds, various ATM operations, such as cell switching, perturb the temporal location of the voice cells. As a result, when successive cells of the voice connection are observed at a particular point within the ATM network, e.g., where the cells exit from the ATM network, then it may be seen that the interval between cells may not be exactly 6 milliseconds. This departure from strict periodicity is referred to herein as cell jitter.
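One way to quantify cell jitter as defined above is to compare each observed inter-cell interval with the nominal 6 millisecond period. The sketch below assumes arrival timestamps in microseconds taken at a single observation point; the function name and the sample timestamps are hypothetical.

```python
CELL_PERIOD_US = 6000  # nominal interval between voice cells


def interval_deviations_us(arrival_times_us, period_us=CELL_PERIOD_US):
    """Return how far each inter-cell interval departs from the period.

    A positive value means the cell arrived later than strict
    periodicity would predict; a negative value means earlier.
    """
    return [(later - earlier) - period_us
            for earlier, later in zip(arrival_times_us, arrival_times_us[1:])]


# Hypothetical observations at the exit of the ATM network.
arrivals = [0, 6000, 12500, 17800]
print(interval_deviations_us(arrivals))
```

For these sample timestamps the intervals are 6000, 6500 and 5300 microseconds, so the deviations are 0, +500 and -700 microseconds.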
We have recognized that in certain instances a voice connection through STM and ATM networks may not include conventional signal processing functions, such as, for example, DTMF detection, echo cancellation, signal enhancement, etc. The absence of such a signal processing function may adversely affect the quality of the service. This problem may be dealt with by implementing the signal processing function in the ATM network, that is, by converting a received cell whose contents represent speech, or other signals in the speech band, into an STM format, performing the signal processing function and then converting the result to the ATM format. This would entail "depacketizing" the ATM cell by presenting each octet forming the payload sequence to the signal processing function at the STM sampling rate of 8 kHz, i.e., one octet every 125 microseconds. In addition, the incoming data cells would need to be stored in a "smoothing" buffer to deal with cell jitter, which may be as long as two milliseconds, and to guarantee that an octet can be presented to the STM signal processing function every 125 microseconds. After the signal processing function is performed, the octets must then be "repacketized" into ATM cells, as mentioned above. It can be appreciated that it would take six milliseconds to accumulate 48 octets to form the payload of an ATM cell. Thus, as a result of such jitter and repacketizing, eight milliseconds of delay could be introduced in the delivery of an ATM cell to its intended destination.
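The delay figure above follows from simple arithmetic, sketched below. The model is an assumption made for illustration: each cell's lateness is measured against its nominal 6 millisecond slot, and the smoothing buffer holds playout back by a fixed amount. The function names and sample values are hypothetical.

```python
SAMPLE_PERIOD_US = 125
PAYLOAD_OCTETS = 48
CELL_PERIOD_US = PAYLOAD_OCTETS * SAMPLE_PERIOD_US  # 6000
MAX_JITTER_US = 2000  # worst-case jitter figure taken from the text


def late_cells(arrival_offsets_us, smoothing_delay_us):
    """Count cells that miss their playout deadline.

    arrival_offsets_us[k] is how late cell k arrives relative to its
    nominal slot. A cell is late if its offset exceeds the smoothing
    delay, in which case no octet would be ready for the STM function.
    """
    return sum(1 for offset in arrival_offsets_us
               if offset > smoothing_delay_us)


def added_delay_us(smoothing_delay_us=MAX_JITTER_US):
    """Smoothing delay plus the time to refill one 48-octet payload."""
    return smoothing_delay_us + CELL_PERIOD_US


offsets = [0, 1500, 2000, 300]  # hypothetical per-cell lateness
print(late_cells(offsets, MAX_JITTER_US), late_cells(offsets, 1000))
print(added_delay_us())
```

Sizing the smoothing delay to the worst-case jitter avoids late cells in this model, and adding the six millisecond repacketizing time gives the eight millisecond total.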