Telephony switching architectures are the framework for establishing voice communications between users conducting a conversation over the telephone. Traditionally, telephone switching systems have been dedicated devices which serve only the purpose of establishing and releasing telephone connections. However, recent developments in telecommunications have merged telephone switching services with computer networks and services. The result, Computer-Telephony Integration (CTI), provides many advantages, such as the ability to program the telephone system components, and the ability of a called party or a system at a called number to respond to an incoming call based on the Caller Identification information provided for the call.
Current telephony switching architectures employ synchronous networks to connect geographically distributed telephony servers, nodes, systems, and parties. Conventional synchronous networks carry data in a very controlled manner: packets must be transmitted at specified, fixed time intervals, even if there is no data to be transmitted. This can result in inefficiency because some of the bandwidth may be consumed by empty packets being transmitted, and the transmission of empty packets limits the actual data transfer rate through the network. Thus, while conventional synchronous network protocols are useful for the synchronous data flow needs of voice and video communications, they do not make maximum use of the available bandwidth.
The Asynchronous Transfer Mode (ATM) protocol is a network protocol which can provide increased efficiency. The specifications for the ATM protocol are available from the ATM Forum, Mountain View, California, USA. ATM is an asynchronous, high-bandwidth, low-delay, packet-like switching and multiplexing technique. Generally, ATM transmits information in fixed-size, 53-octet cells, typically consisting of a 5-octet header field followed by a 48-octet payload field. However, some adaptation layers deviate from this. ATM allocates bytes (bandwidth) on demand to the services utilizing the ATM link, thus allowing it to be more efficient than synchronous network protocols. The ATM protocol provides multiple packet formats, which are referred to as "adaptation layers," and are used for transmitting different kinds of data.
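The typical cell layout described above can be illustrated with a minimal sketch. The function name `split_cell` and the constants are hypothetical and chosen only for illustration; the sketch assumes the common 5-octet header and 48-octet payload split and does not interpret any header field.

```python
# Illustrative sketch only: names are hypothetical, not taken from any ATM library.
CELL_OCTETS = 53                               # fixed ATM cell size
HEADER_OCTETS = 5                              # typical header size
PAYLOAD_OCTETS = CELL_OCTETS - HEADER_OCTETS   # 48 octets of payload


def split_cell(cell: bytes) -> tuple[bytes, bytes]:
    """Split one fixed-size cell into its (header, payload) parts."""
    if len(cell) != CELL_OCTETS:
        raise ValueError(f"expected {CELL_OCTETS} octets, got {len(cell)}")
    return cell[:HEADER_OCTETS], cell[HEADER_OCTETS:]


if __name__ == "__main__":
    header, payload = split_cell(bytes(CELL_OCTETS))
    print(len(header), len(payload))
```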
ATM Adaptation Layer AAL1 is a constant bit rate (CBR) protocol. The AAL1 standard provides for a 7 byte header, and a 46 byte packet data unit (PDU) payload (a "P format" cell), or a 6 byte header and a 47 byte PDU (a "non-P format" cell). "P format" cells alternate with "non-P format" cells. There is no integrity check on the data. This layer is designed for circuit emulation and it is assumed that the data is being sent to one or more devices which can "interpret" it (e.g., convert it into sound), and which do not request retransmission of the data. This standard is used for data that is connection-oriented and delay-intolerant.
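The alternation of the two AAL1 cell formats can be sketched as follows, using only the header and payload sizes stated above. The names `FORMATS` and `cell_layouts` are hypothetical, and the choice of which format comes first in the sequence is an assumption made for illustration.

```python
# Illustrative sketch only: sizes come from the text above; names are hypothetical.
from itertools import cycle, islice

CELL_OCTETS = 53

# format name -> (header octets, PDU payload octets)
FORMATS = {"P": (7, 46), "non-P": (6, 47)}


def cell_layouts(count: int, first: str = "P") -> list[tuple[str, int, int]]:
    """Return (format, header, payload) for `count` consecutive cells.

    Assumption: the sequence starts with `first` and then strictly alternates.
    """
    order = ["P", "non-P"] if first == "P" else ["non-P", "P"]
    return [(name, *FORMATS[name]) for name in islice(cycle(order), count)]


if __name__ == "__main__":
    for name, header, payload in cell_layouts(4):
        print(name, header, payload, header + payload)
```

In both formats the header and payload together occupy the full 53-octet cell; only the division between them changes.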
Voice signals and video signals are isochronous, meaning that the reproduced signal must be recreated at the same frequency at which the original signal was captured. The isochronous data from voice and video signals is therefore intolerant of variations in transmission delay. Failure to recreate the signal at the original frequency can cause the reproduced signal to be distorted or even to be incomprehensible to the person receiving the transmission. The AAL1 standard is thus particularly important to the telephony industry because it reduces the variability of the transmission delay. The CBR format is also the highest priority adaptation layer available to user data. The CBR format thus allows voice and other isochronous signals to be transmitted over an ATM link, and still be meaningfully reproduced at the receiving end. Under the CBR format, the PDU from the source is encapsulated and a time-tag is then added as a header or trailer to the PDU. The time-tag is used to maintain a constant timing relation between the source and destination. The PDU and the time-tag form a packet which is then segmented into ATM cells.
ATM Adaptation Layer AAL5 is an available bit rate (ABR) or variable bit rate (VBR) protocol. The AAL5 standard provides for a 5 byte header, and a 48 byte PDU. This layer is designed to transfer non-synchronous, delay-tolerant data, such as network management data, files, documents, applications, spreadsheets, records, etc. Unlike the CBR format, the ABR format contains no provision for maintaining a constant timing relation between the source and destination. However, this format is designed for maximum throughput efficiency, and is particularly well suited for transmitting many types of computer data. The ABR format adds a minimal header or trailer to the PDU before segmenting it into ATM cells.
Thus, input data may be categorized according to whether there is a need to maintain a constant bit rate between the source of the data and the destination of the data. The input data are encapsulated according to the appropriate adaptation layer for that type of data, and are segmented into payloads which are to be transmitted as ATM cells over an ATM link.
However, there are problems with applying the ATM technology in the area of telephony. The ATM specifications prohibit the transmission of empty cells. Therefore, if AAL1 is to be used to transmit voice, then the digitized voice signals must be buffered until the payload of the CBR format is filled before the asynchronous ATM cell may be transmitted. The industry standard method of digitizing voice signals, Pulse Code Modulation (PCM), operates by sampling the analog voice signal once every 125 microseconds, or 8000 times per second. Each sample is converted to an 8-bit (1 octet) digital voice word. Consequently, it takes 5.75 milliseconds to fill a 46 octet PDU (46 octets × 125 microseconds per octet = 5.75 milliseconds). In and of itself, this 5.75 millisecond delay might not be a major problem if it were a fixed delay.
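The fill-delay arithmetic above can be reproduced with a short calculation. The function name `fill_delay_ms` is hypothetical; the sketch assumes one PCM octet per 125 microsecond sample period from a single voice channel, as stated in the text.

```python
# Illustrative sketch only: reproduces the arithmetic in the paragraph above.
SAMPLE_PERIOD_US = 125                               # one PCM sample every 125 microseconds
SAMPLES_PER_SECOND = 1_000_000 // SAMPLE_PERIOD_US   # 8000 samples per second


def fill_delay_ms(payload_octets: int) -> float:
    """Time in milliseconds to fill a payload with one-octet PCM samples
    taken from a single voice channel."""
    return payload_octets * SAMPLE_PERIOD_US / 1000


if __name__ == "__main__":
    print(SAMPLES_PER_SECOND)   # 8000
    print(fill_delay_ms(46))    # 5.75
```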
However, telephony network communication lines are not unidirectional lines, but are bidirectional lines where signals flow in both directions along the lines. Thus, a device will send signals and receive signals on the same line. Bridged taps, wire gauge changes, and wire splices can cause a portion of the transmitted signal to be reflected back to the transmitting end. This is referred to as an echo. This reflected transmitted signal can be mistaken for, or can distort, the incoming received signal. This may not be much of a problem when the "device" which sends and receives the signal is a human and an analog line is used. However, an echo can be disconcerting even to a human if the echo is too loud or is too delayed. Further, an echo can be a serious problem for digital data transmission devices. The reflected signal will change the phase and/or amplitude of the incoming signal, and can alter or completely destroy the information sent by the other device. Thus, echo adversely affects the validity of received digital data. Further, even if a human is the end "device", echo is a problem because the voice signal is converted to a digital signal for transmission, and the echo may cause the incoming signal to be a series of bits which do not produce a meaningful sound.
Therefore, data transmission devices, such as modems and telephone systems, typically employ some form of echo suppression. Echo suppression is implemented by the transmitting device storing or remembering the transmitted signal, and then subtracting a portion of this stored transmitted signal from the received signal so as to cancel any echo. The echo suppression capability of the transmitting device is limited by how long the device retains the transmitted signal.
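The store-and-subtract principle described above can be sketched in simplified form. The class `EchoSuppressor` is hypothetical and is not taken from any particular device: it assumes a single reflection whose delay (in samples) and attenuation are already known, whereas practical devices must estimate these quantities.

```python
# Illustrative sketch only: a single-reflection model with known delay and gain.
from collections import deque


class EchoSuppressor:
    """Keep a bounded history of transmitted samples and subtract a scaled,
    delayed copy of that history from each received sample."""

    def __init__(self, history_len: int, echo_delay: int, echo_gain: float):
        # The device can only cancel echoes that return while the original
        # transmitted sample is still held in its history.
        if not 1 <= echo_delay <= history_len:
            raise ValueError("echo delay must lie within the stored history")
        self.history = deque([0.0] * history_len, maxlen=history_len)
        self.echo_delay = echo_delay
        self.echo_gain = echo_gain

    def process(self, tx_sample: float, rx_sample: float) -> float:
        """Record the sample being transmitted now and return the received
        sample with the estimated echo removed."""
        delayed_tx = self.history[-self.echo_delay]  # sent echo_delay steps ago
        self.history.append(tx_sample)
        return rx_sample - self.echo_gain * delayed_tx


if __name__ == "__main__":
    tx = [1.0, 2.0, 3.0, 4.0, 5.0]
    far_end = [10.0, 20.0, 30.0, 40.0, 50.0]
    # Received signal = far-end signal plus a half-amplitude echo delayed by 2 samples.
    rx = [far_end[n] + (0.5 * tx[n - 2] if n >= 2 else 0.0) for n in range(len(tx))]
    suppressor = EchoSuppressor(history_len=4, echo_delay=2, echo_gain=0.5)
    print([suppressor.process(t, r) for t, r in zip(tx, rx)])
```

The constructor check reflects the limitation noted above: an echo that arrives after the transmitted sample has left the history cannot be removed by this approach.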
Conventional off-the-shelf echo removal devices (hardware and software) are designed to correct for echoes if the round-trip delay of the voice signal is 2 milliseconds or less, a design choice based on cost and other considerations. These off-the-shelf devices are not designed to remove echoes where the round-trip delay is greater than 2 milliseconds. In the example given above, it takes 5.75 milliseconds to fill an ATM cell. Thus, the 2 millisecond limit will have been exceeded before the ATM cell is even transmitted. Therefore, there is a problem with sending ATM cells quickly enough to meet the requirements of voice transmission while not sending cells which have only a single voice sample.
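The conflict between the cell fill time and the echo limit can be made concrete with a short calculation. The names below are hypothetical, and the sketch makes an optimistic simplification: it counts only the buffering delay at one end, ignoring propagation, switching, and the return path, all of which would further reduce the number of samples that could be accumulated.

```python
# Illustrative sketch only: counts buffering delay at one end and nothing else.
SAMPLE_PERIOD_US = 125    # PCM sample period
ECHO_LIMIT_US = 2000      # 2 millisecond limit of off-the-shelf echo removal


def buffering_delay_us(samples: int) -> int:
    """Delay incurred by accumulating `samples` PCM octets from one channel."""
    return samples * SAMPLE_PERIOD_US


def max_samples_within(limit_us: int = ECHO_LIMIT_US) -> int:
    """Largest number of samples from one channel that can be buffered
    without the buffering delay alone exceeding `limit_us`."""
    return limit_us // SAMPLE_PERIOD_US


def exceeds_echo_limit(samples: int) -> bool:
    """True if buffering this many samples already exceeds the echo limit."""
    return buffering_delay_us(samples) > ECHO_LIMIT_US


if __name__ == "__main__":
    print(buffering_delay_us(46), exceeds_echo_limit(46))
    print(max_samples_within())
```

Under these assumptions, a 46-octet payload filled from a single channel exceeds the limit, while at most 16 samples from one channel fit within 2 milliseconds even before any other delay is counted.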
Therefore, a need exists for adapting ATM technology to the area of telephony such that there is a continuous flow of PCM samples between transmitter and receiver, with no more than 2 milliseconds of transmission delay, and with no empty packets.
A further problem with adapting the ATM technology to the area of telephony is that of distinguishing between digitized voice signals and ordinary digital computer data. Although there are clear definitions for the CBR voice format (AAL1) and the ABR data cell format (AAL5), it is frequently difficult to be absolutely certain that an incoming cell is one or the other. This is a problem because an ATM interface can support two or more local devices, and they may receive similar or different types of data. For example, one device may receive CBR data, while the other device receives ABR data. The ATM interface may not be able to easily and quickly identify which device is the intended recipient of a particular received cell. Further, analysis of the incoming ATM cell payload is unreliable and thus does not provide a solution to this problem.
With ATM network interfaces that support both local CBR and local ABR devices, ensuring that an incoming cell is not misdirected away from the intended recipient is very important. Although the recipient of a misdirected cell can disregard that cell without adverse effect, the intended recipient of a misdirected cell cannot recreate that misdirected cell, and so data will be lost, possibly irretrievably.
Accordingly, a need exists for directing both digitized voice data and ordinary digital computer data in a manner which ensures that a cell of received information will always be received by the intended recipient device.