1. Field of Invention
This invention relates to integrated circuits (ICs) for voice communications, and more particularly, to a highly integrated processor for processing and routing voice traffic conforming to various standard protocols over a digital network. As disclosed herein, special memory management hardware associated with the processor efficiently implements protocol encapsulation, with minimal intervention on the part of the processor.
2. Description of Related Art
In its early history, voice communications consisted of analog signals transmitted over comparatively short distances. As telephone use increased and long distance connections became commonplace, it was necessary to find a way to transmit multiple voice signals over a common line. A communications network in which every 2-way conversation is allotted its own line works well enough for a small number of users, separated by short distances. But as the number of users increases, the amount of material and labor required to install and maintain a dedicated line for each user quickly becomes prohibitive. The alternative was to transform voice signals into digital form and combine their digital representations using a technique known as time division multiplexing (TDM).
A simple example illustrating the technique of time division multiplexing (TDM) is presented in FIG. 1. In this example, four different voice signals from sources A-D are to be transmitted across a single wire to a remote destination. In the first stage 24 of this process, the voice signals are digitized by analog-to-digital (A/D) converters 10A-D. In other words, each of the continuous signals A-D is periodically sampled and represented by a binary number denoting the approximate voltage of the sample. In FIG. 1, the samples for waveform A are represented by solid circles, while those for waveforms B, C and D are represented by hollow circles, hollow squares and solid squares, respectively. The individual samples in each sequence may be denoted by the letter associated with the source, with a subscript for the sample number. For example, the samples in the sequence derived from source B would be denoted B0, B1 . . . Bn.
The resulting sample sequences 26 must contain sufficient information to reconstruct the original waveforms at the destination. According to the Nyquist Theorem, this requires that each waveform be sampled at a rate greater than twice the highest frequency present in the waveform. For example, a signal containing frequencies of up to 1 KHz must be sampled at a rate greater than 2 KHz, to permit the signal to be reconstructed from its discrete samples. In the case of standard voice communications, signals are assumed to be band-limited to about 3 KHz, so a sampling rate of 8 KHz is used. This implies that the sample interval (i.e., the time interval between any two adjacent samples) in the sequences 26 is 125 μs.
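The sampling arithmetic above can be checked with a brief Python sketch. The function names are illustrative only and do not appear in the disclosure.

```python
def nyquist_bound_hz(max_freq_hz: float) -> float:
    """Return the rate that the sampling rate must exceed (twice the highest frequency)."""
    return 2.0 * max_freq_hz


def sample_interval_us(sample_rate_hz: float) -> float:
    """Return the time between adjacent samples, in microseconds."""
    return 1e6 / sample_rate_hz


# A 1 KHz signal must be sampled faster than 2 KHz.
print(nyquist_bound_hz(1000))
# An 8 KHz sampling rate yields a 125 microsecond sample interval.
print(sample_interval_us(8000))
```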
A multiplexer 12 combines the four sample sequences 26 into a multiplexed sequence 28. Two characteristics of this multiplexed sequence are particularly noteworthy. In the first place, the original four sample sequences are interleaved to create the multiplexed sequence. Thus, the sample order in the multiplexed sequence is:
A0, B0, C0, D0, A1, B1, C1, D1, . . . , An, Bn, Cn, Dn
Note that this preserves the original order of the samples. In the second place, the effective sample rate in the multiplexed sequence is four times that of the original sequences. Within each 125 μs sample interval, the multiplexer 12 must collect a new sample from each of the four sources and transmit all four samples. Consequently, the samples in the multiplexed sequence 28 are separated by 31.25 μs, for an effective sample rate of 32 KHz.
The multiplexed sample sequence 28 is typically buffered by a high-speed amplifier, which drives the impedance of the wire, cable, transmission line 16, etc. used to convey the sequence to the desired remote destination. At the destination, another amplifier receives the signal from the transmission line 16 and conditions it (by filtering, glitch suppression, etc.) before presenting it to the input of a de-multiplexer 20. The de-multiplexer 20 reverses the operations performed by multiplexer 12, to extract the original four sample sequences 26 from the multiplexed sequence 28. Each of the resulting sample sequences may then be acted upon by a digital-to-analog (D/A) converter 22A-D to reconstruct the respective voice signals 30.
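The interleaving performed by a multiplexer and its reversal by a de-multiplexer may be sketched in Python as follows. This is a minimal software illustration, assuming equal-length sample sequences held in memory; the function names are hypothetical and the sketch does not represent the hardware of FIG. 1.

```python
from typing import List, Sequence


def tdm_multiplex(channels: Sequence[Sequence[int]]) -> List[int]:
    """Interleave equal-length sample sequences as A0, B0, C0, D0, A1, B1, ..."""
    if len({len(ch) for ch in channels}) > 1:
        raise ValueError("all channels must supply the same number of samples")
    # zip(*channels) yields one tuple per sample interval: (A0, B0, C0, D0), ...
    return [sample for interval in zip(*channels) for sample in interval]


def tdm_demultiplex(stream: Sequence[int], num_channels: int) -> List[List[int]]:
    """Recover the per-channel sequences by taking every num_channels-th sample."""
    if len(stream) % num_channels != 0:
        raise ValueError("stream length must be a multiple of the channel count")
    return [list(stream[i::num_channels]) for i in range(num_channels)]


a, b, c, d = [10, 11], [20, 21], [30, 31], [40, 41]
muxed = tdm_multiplex([a, b, c, d])
print(muxed)
print(tdm_demultiplex(muxed, 4))
```

Because the mapping of position to channel is fixed, the de-multiplexer needs no per-sample addressing, only the channel count and the start of the sequence.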
In the preceding example, only four signals were multiplexed. However, the TDM principle can clearly be extended to transmit greater numbers of voice signals over a single line. In fact, the upper limit on the number of voice channels that can be carried is related to the amount of available bandwidth, commonly stated in terms of the maximum bits per second (bps) sustainable by the hardware. Along with the number of signal sources (or, channels) and the sample rate, the bandwidth required for a TDM transmission depends on the number of bits per sample. For voice communications, signals are usually digitized to 8 bits. Thus, the bandwidth required can be expressed as:
bandwidth (bps) = no. of channels × no. of bits per sample × sample rate
The original T-carrier system developed in the 1970's allows for 24 voice channels to be multiplexed onto a single line, using the techniques described above. If each channel is sampled with 8-bit resolution at a rate of 8 KHz, the TDM bandwidth required is:
24 × 8 × 8000 = 1.536 Mbps
The original T1 standard defines a data structure known as a D4 frame for the transport of TDM data. A D4 frame consists of 24 consecutive 8-bit samples (one from each voice channel), preceded by a framing bit. Note that the addition of the framing bit alters the previous TDM bandwidth calculation. Since each frame consists of 24 × 8 + 1 = 193 bits, and frames are transmitted at 8000 frames per second, the bandwidth becomes:
(24 × 8 + 1) × 8000 = 1.544 Mbps
The framing bit follows a special pattern called the frame alignment signal, which repeats every 12 frames. The group of 12 consecutive frames bounded by this frame alignment signal is known as a superframe.
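The bandwidth figures and the 193-bit frame structure may be verified with the following Python sketch. The function names are hypothetical, and the most-significant-bit-first ordering within each sample is an assumption made for illustration; the text above does not specify bit ordering or the content of the frame alignment signal.

```python
def tdm_bandwidth_bps(channels: int, bits_per_sample: int,
                      sample_rate_hz: int, framing_bits: int = 0) -> int:
    """Bits per second for a TDM stream, optionally with framing bits per frame."""
    return (channels * bits_per_sample + framing_bits) * sample_rate_hz


def build_d4_frame(framing_bit: int, samples: bytes) -> str:
    """Return one frame as a string of '0'/'1': framing bit, then 24 8-bit samples.

    Assumption: each sample is written most significant bit first.
    """
    if framing_bit not in (0, 1):
        raise ValueError("framing bit must be 0 or 1")
    if len(samples) != 24:
        raise ValueError("a D4 frame carries exactly 24 samples")
    return str(framing_bit) + "".join(f"{s:08b}" for s in samples)


print(tdm_bandwidth_bps(24, 8, 8000))                  # payload only
print(tdm_bandwidth_bps(24, 8, 8000, framing_bits=1))  # with framing bit
print(len(build_d4_frame(1, bytes(24))))               # bits per frame
```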
T1 performance is easily achieved with today's technology, and the demand for greater bandwidth soon led to the introduction of other standards, embodied in the following digital signal hierarchy (DSH):
DS Level    North American Bandwidth    Voice Channels    T-Carrier
DS0         64 Kbps                     1
DS1         1.544 Mbps                  24                T1
DS2         6.312 Mbps                  96
DS3         44.736 Mbps                 672               T-3

Thus, for example, a single T-3 line supports 672 DS0 voice channels.
As described above, voice signals are typically encoded using time division multiplexing (TDM) for transmission over the telephone network. However, there is an undesirable characteristic of TDM, which potentially reduces its efficiency. Under TDM, the mapping of time slots to voice channels within a frame is fixed. Consequently, a time slot allotted for a particular voice channel may go unused, if the signal source for that channel is inactive during its time slot. This typically occurs with “bursty” signals, which consist of active signal intervals separated by periods of inactivity. Significantly, normal speech is a bursty signal. With such signals, the frame may be transmitted with less than its full capacity, since many of its timeslots may contain samples collected during a period of inactivity. An approach that overcomes this limitation is asynchronous transfer mode (ATM).
ATM is a switching technology that organizes digital data into 53-byte cells for transmission over a physical medium. Each cell consists of a 5-byte header and a 48-byte payload, containing the actual data to be transmitted. Individually, a cell is processed asynchronously relative to other related cells and is queued before being multiplexed over the transmission path. ATM presents the cells (containing the voice samples) to the network whenever there is enough bandwidth available to handle them. In this sense, the voice data transfer is asynchronous relative to the generation of the original voice signal. In addition to voice, ATM supports various other types of signals and data, including video and multimedia applications. In an ATM network, data must be divided into cells before transmission and reconstituted from cells upon reception. This is known as segmentation and reassembly (SAR), and is typically handled by a hardware device (i.e., electronic circuitry).
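Segmentation and reassembly may be illustrated in software with the following Python sketch. It is a simplification under stated assumptions: the final cell is padded with zero bytes, the original length is supplied separately at reassembly, and the same caller-provided 5-byte header is attached to every cell. Actual ATM adaptation layers define their own padding and trailer conventions, which are not modeled here. All names are hypothetical.

```python
from typing import List

HEADER_LEN = 5     # bytes of header per cell
PAYLOAD_LEN = 48   # bytes of payload per cell
CELL_LEN = HEADER_LEN + PAYLOAD_LEN


def segment(data: bytes, header: bytes) -> List[bytes]:
    """Split data into 53-byte cells, zero-padding the last payload (assumption)."""
    if len(header) != HEADER_LEN:
        raise ValueError("header must be exactly 5 bytes")
    cells = []
    for offset in range(0, len(data), PAYLOAD_LEN):
        chunk = data[offset:offset + PAYLOAD_LEN].ljust(PAYLOAD_LEN, b"\x00")
        cells.append(header + chunk)
    return cells


def reassemble(cells: List[bytes], original_len: int) -> bytes:
    """Strip headers, concatenate payloads, and discard the padding."""
    payload = b"".join(cell[HEADER_LEN:] for cell in cells)
    return payload[:original_len]


voice = bytes(range(100))               # 100 bytes of sample data
cells = segment(voice, bytes(HEADER_LEN))
print(len(cells), [len(c) for c in cells])
print(reassemble(cells, len(voice)) == voice)
```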
Note that a T1 connection directly routes frames from a source to a designated destination. In contrast, ATM allows flexibility in the choice of a connection path. The 5-byte header within each 53-byte ATM cell contains a virtual path identifier (VPI) and virtual channel identifier (VCI). The VPI and VCI are used to route the cell to its intended destination. This allows the ATM switching hardware to efficiently allocate connection paths based on the level of activity in the voice channels. Because the cells are always the same size, dedicated hardware designs for high-performance ATM switches are relatively straightforward. As a result, ATM networks can operate at speeds greater than 155 Mbps.
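The placement of the VPI and VCI within the 5-byte header may be sketched as follows. The field widths assumed here are those of the standard user-network interface (UNI) cell format (4-bit generic flow control, 8-bit VPI, 16-bit VCI, 3-bit payload type, 1-bit cell loss priority, 8-bit header error control); this layout is not stated in the text above. The header error control byte is left as a zero placeholder rather than computed. Function names are hypothetical.

```python
import struct
from typing import Tuple


def pack_uni_header(vpi: int, vci: int, gfc: int = 0, pt: int = 0, clp: int = 0) -> bytes:
    """Pack a 5-byte UNI-style ATM header; the HEC byte is a zero placeholder."""
    if not (0 <= gfc < 16 and 0 <= vpi < 256 and 0 <= vci < 65536
            and 0 <= pt < 8 and clp in (0, 1)):
        raise ValueError("field out of range")
    word = (gfc << 28) | (vpi << 20) | (vci << 4) | (pt << 1) | clp
    return struct.pack(">IB", word, 0)  # 4 header bytes + placeholder HEC


def unpack_vpi_vci(header: bytes) -> Tuple[int, int]:
    """Extract (VPI, VCI) from a 5-byte UNI-style header."""
    word, _hec = struct.unpack(">IB", header)
    return (word >> 20) & 0xFF, (word >> 4) & 0xFFFF


hdr = pack_uni_header(vpi=5, vci=100)
print(hdr.hex())
print(unpack_vpi_vci(hdr))
```

A switch can route on these two fields alone, which is one reason fixed-size cells lend themselves to dedicated hardware.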
Voice data formatted as ATM cells can be transmitted over a T1 connection by using a network adaptor. The network adaptor converts each 53-byte ATM cell into a sequence of samples, which are assigned to the timeslots within three frames (since each frame contains 24 bytes of data, the 53 bytes of a cell must be spread over three frames). This process can also be reversed to generate ATM cells from T1 frames. The conversion between ATM and T1 data formats can be employed to efficiently route voice traffic through the telephone network.
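The spreading of one cell over three frames may be sketched as follows. This illustration assumes that each cell starts at the first timeslot of a frame and that unused timeslots in the last frame are filled with an idle byte; both are assumptions, and an actual network adaptor may pack cells differently. The names and the idle value are hypothetical.

```python
from typing import List

SLOTS_PER_FRAME = 24  # one byte per timeslot in a T1 frame


def frames_needed(num_bytes: int) -> int:
    """Number of 24-slot frames required to carry num_bytes (ceiling division)."""
    return -(-num_bytes // SLOTS_PER_FRAME)


def cell_to_frames(cell: bytes, idle: int = 0xFF) -> List[bytes]:
    """Assign the bytes of a cell to consecutive frames, padding with an idle byte."""
    frames = []
    for offset in range(0, len(cell), SLOTS_PER_FRAME):
        chunk = cell[offset:offset + SLOTS_PER_FRAME]
        frames.append(chunk.ljust(SLOTS_PER_FRAME, bytes([idle])))
    return frames


cell = bytes(range(53))
frames = cell_to_frames(cell)
print(frames_needed(len(cell)))       # 53 bytes over 24-byte frames
print([len(f) for f in frames])
```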
Within the telephone system network, a central office (CO) is an office local to a group of subscribers (i.e., telephone system users). Home and business lines are connected to a CO by what is called a local loop. The local loop connection is usually on a pair of copper wires called twisted pair. The voice signals from each subscriber are typically in analog form (i.e., continuous) over the local loop, but are transformed into digital data at the CO. The CO also has switching equipment that can switch calls locally or to long-distance carrier phone offices. The conversion from T1 to ATM is useful for combining a large number of voice channels to be transmitted over a long distance by a high-bandwidth link (such as optical fiber) connecting one central office to another within the telephone network.
Normal voice communications is connection-oriented. That is, a connection between the talker and the listener must be established before voice data is transmitted. In contrast, data communication networks, such as the Internet, or a local area network (LAN) in an office, are inherently connectionless. The model for such networks is that of a single communications line, shared by several nodes. Connectionless network service does not predetermine the path from the source to the destination system. Messages are sent out on the shared line in the form of packets (also known as datagrams). Each packet is directed to a particular node through the inclusion of the recipient's address in header information associated with the message. The header precedes the portion of the packet containing the data being transmitted (i.e., the payload). Packets must be completely addressed because different paths through the network might be selected (by routers) for different packets, based on a variety of influences. Each packet is transmitted independently by the source system and is handled independently by intermediate network devices. The connectionless mode of operation is more appropriate for many types of data communication. For example, when sending an email message out over the Internet, it would be inconvenient to require the intended recipient of the email to have previously established a connection channel through which to receive the email.
Voice data may be formatted to allow transmission over a connectionless network by segmenting the data into appropriate-sized frames, prefixed with the required header information. This conversion is termed data encapsulation. Data encapsulation could be necessary, for example, at the interface between the public switched telephone network (PSTN) and an optical fiber-based LAN. The Transmission Control Protocol/Internet Protocol (TCP/IP) suite, described below, may be used for the encapsulation and delivery of voice data over a connectionless network. The function of the various protocols in the TCP/IP suite may be understood with reference to the following open systems interconnect (OSI) 7-layer model.
(7)               (6)             (5)        (4)          (3)        (2)          (1)
APPLICATION       PRESENTATION    SESSION    TRANSPORT    NETWORK    DATA LINK    PHYSICAL
LAYER             LAYER           LAYER      LAYER        LAYER      LAYER        LAYER
Email             HTTP            POP3       TCP          IP         Ethernet     ADSL
File Transfer     FTP             IMAP       UDP          ATM        SLIP         coaxial cable
Web Applications  Telnet                     RTP                     MAC
In the OSI model, the process of communication between two computers connected by a telecommunication network is divided into layers (also described as a stack). When a message is transmitted from one computer to the other it passes down through the various layers on the sender's side of the network, and back up through the protocol layers when it is received at the receiver's side.
(1) The Physical Layer is the lowest level of the OSI model, and the protocols here define the actual physical medium for the transport of a bit stream from one point in the network to another.
(2) The Data-Link Layer defines the access strategy for the physical medium, and pertains to hardware devices such as network interface cards (NICs), routers and bridges.
(3) The Network Layer governs the routing and forwarding of data through the network.
(4) The Transport Layer provides error-checking and ensures that all the data sent have been received at the destination.
(5) The Session Layer coordinates exchanges between two computers over the network to ensure that the connection is preserved until the transaction is completed.
(6) The Presentation Layer, usually part of an operating system, is the point at which data sent is rendered into a format usable by the recipient (e.g., transformation of a byte stream into a displayable image).
(7) The Application Layer is the layer at which network-oriented applications programs reside; these applications are the ultimate target of the message transmitted by the sender.
The IP is a Layer 3 protocol, most familiar as the protocol by which data is sent from one computer to another on the Internet. Each computer (known as a host) on the Internet has at least one IP address that uniquely identifies it from all other computers on the Internet. When data is sent or received (for example, an e-mail note or a Web page), the message gets divided into packets, each of which contains both the sender's and the receiver's Internet address. Packets are first sent to a gateway computer that directly accesses a small neighborhood of Internet addresses. If the destination address is not directly accessible to the gateway computer, it forwards the packet to an adjacent gateway. This process continues until one gateway recognizes the packet as belonging to a computer within its immediate neighborhood or domain. That gateway then delivers the packet directly to the computer whose Internet address is specified.
IP is a connectionless protocol, which means that there is no continuing connection between the end points that are communicating. Each packet that travels through the Internet is treated as an independent unit of data without any relation to any other unit of data. Consequently, the packets comprising a message may take different routes across the Internet. Furthermore, packets can arrive in a different order than that in which they were sent. The IP accounts for their delivery to the correct recipient, but does not manage the delivery sequence. In the context of the Internet, the Layer 4 Transmission Control Protocol (TCP) is generally relied upon to arrange the packets in the right order, and the two protocols are often jointly referred to as TCP/IP.
An alternative to TCP (also at Layer 4) is the User Datagram Protocol (UDP), which offers a limited amount of service when messages are exchanged between computers in an IP-based network. Like TCP, UDP uses the IP to actually get a packet from one computer to another. Unlike TCP, however, UDP does not provide the service of dividing a message into packets and reassembling it at the other end. However, UDP does provide port numbers to help distinguish different user requests and, optionally, checksum capability to verify that the data arrived intact. UDP is used by applications that do not require the level of service of TCP or that wish to use communications services not available from TCP.
Realtime transport protocol (RTP) is an IP-based protocol providing support for the transport of real-time data such as video and audio streams. A Layer 4 protocol, RTP provides time-stamping, sequence numbering and other mechanisms related to managing timing issues in such data. The sender creates a timestamp when the first voice signal sample in a packet is collected, and this timestamp is then attached to the data packet before sending it out. The receiver may use this information to assemble the packets in their correct sequence, or to synchronize one packetized data stream with another—for example, in the case of transmitted audio and video data from a movie. RTP also provides other services, such as source identification. Using the source identifier in the RTP header of an audio packet exchanged during a video conference, for example, a user can identify who is speaking.
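The use of sequence numbers and timestamps may be illustrated with the following Python sketch, which builds the 12-byte fixed RTP header (version 2, as standardized in RFC 3550) and restores packet order at the receiver. The field layout follows that standard rather than the text above. The sketch omits header extensions and contributing-source lists and ignores sequence-number wraparound; the function names and the sample values are hypothetical.

```python
import struct
from typing import Dict, List

RTP_VERSION = 2
RTP_HEADER_LEN = 12


def pack_rtp(seq: int, timestamp: int, ssrc: int, payload: bytes,
             payload_type: int = 0, marker: bool = False) -> bytes:
    """Prepend a 12-byte fixed RTP header (no padding, extension, or CSRC list)."""
    byte0 = RTP_VERSION << 6
    byte1 = (int(marker) << 7) | (payload_type & 0x7F)
    header = struct.pack("!BBHII", byte0, byte1, seq & 0xFFFF,
                         timestamp & 0xFFFFFFFF, ssrc & 0xFFFFFFFF)
    return header + payload


def parse_rtp(packet: bytes) -> Dict[str, object]:
    """Decode the fixed header fields and return them with the payload."""
    byte0, byte1, seq, timestamp, ssrc = struct.unpack("!BBHII", packet[:RTP_HEADER_LEN])
    return {"version": byte0 >> 6, "payload_type": byte1 & 0x7F, "seq": seq,
            "timestamp": timestamp, "ssrc": ssrc, "payload": packet[RTP_HEADER_LEN:]}


def reorder(packets: List[bytes]) -> List[bytes]:
    """Sort received packets by sequence number (wraparound is not handled)."""
    return sorted(packets, key=lambda p: parse_rtp(p)["seq"])


# Three packets of 160 samples each (20 ms of voice at 8 KHz), arriving out of order.
sent = [pack_rtp(seq=i, timestamp=160 * i, ssrc=0x1234, payload=bytes(160)) for i in range(3)]
received = [sent[2], sent[0], sent[1]]
print([parse_rtp(p)["seq"] for p in reorder(received)])
```

The source identifier field plays the role described above for identifying the speaker in a conference.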
Information required by each protocol is contained in a header attached to a data packet as it makes its way through the network. Header information associated with the protocols at different OSI layers can be nested. For example, data sent from an application may begin as an RTP packet, consisting of an RTP header followed by the payload. As the packet moves down through the OSI layers to be transmitted over the physical medium, a UDP header is prepended, followed by an IP header, so that the transmitted packet carries the IP header, the UDP header, the RTP header and the payload, in that order.
It is often necessary to transform voice from the connection-oriented frame-based TDM format used by the PSTN to a connectionless cell-based format, such as that used by ATM, or a packetized format such as used by an Ethernet network. A significant effort in such transformations is devoted to the preparation and attachment of header information to the data.
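The nesting of headers may be sketched in Python as follows. This is an illustration under stated assumptions, not the method of the invention: it uses the standard 8-byte UDP header and a 20-byte IPv4 header without options, leaves the UDP checksum at zero (meaning not computed), and treats the RTP packet as an opaque byte string. The addresses and port numbers are example values, and all function names are hypothetical.

```python
import socket
import struct


def checksum16(data: bytes) -> int:
    """Ones' complement of the ones' complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def udp_encapsulate(payload: bytes, src_port: int, dst_port: int) -> bytes:
    """Prepend an 8-byte UDP header; checksum 0 means 'not computed' (assumption)."""
    return struct.pack("!HHHH", src_port, dst_port, 8 + len(payload), 0) + payload


def ip_encapsulate(payload: bytes, src: str, dst: str, ttl: int = 64) -> bytes:
    """Prepend a 20-byte IPv4 header (no options) carrying protocol 17 (UDP)."""
    header = struct.pack("!BBHHHBBH4s4s", 0x45, 0, 20 + len(payload), 0, 0,
                         ttl, 17, 0, socket.inet_aton(src), socket.inet_aton(dst))
    # Insert the header checksum, computed with the checksum field set to zero.
    return header[:10] + struct.pack("!H", checksum16(header)) + header[12:] + payload


rtp_packet = bytes(12) + bytes(160)   # placeholder RTP header + 160 voice samples
udp_packet = udp_encapsulate(rtp_packet, 5004, 5004)
ip_packet = ip_encapsulate(udp_packet, "192.0.2.1", "192.0.2.2")
print(len(rtp_packet), len(udp_packet), len(ip_packet))
```

Each layer treats everything handed down from the layer above as payload, so the headers accumulate in front of the voice samples in the order IP, UDP, RTP.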
Because of the various formats in which voice data may be transmitted and received, the central offices and other nodes within the telephone network are equipped with extensive electronic circuitry to translate and reroute the myriad types of voice signals. Because of the specialized nature of these functions, they are typically performed independently by discrete rack-mount circuit cards and modules. The consumption of power and space attributable to this aggregation of electronic devices is a serious problem. Excessive heat generation, and its impact on system reliability, is a related concern.
In view of these problems, it would be desirable to have a system capable of efficiently translating between the major protocols for voice signals, which could be easily incorporated as a functional module within an integrated circuit. This system should be capable of segmenting data and providing headers to allow translation between frame-based data formats (such as TDM frames) and packet-based or cell-based data formats (such as RTP packets or ATM cells). In an integrated circuit implementation, the system should be capable of performing these functions without relying on an external processor.