1. Field of the Invention
The invention relates generally to the field of communications. More particularly, the invention relates to multiple transmission bandwidth streams with differentiated quality of service.
2. Discussion of the Related Art
The modern Internet has its roots in ARPANET, a network used primarily by academic institutions to link computers. Internet protocol (IP) has its roots in ARPANET and is the predominant choice for Layer-3 protocol suites in modern networks. IP is particularly appropriate for data communication, involving file transfers and other “non-real-time” applications. The Internet, however, is being considered for a variety of applications, including, but not restricted to, real-time applications such as voice communication (VoIP, or Voice over IP). This multiplicity of services, with different needs, is being addressed by protocol enhancements such as DiffServ (for differentiated services), whereby packet streams are identified and processed according to specific needs. Some services, such as file transfer, can tolerate longer time delays and larger time-delay variations than other services, such as VoIP, that demand shorter delays and smaller time-delay variations. Such service attributes are collectively referred to by the term Quality of Service (abbreviated QoS). In particular, a small time-delay variation is associated with a high QoS and a large time-delay variation is associated with a low QoS. Whereas QoS is a generic term and is the amalgamation of various service attributes, the primary attribute of relevance here is time-delay variation.
Traditional and proposed methods for providing differentiated services (e.g. DiffServ) identify packets associated with a service requiring a higher quality of service and assign them a higher priority as well as preferential treatment for transmission. One enhancement to this traditional approach has utilized the notion of absolute time for scheduling packet transmission. The scheduling approach involves segregating services based on QoS requirements and blocking out periodic intervals of time wherein transmission of only those packets requiring high QoS can be initiated.
The preponderance of modern networking proposals and architectures are based on packet switching as is evident from the spectacular growth of the Internet. Most packet switching schemes are based on IP which is a set of protocols associated with the packetization of traffic information and the associated routing methods. This is in contrast to the traditional or legacy methods based on circuit switching. Probably the most fundamental distinction between the two architectures is that in IP networks the information elements, namely packets, “find” their way from source to destination and different packets associated with the same transaction may follow different routes. In contrast, in circuit-switched networks, a path (or “circuit”) is established first and the information associated with the transaction follows the same path through the network. In circuit-switched networks each “call” is guaranteed to have adequate transmission bandwidth to assure a constant bit-rate and traffic remains in-sequence. In contrast, in packet-switched networks it is difficult to guarantee a constant bit-rate (without much bandwidth over-subscription), packets may be delivered out-of-sequence, the transmission delay is not fixed, and there may be significant transmission delay variation from packet to packet. An excellent treatment of communication protocols and methods is provided in [Ref. 1.1] and specifics related to IP are described in [Ref. 1.2].
Whereas packet-switched networking may have some significant advantages relative to circuit-switched networking, there are some disadvantages, primarily related to Quality of Service (“QoS”). Whereas the term QoS may evoke numerous and varied interpretations, the term is used here in a somewhat narrow manner. In particular, the notion of QoS, for the purposes of this discussion, is limited to a measure of time-delay variation. “High” QoS implies that the traffic is delivered from source to destination with a small time-delay variation (often called “jitter”); “low” QoS implies that the time-delay variation is not guaranteed to be small. Note that low QoS does not imply lower reliability, higher packet loss, or lower throughput; in fact the term low QoS does not relate to a layman's view of low quality; for this discussion low QoS is simply equivalent to uncertainty in transmission delay. Certain types of traffic, such as computer-to-computer communication involving file transfers, can be assigned to low QoS channels with insignificant impact on performance. Other types of traffic, typically time-sensitive traffic such as voice communication, require the channel to have a high QoS. Circuit-switched networks, which “nail” up bandwidth for a given call, generally provide a high QoS but can be viewed as bandwidth inefficient since the particular channel is not available for other traffic even during pauses; packetization is one way to improve transmission facility usage since the overall bandwidth is effectively shared between all active calls.
Time-delay variation in packet-switched networks has several causes. One of the principal causes is the sharing of transmission resources. Ironically, sharing of transmission resources is considered one of the principal advantages of packet-switched (as well as cell-switched and frame-switched) networking architectures over traditional circuit-switched schemes. The reason for variable delay in packet-switched networks is best illustrated by a simple example using the configuration of FIG. 1.
The simple network of FIG. 1 comprises two locations, each with its LAN (Local Area Network) and interconnected over a Wide Area Network (WAN) with the WAN segment linking two routers (packet switches), one in each location. The WAN segment could be, for example, a private line DS1 (often referred to as a T1 link) obtained from a Telecommunications Service Provider (“Phone Company”). That is, the WAN link is equivalent to a channel with bandwidth (i.e. bit-rate) of 1.536 Mbps (conventional DS1 has a bit rate of 1.544 Mbps but very often 8 kbps are used for framing and performance monitoring purposes, leaving 1.536 Mbps for end-to-end communication). Whereas many different types of LANs exist, the most common deployment is Ethernet, so we will assume that the LAN segments are either 10 Mbps or 100 Mbps Ethernet LANs. All traffic between the LAN segments at the two locations traverses the WAN over the DS1 link.
The data to be transferred is in the form of packets; the generic structure of a packet is shown in FIG. 2. Every packet comprises three parts. The first part is the header, and the bits in the header identify the source address, destination address, protocol used, and other such information. The body of the packet contains the actual information, often called the payload.
The footer (or “trailer”) is usually a check-sum whereby the transmitter generates a CRC (Cyclic Redundancy Check) code based on the packet content and the receiver recomputes it. If the check-sums do not agree, a transmission error has occurred, the contents of the packet are suspect, and the conventional action is to discard the packet. For every protocol (or set thereof) the size of the header and footer (together referred to as overhead) is predetermined (say N bytes). The size of the body, or payload, can be variable, though all protocols assign a maximum and minimum size (the maximum size is typically very large compared to N). Clearly larger packets are more efficient in the sense of payload to overhead ratio and thus it is advantageous, from the viewpoint of maximizing transmission bandwidth utilization, to use large packet sizes where possible.
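The check-sum mechanism can be sketched as follows (a minimal illustration using the CRC-32 routine from Python's standard zlib module; actual link layers specify their own CRC polynomials and trailer sizes):

```python
import zlib

def make_frame(payload: bytes) -> bytes:
    """Transmitter side: append a 4-octet CRC-32 trailer to the payload."""
    return payload + zlib.crc32(payload).to_bytes(4, "big")

def check_frame(frame: bytes) -> bool:
    """Receiver side: recompute the CRC over the payload; a mismatch
    means the contents are suspect and the frame should be discarded."""
    payload, trailer = frame[:-4], frame[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == trailer

frame = make_frame(b"example payload")
assert check_frame(frame)                        # clean transmission
corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]
assert not check_frame(corrupted)                # error detected -> discard
```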
Now suppose that a voice call is made between location A and location B and the method used is Voice-over-IP (“VoIP”). That is, the voice signal is digitized and packetized for transmission over the WAN link. Voice traffic is an example of traffic that requires a high QoS. The voice packetization is accomplished by segmenting the voice signal into “blocks”, typically 10 msec in duration, and generating a packet of information for each block. Considering that speech (telephony) signals are sampled at 8 kHz and use one octet per sample (for “uncompressed” speech), the packet payload requires just 80 bytes to transport the speech samples for a block. The payload size is determined by various factors such as the number of simultaneous voice signals, the level of compression, block size, and other factors, but it is generally true that packets for voice (which exemplifies a high QoS requirement) will be “small” and generated at regular intervals. For the sake of this example, suppose the repetition interval is 10 msec and the packet size is 1500 bits (this size is chosen solely to simplify the arithmetic). Each packet thus occupies 1 msec of every 10 msec of the WAN channel (approximated as a 1500 kbps link). In the absence of any other WAN traffic, each packet would go through on time and “on schedule” and there would be no time-delay variation (corresponding to a very high QoS). If we assume that the LAN operates at 10 Mbps, each voice packet occupies just 0.15 msec of every 10 msec of the LAN segment. A simplified time-and-event diagram of the situation is depicted in FIG. 3.
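The arithmetic of this example can be verified with a short sketch (the constants are taken directly from the figures above; the WAN link is approximated as 1500 kbps as in the text):

```python
# Voice packetization arithmetic from the example above.
SAMPLE_RATE_HZ = 8000        # telephony sampling rate
OCTETS_PER_SAMPLE = 1        # "uncompressed" speech
BLOCK_MSEC = 10              # packetization block duration

# Payload needed to carry one block of speech samples: 80 bytes.
payload_octets = SAMPLE_RATE_HZ * OCTETS_PER_SAMPLE * BLOCK_MSEC // 1000

# Serialization time of a 1500-bit voice packet on each link.
PACKET_BITS = 1500
wan_msec = PACKET_BITS / 1_500_000 * 1000    # 1.0 msec of every 10 msec
lan_msec = PACKET_BITS / 10_000_000 * 1000   # 0.15 msec of every 10 msec
```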
As shown in FIG. 3, voice packets arrive at the router on the LAN side every 10 msec and, after a small delay (for processing and reformatting as required), appear on the WAN transmission link with an inter-packet interval of 10 msec. The layer-2 (also called “link layer”) processing ensures that when actual data is not available for transmission, idle cells or idle flags are generated to keep the WAN link “alive”. (The information packets are generally associated with Layer-3 of the data communication model. The layer-2 reformatting is sometimes necessary to address the matching of layer-3 to the physical layer, namely layer-1, corresponding to the actual transmission scheme. A brief description of layer-2 is provided below.) (A comprehensive treatment of layered communication is provided in [Ref. 1.1]).
To see the impact of additional traffic on the voice stream, consider the hypothetical case of a concurrent file transfer. Assume that the file-transfer application generates packets of size 15,000 bits (which is not a large packet for file-transfer) and consider the impact of just one data packet on the voice transmission performance. A data packet takes 10 msec on the WAN link. A simplified time-event depiction of the impact of this single data packet is shown in FIG. 4.
Considering the WAN link, since the data packet occupies 10 msec and the voice packet occupies 1 msec, the time-separation between the first two voice packets shown is greater than 11 msec; similarly, assuming that there was just this one data packet, the time separation between the second and third voice packets shown could be less than 9 msec. Considering that the normal separation between voice packets is 10 msec, the configuration as shown introduces a time-delay variation of 1 msec. Depending on the precise ingress time of the data packet from the WAN into the router, this time-delay variation could be much larger. The delay-variation problem is only exacerbated if the size of the data packet is larger and can be devastating if the number of data packets is significant (unless other actions are taken to “assist”, to the extent possible, the voice packet stream).
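The stretched and compressed inter-packet gaps described above can be reproduced with a minimal first-in-first-out link model (a sketch; the arrival times chosen here are illustrative and merely place the single data packet between the first two voice packets):

```python
def fifo_departures(packets, link_bps):
    """Serve packets in FIFO order on a single link.
    packets: list of (arrival_msec, size_bits, label)
    returns: {label: transmission-complete time in msec}"""
    free_at = 0.0
    done = {}
    for arrive, bits, label in sorted(packets):
        start = max(free_at, arrive)                 # wait for the link to free up
        free_at = start + bits / link_bps * 1000.0   # serialization time
        done[label] = free_at
    return done

LINK_BPS = 1_500_000                        # WAN link, ~1.5 Mbps as in the text
pkts = [(0.5 + 10.0 * k, 1_500, f"voice{k}") for k in range(3)]
pkts.append((1.0, 15_000, "data"))          # one 15,000-bit file-transfer packet

t = fifo_departures(pkts, LINK_BPS)
gap1 = t["voice1"] - t["voice0"]            # stretched to 11 msec
gap2 = t["voice2"] - t["voice1"]            # compressed to 9 msec
```

With these arrival times the first two voice-packet completions are 11 msec apart and the next pair 9 msec apart, a 1 msec delay variation consistent with the FIG. 4 discussion.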
In actual practice, the network may comprise multiple routers and multiple paths between locations A and B, especially if the link between the two customer-premise-located routers is achieved using the public internet. A particular voice packet stream will experience time-delay-variation pressure in each transmission segment that the packets traverse. Generally speaking, the following rules of thumb apply.
Time-delay-variation is caused and/or exacerbated by the following factors:
1. Congestion; packets traversing a transmission segment that is highly loaded will be delayed by varying amounts, the delay variation increasing with congestion.
2. Packet size; a packet stream sharing bandwidth with other streams will be impacted by the size of the packets of the other streams it is sharing transmission bandwidth with. In particular, if the packet size of the other stream(s) is large, the packet stream under consideration will experience significant time-delay variation.
The following are, generally speaking, the characteristics of different packet streams:
1. Streams requiring high QoS (i.e. low time-delay variation) are usually associated with real-time communication, such as voice. Packets are generally small but are regularly spaced. The average bit-rate is “small” but uniform. Loss of a packet is generally ignored and the concomitant impact on the information signal (such as the speech) is “accepted” albeit highly undesirable.
2. Streams that can tolerate a low QoS (i.e. a large time-delay variation) are usually associated with non-real-time communications, such as computer-to-computer file transfers. Packets are generally large and the traffic is “bursty” with packets closely spaced during actual information transfer and sparse otherwise; the notion of average bit-rate is not that relevant since bursts of information are interspersed with intervals, possibly long, of little to no information. The loss of a packet is detected by higher layers and a request for retransmission is sent.
Conventional Approaches to Providing Variable QoS
In order to describe conventional approaches to “solving” the QoS problem, we first need to recognize the general working of a packet-switching device, namely a router. A simplified diagram of the WAN port of a router is depicted in FIG. 5. A typical router (packet switch) may have a multiplicity of WAN interfaces (for Inter-Machine trunks) as well as one or more LAN interfaces. These interfaces provide for the ingress and egress of packets. The principal function of the router is to process incoming packets, discard packets if necessary, and decide which egress port each packet must be forwarded to for outbound transmission.
With reference to FIG. 5, the block labeled Packet Processor is where the processing associated with the protocol suite is performed. Modern implementations use software stacks running on high-powered microprocessors which are specially designed to have “hardware assist” for the types of operations that need to be performed (such devices are often called Network Processors). With reference to the layered model for data communications, the packet processor performs the Layer-3 (and sometimes higher layers as well) functions as well as, possibly, functions associated with the control plane (network management tasks). For a given packet, using routing tables and other sophisticated techniques, the packet processor determines the egress port and places the packet in an outgoing queue for subsequent outbound transmission. This determination is also depicted in FIG. 5. The queue is nominally equivalent to a FIFO (first-in-first-out) buffer, but the IP protocol suite does not require packets to be transmitted (or received) in order and thus the queue does not necessarily have to be FIFO.
The block labeled Layer-2 processing extracts packets from the transmit queue and prepares them for outbound transmission. The layer-2, or link layer processing has multiple functions.
Historically, when physical transmission media were not as advanced as today and often had “high” bit-error rates, the link layer was responsible for error detection as well as requests for retransmission; the intent was to provide the higher layer data that was substantially error free (albeit with “gaps” and “delays”). One benefit of this historical approach was economy since the higher layer processing was “slow” and “expensive”. Modern network processors are fast and inexpensive and this historic benefit of the link layer is moot. In modern packet-switched networks, the link layer rarely is responsible for retransmission requests, this function having migrated to higher layers. A second function of the Layer-2 processing block is to generate data streams (bit streams or octet streams) that are matched to the needs of the physical medium. For example, if the physical medium corresponds to a DS1 (or T1) link, the line bit-rate is 1.544 Mbps and the payload bit-rate is 1.536 Mbps. The Layer-2 processing block must provide the necessary “fill-in” units such that the bit stream (octet stream) provided to the physical medium corresponds to 1.536 Mbps.
The most prevalent choices for Layer-2 are HDLC (high-level data link control) and ATM (asynchronous transfer mode). HDLC is a formatting method that takes the packet and treats it as a payload unit. The flavor(s) used in telecommunications are specified in [Ref. 1.1, 1.3, 1.4, 1.5, 6.1, 6.2]. The payload unit is encapsulated in an HDLC frame. The frame size can be variable and the frame includes the payload, a header that can be used for addressing purposes and a trailer that provides a CRC check-sum for error detection purposes. Frames received with incorrect CRC check bit-sequences are discarded. Typically, each IP packet is encapsulated in one HDLC frame (as the payload). The fill-in unit defined for HDLC is the flag, corresponding to an octet with bit pattern 01111110. Care is taken, by appropriately inserting “0” bits, that the frame (header, payload and trailer) does not contain a pattern that could be confused with a flag. There is typically at least one flag between successive frames. HDLC is used in frame-relay networks wherein the transmission across the network encounters frame-relay switches, permitting the service provider to route the frames (i.e. HDLC traffic units) to the appropriate destination, thereby providing virtual circuits (VCs). Such “Layer-2” networks are quite popular in North America. If the two ends of the bit-stream of the physical medium are both routers (i.e. the bit-stream corresponds to an inter-machine trunk), then the notion of Layer-2 networking is moot. Layer-2 HDLC framing, with its ability to distinguish frames based on addresses, can be utilized to advantage even in the point-to-point case of inter-machine trunks.
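The zero-bit insertion (“bit stuffing”) described above can be sketched as follows (an illustrative model operating on bit-strings; a real implementation operates on the serial bit stream itself):

```python
FLAG = "01111110"  # HDLC flag octet

def bit_stuff(bits: str) -> str:
    """Transmitter: insert a '0' after every run of five '1's so the
    frame body can never contain the flag pattern."""
    out, run = [], 0
    for b in bits:
        out.append(b)
        run = run + 1 if b == "1" else 0
        if run == 5:
            out.append("0")   # stuffed bit
            run = 0
    return "".join(out)

def bit_unstuff(bits: str) -> str:
    """Receiver: remove the '0' that follows each run of five '1's."""
    out, run, skip = [], 0, False
    for b in bits:
        if skip:              # this is a stuffed bit; drop it
            skip, run = False, 0
            continue
        out.append(b)
        run = run + 1 if b == "1" else 0
        if run == 5:
            skip = True
    return "".join(out)

body = "0111111001111101"     # contains the flag pattern
stuffed = bit_stuff(body)
assert FLAG not in stuffed    # flag can no longer appear inside the frame
assert bit_unstuff(stuffed) == body
```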
ATM utilizes a format where each cell is a fixed size, namely 53 octets. A comprehensive treatment of ATM is available in [Ref. 5.2]. Of these 53 octets in an ATM cell, 5 octets comprise the header and 48 octets are used for the payload. The header includes addressing information in the form of VPI/VCI (Virtual Path Identifier and Virtual Circuit Identifier). Furthermore, the header contains one octet for protecting the information content of the header but the 48-octet payload unit does not have any error checking mechanism. A cell is discarded if the header error checking detects a “fatal” error; payload errors can go undetected but higher layer protocols usually are geared to address this problem. The procedure for formatting the payload, in this case data packets, into cells is determined by rules referred to as an ATM Adaptation Layer or AAL. AAL5 is one form of AAL (AAL0, AAL1, and AAL2 are the other common AAL types) suitable for data transmission of packets. For reference, the method most often used for formatting constant-bit-rate streams into ATM cells is AAL1 (see, for example, [Ref. 5.3]); the method most often used for formatting bit-streams associated with speech into ATM cells is AAL2 (see, for example, [Ref. 5.4, 5.5]); the term AAL0 is used when the rule is proprietary or not known by any devices except the end-points. The fill-in unit is an idle cell, easily distinguished by information in the header. ATM is used in cell-relay networks wherein the transmission across the network encounters ATM switches, permitting the service provider to route the cells to the appropriate destination, providing virtual circuits (VCs). Such “Layer-2” networks based on ATM are quite popular in Europe as well as North America. Again, if the two ends of the bit-stream of the physical medium are both routers (i.e. the bit-stream corresponds to an inter-machine trunk), then the notion of Layer-2 networking is moot.
Layer-2 ATM formatting, with its ability to distinguish cells based on addresses, can be utilized to advantage even in the point-to-point case of inter-machine trunks.
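The fixed-size cell format lends itself to a simple segmentation sketch (highly simplified: zero-padding stands in for the AAL5 trailer and padding rules, and the 5-octet header content, which would carry the VPI/VCI and header error check, is only a placeholder):

```python
CELL_OCTETS = 53
HEADER_OCTETS = 5
PAYLOAD_OCTETS = 48

def segment(packet: bytes, header: bytes) -> list:
    """Slice a packet into fixed-size 53-octet cells (5-octet header +
    48-octet payload). The final cell is zero-padded to a full 48 octets;
    a real AAL5 implementation instead appends a trailer with the packet
    length and a CRC-32."""
    assert len(header) == HEADER_OCTETS
    pad = (-len(packet)) % PAYLOAD_OCTETS
    padded = packet + bytes(pad)
    return [header + padded[i:i + PAYLOAD_OCTETS]
            for i in range(0, len(padded), PAYLOAD_OCTETS)]

cells = segment(bytes(100), b"HDR__")
assert all(len(c) == CELL_OCTETS for c in cells)
assert len(cells) == 3   # 100 payload octets -> 3 cells
```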
The Layer-2 processing block provides the physical medium dependent (PMD) processing block with the outbound data stream. The PMD processing block adds any overhead necessary prior to transmission. For example, the physical medium could be a T1 line and, in this case, the PMD function would comprise the DSU/CSU function, including formatting the 1.536 Mbps from the Layer-2 processing block into a 1.544 Mbps DS1 signal with the addition of either D4 or ESF framing, and outputting the resultant bit-stream as a bipolar AMI/B8ZS signal. The inbound direction can be described in a similar manner.
The PMD processing involves extracting the 1.536 Mbps DS1 payload from the incoming T1 signal and presenting it to the Layer-2 processing block. The packets are extracted from the HDLC frames or ATM cells and are placed in the receive queue from which the packets are extracted by the packet processor block for further processing. Generally speaking, the receive direction does not introduce any detrimental effects related to time-delay variation (i.e. QoS) since the packet processor block is usually sufficiently powerful to handle packets at the rate at which they enter.
Assigning Priority to Packets for QoS
As mentioned above, the transmit queue does not necessarily have to exhibit a FIFO behavior. A suite of protocols at the IP layer have been developed, generically referred to as DiffServ (for differentiated services) that deal with the assignment of priorities to packet streams. Packets associated with streams that require a high QoS are assigned a higher priority than packets associated with a stream for which high QoS is less important. Whereas this functionality could be associated with the packet processor block, it is illustrative to show the functionality of a queue manager explicitly, as depicted in FIG. 6. A comprehensive view of DiffServ can be obtained from [Ref. 3.1 through 3.15].
Whereas FIG. 6 depicts the queue manager block as handling packets in both the transmit as well as receive directions, if the packet processor block can handle incoming packets “in real-time”, the queue manager has less of a role to play in the receive direction. The principal function of the queue manager, in the transmit direction, is to choose the next packet for transmission. The rationale is quite straightforward. The highest priority packet in the queue, assuming the queue is nonempty, is chosen. When used in this manner, the queue is, strictly speaking, not FIFO (so the nomenclature of “queue” may be somewhat of a misnomer) but does provide the mechanism whereby transmission of a packet associated with a high-QoS service can precede the transmission of a lower-QoS packet even though the latter was generated and placed in the buffer earlier than the former.
Many variations of this simple technique can be promulgated, including methods whereby the priority of a packet is (artificially) increased based on the time spent in the queue. Such a mechanism may be required to prevent a high-QoS stream from blocking a low-QoS stream entirely.
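The priority-with-aging mechanism can be sketched as follows (an illustrative model; the priority values and aging rate are arbitrary, chosen only to demonstrate both the queue-jumping and the anti-starvation behavior):

```python
class QueueManager:
    """Non-FIFO transmit queue: hands out the packet with the highest
    effective priority, where effective priority grows with the time
    the packet has spent waiting (prevents starvation of low-QoS streams)."""

    def __init__(self, aging_per_msec=0.0):
        self._queue = []            # entries: (priority, enqueue_time, seq, packet)
        self._seq = 0
        self._aging = aging_per_msec

    def enqueue(self, packet, priority, now_msec):
        self._queue.append((priority, now_msec, self._seq, packet))
        self._seq += 1

    def next_packet(self, now_msec):
        def effective(entry):
            prio, t_in, seq, _ = entry
            # Assigned priority boosted by waiting time; older packet
            # wins ties (negated sequence number).
            return (prio + (now_msec - t_in) * self._aging, -seq)
        best = max(self._queue, key=effective)
        self._queue.remove(best)
        return best[3]

q = QueueManager(aging_per_msec=0.1)
q.enqueue("file-pkt", priority=1, now_msec=0.0)    # low-QoS, queued early
q.enqueue("voice-pkt", priority=5, now_msec=8.0)   # high-QoS, queued later
assert q.next_packet(now_msec=9.0) == "voice-pkt"  # high QoS jumps the queue
q.enqueue("voice-pkt-2", priority=5, now_msec=100.0)
assert q.next_packet(now_msec=100.0) == "file-pkt" # aged packet is not starved
```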
Combination of Priority and Scheduling for Providing QoS
Using a priority-based scheme for providing differentiated services and attempting to maintain a high QoS for streams that require it is now well established and forms the basis for just about every approach for providing the requisite QoS. One technique for enhancing this scheme is based on scheduling. A more complete description of time-scheduling for providing QoS differentiation is provided in U.S. Pat. Nos. 6,038,230; 6,259,695; 6,272,131; 6,272,132; 6,330,236; 6,377,579; and 6,385,198 [Ref. 4.1 through 4.7] and citations therein.
Consider the situation where packet streams are generically considered “high-QoS” and “low-QoS”. The streams classified as high-QoS are typically those associated with a constant bit-rate or low-delay-variation service and we have pointed out earlier that such streams use packets that are usually small and typically very regularly spaced in time. The streams classified as low-QoS are typically those associated with computer-to-computer communication involving (large) file transfers and use packets that are usually large and often irregularly spaced in time (note that the term low-QoS does not imply that the streams require low QoS but, rather, that they can tolerate large time-delay variation). This situation is depicted in FIG. 7. For convenience, only the transmit section is depicted in FIG. 7.
An important element shown in FIG. 7 is the function of a time interval manager block. The time interval manager establishes a periodic interval structure whereby time is “blocked” out into intervals that can be termed “high-priority” and “low-priority”. For example, time may be split into 10 msec intervals and the first 1 msec is considered “high-priority” and the remaining 9 msec is considered “low-priority”.
The simplest scheduling approach for providing differentiated QoS is to restrict the initiation of packet transmission based on interval and priority. In particular, in the high-priority interval, only packets from the high-QoS queue can be initiated. Transmission of packets in the low-QoS queue can be initiated only during the low-priority interval. A refinement of this simple approach that addresses congestion levels is to permit initiation of transmission of packets in the high-QoS queue during both high-priority intervals and low-priority intervals. Further, the scheduling method can be applied in conjunction with the priority approach, whereby the high-QoS and low-QoS queues can in turn have packets of differentiated priorities (again the queues are not necessarily FIFO).
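The interval-based gating described above can be sketched as follows (the 10 msec period and 1 msec high-priority window follow the earlier example; the function names are illustrative):

```python
PERIOD_MSEC = 10.0
HIGH_WINDOW_MSEC = 1.0   # first 1 msec of each period is "high-priority"

def in_high_priority_interval(now_msec: float) -> bool:
    """True during the blocked-out high-priority portion of the period."""
    return (now_msec % PERIOD_MSEC) < HIGH_WINDOW_MSEC

def may_initiate(queue: str, now_msec: float,
                 allow_high_anytime: bool = True) -> bool:
    """Interval-based gating: low-QoS transmissions may be initiated only
    in the low-priority interval; with the refinement described above,
    high-QoS transmissions may be initiated in either interval."""
    if queue == "high":
        return allow_high_anytime or in_high_priority_interval(now_msec)
    return not in_high_priority_interval(now_msec)

assert may_initiate("high", 0.5)       # inside the 1 msec high window
assert not may_initiate("low", 0.5)    # low-QoS must wait
assert may_initiate("low", 5.0)        # low-priority interval
assert may_initiate("high", 5.0)       # refinement: high-QoS anytime
assert not may_initiate("high", 5.0, allow_high_anytime=False)
```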