This invention relates to data networks. In particular, this invention relates to a method and apparatus for smoothing data flow through a switching system used in packetized data networks.
Packetized data networks are relatively well known. Such networks include Ethernet networks, Internet Protocol (IP) networks and asynchronous transfer mode (ATM) networks. The data packets carried by these networks typically have some sort of informational header file or header data to which a data payload is attached or appended. The header is formatted to have embedded within it various information that is used to route the associated payload to an appropriate destination. By way of example, FIG. 1 shows an exemplary depiction of an ATM data packet.
An ATM data packet (100) consists of forty-eight (48) informational payload bytes (102) (identified in FIG. 1 as byte octets numbered from 6-53) preceded by a five (5) byte header block (104) to form a fifty-three (53) byte ATM packet (100). The informational payload bytes (102) represent information from some sort of data source, which might include voice information as part of a telephone call, video programming information, or raw data exchanged between two computers, such as a word processing file. The header block (104) includes addressing information that is read and used by ATM switching systems (not shown), which route the ATM packet (100) through an ATM switching network. Some of the information in the header block (104) enables a switch to determine the next ATM switch to which the packet is to be routed.
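As a minimal sketch of the layout just described, the following Python fragment splits a 53-byte packet into its 5-byte header block and 48-byte payload. The names `split_cell`, `HEADER_BYTES`, and `PAYLOAD_BYTES` are hypothetical and chosen only for illustration; the individual header fields are not decoded.

```python
# Sizes taken from the description of FIG. 1; the names are illustrative.
HEADER_BYTES = 5
PAYLOAD_BYTES = 48
CELL_BYTES = HEADER_BYTES + PAYLOAD_BYTES  # 53 bytes in total


def split_cell(cell):
    """Split one 53-byte ATM packet into (header, payload)."""
    if len(cell) != CELL_BYTES:
        raise ValueError("an ATM packet must be exactly 53 bytes long")
    return cell[:HEADER_BYTES], cell[HEADER_BYTES:]


# Dummy packet whose byte values equal their zero-based positions.
header, payload = split_cell(bytes(range(CELL_BYTES)))
```

Here `header` holds the first five bytes and `payload` the remaining forty-eight, which FIG. 1 numbers 6 through 53.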
FIG. 2 shows exemplary connections to a known prior art ATM switching system 200 as it might be configured in an ATM network. Such a system is typically configured to receive streams of ATM packets from multiple ATM switching systems at the switch input ports 202, 204, 206. When the packets comprising the incoming streams are received, they are routed to a switching fabric 210 from which the packets emerge at one or more ATM packet output ports 212, 214, 216 coupled to different physical transmission paths leading to different ATM switches in the network. Each input port 202, 204, 206 may receive ATM cells that need to pass through any given output port 212, 214, 216. Likewise, each output port 212, 214, 216 may receive ATM cells from any given input port 202, 204, 206.
Depending on the design of the switching fabric, there may be points, referred to here as contention points, where more packets may arrive than may leave in a given time interval. At these points, buffers are used to store the data until it can be forwarded. If too much data must be stored, one of these buffers may overflow and data is lost. For example, data packets can be lost in the switching system 200 if too many data packets destined for the same output port 212, 214, 216 are sent into the switch fabric 210 too quickly from the input ports 202, 204, 206. Fixed length internal data packets are forwarded from contention points at a fixed rate. If multiple internal data packets converge on any point in the switch fabric faster than they can be forwarded, then some of them must be queued in the switch fabric 210. The switch fabric's ability to buffer or queue data packets is limited, however. If too many internal data packets need to be queued and any of the limited switch fabric buffers become full, additional data packets that cannot be queued are deleted.
To avoid overflowing a given buffer in the switch fabric, the amount of data arriving at the associated contention point must not exceed the amount of data leaving the contention point by an amount greater than the buffer size when measured over all time intervals.
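This condition can be written compactly. The symbols below are introduced here for illustration only and do not appear elsewhere in this description: let A(t1, t2) be the amount of data arriving at the contention point during the interval from t1 to t2, D(t1, t2) the amount of data leaving it during the same interval, and B the buffer size.

```latex
A(t_1, t_2) - D(t_1, t_2) \le B \qquad \text{for all } t_1 < t_2
```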
The rate of a data flow is defined as the amount of data sent divided by the time interval in which this data was sent. We use the term "steady rate" for a data flow that produces approximately the same rate when measured over any time interval, both short and long. A "bursty rate" is one where the rate of the data flow may vary significantly depending on whether the time interval is long or short. For a contention point with a specific buffer size, a buffer overflow can be avoided if each input port 202, 204, 206 sends packets to the contention point at a steady rate such that the sum of the packet input rates is equal to or less than the packet output rate from the contention point. This must be true for each of the system's contention points. If an input port 202, 204, 206 is assigned a rate at which it may send data packets to a contention point so as to avoid buffer overflow, it should send data packets at a steady pace that is at or below the given rate. If an input port 202, 204, 206 sends packets to a contention point in a bursty fashion (i.e., it sends many packets to the point in a short period of time), then the instantaneous data packet rate is significantly greater than the average rate and switch fabric buffers might overflow. It is therefore important that an input port 202, 204, 206 send data packets to each of the contention points at steady rates by spacing out the packets sent to each contention point as evenly as possible. Deciding when any given input port should send a packet to any given contention point, a process known as scheduling, is vital to the performance of a switch.
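The difference between a steady and a bursty flow can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the contention point is modeled as a single queue that forwards one packet per time step, and the function name `simulate_contention_point`, the buffer size of four packets, and the two arrival patterns are hypothetical values chosen for the example.

```python
def simulate_contention_point(arrivals, buffer_size, forwarded_per_step=1):
    """Return (dropped, peak_queue) for a sequence of per-step arrival counts.

    Assumed model: packets arriving in a step are queued first, packets
    that do not fit in the buffer are dropped, and then the contention
    point forwards up to `forwarded_per_step` packets.
    """
    queue = 0
    dropped = 0
    peak = 0
    for arriving in arrivals:
        queue += arriving
        if queue > buffer_size:
            dropped += queue - buffer_size  # overflow: excess packets are lost
            queue = buffer_size
        peak = max(peak, queue)
        queue -= min(queue, forwarded_per_step)
    return dropped, peak


# Both patterns deliver 16 packets in 16 steps (same average rate).
steady = [1] * 16                # one packet in every step
bursty = ([8] + [0] * 7) * 2     # two bursts of eight packets

steady_result = simulate_contention_point(steady, buffer_size=4)
bursty_result = simulate_contention_point(bursty, buffer_size=4)
```

Under these assumptions the steady flow never queues more than one packet and loses none, whereas the bursty flow fills the four-packet buffer and loses eight of its sixteen packets, even though both flows have the same average rate.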
An improved methodology for scheduling data packets to be sent into the switch fabric in a computationally efficient manner so as to reduce or eliminate the probability of buffer overflow at switch fabric contention points would be an improvement over the prior art.
In many packet switching systems, data is moved internally from input ports to output ports in fixed length internal data packets. Since all internal packets are the same length, it takes each input port the same amount of time to send any internal packet into the switch fabric. Time, therefore, is broken into "time slots" (or just "slots"), with each time slot being the time it takes to send an internal packet from any input port into the switch fabric. In any time slot, each of the input ports may send an internal packet into the switch fabric. If an input port does not send an internal packet in a given time slot (perhaps because it has no internal packets to send or it does not have permission to send an internal packet it has), it must wait until the next time slot to send an internal packet.
Time slots may be grouped into larger units of time known as "frames." Each frame consists of "T" time slots. Packet switching systems frequently use buffers in the switch fabric or operatively coupled to the switch fabric to temporarily store internal packets as they make their way from the input ports to the output ports through the switch fabric. Because the data buffers holding internal packets in the switch fabric can be overrun if data arrives too fast, scheduling internal data packets into the switch fabric so that internal packets do not overrun a buffer can become critically important. The data buffering requirements of the switch fabric can be affected by how data is scheduled into the switch. By appropriately scheduling data into the switch, buffering requirements can be reduced. Data loss can also be reduced or even eliminated.
There is provided herein a computationally simple and efficient method for scheduling data out of an input data buffer and into a switch fabric. When used in connection with a packet switching system, switching fabric buffer overruns and the associated data loss are reduced.
Using random access memory, for example, an input data structure known as an "assignment list" is created in each input port. The data structure stores pointers (known as vectors, ensemble identifiers, or ensemble IDs). Each ensemble ID is associated with a group of traffic that passes through the input port. More formally, an "ensemble" consists of all of the packets or cells passing through an input port that share some important, predetermined attributes defined by the switch operator or some other entity. Each ensemble ID is associated with an ensemble. Each ensemble ID in the assignment list represents the permission to send into the switch fabric a single fixed-length internal data packet from the ensemble associated with the ensemble ID. It is the responsibility of a local scheduler to decide which specific internal packet of an ensemble to send into the switch fabric when that ensemble has permission to forward an internal packet.
One of the attributes that defines an ensemble is that all of the internal packets of an ensemble must pass through a specific contention point in the switch fabric. Additional attributes may further define an ensemble. Other attributes include, but are not limited to, quality of service class of the data, final network destination of the data, point of origin of the data, and type of application sending or receiving the data. It is possible for an internal packet to belong to two or more ensembles but every internal packet should belong to at least one ensemble.
In our embodiment we only consider the contention point attribute. Specifically, the contention points of interest are the links from the switch fabric to the output ports. Thus, each ensemble is defined by the output port through which all of the ensemble's traffic will pass. The result is that internal packets are classified and scheduled based on the switch's output ports.
A global scheduler or other controller for the switching system assigns or designates, for each input port, the right to send certain numbers of fixed-length internal data packets from the various ensembles during a frame. A local scheduler for each input port creates a list of the ensemble IDs (the assignment list). The ensemble IDs identify the ensembles from which internal packets can be sent into the switch fabric. The length of the assignment list is equal to T, the number of time slots in a frame. The number of ensemble IDs in the assignment list is equal to the number of internal packets the input port has permission to send in the upcoming frame. If the input port has permission to send fewer than T internal packets in the upcoming frame, then some of the assignment list entry locations will not contain ensemble IDs. If the input port has permission to send more than T internal packets in the upcoming frame, then some of the assignment list entry locations may contain more than one ensemble ID.
After the global scheduler indicates how many packets from each ensemble may be sent in the upcoming frame, the local scheduler places the permitted number of ensemble IDs for a given ensemble consecutively in the first open entry locations in the assignment list according to some ordering of the ensembles. For example, consider the case where the destination output port is the only attribute of the ensembles. If an input port has been granted permission to send four internal packets to output A and two internal packets to output B in an upcoming frame, then the input port creates four "output A" ensemble IDs and two "output B" ensemble IDs. The four "output A" ensemble IDs are placed in the first four entry locations in the assignment list, numbered 0 through 3. The two "output B" ensemble IDs are placed in the next two slots, numbered 4 and 5.
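The filling of the assignment list in this example can be sketched in Python as follows. This is a minimal illustration, not the claimed implementation: the function name `build_assignment_list`, the representation of each entry location as a Python list, and the frame length of eight time slots are assumptions made for the example. In particular, wrapping around to the start of the list when more than T packets are granted is an assumed policy; the text above says only that some entry locations may then hold more than one ensemble ID.

```python
def build_assignment_list(grants, frame_slots):
    """Place ensemble IDs consecutively into an assignment list of length T.

    grants      -- ordered (ensemble_id, packet_count) pairs
    frame_slots -- T, the number of time slots in a frame
    """
    # One entry location per time slot; an empty list means "no ensemble ID".
    assignment_list = [[] for _ in range(frame_slots)]
    position = 0
    for ensemble_id, packet_count in grants:
        for _ in range(packet_count):
            # ASSUMPTION: wrap around once all T entry locations are used.
            assignment_list[position % frame_slots].append(ensemble_id)
            position += 1
    return assignment_list


# Four packets granted to output A, two to output B, assumed T = 8.
example_list = build_assignment_list([("A", 4), ("B", 2)], frame_slots=8)
```

With these inputs, entry locations 0 through 3 each hold an "A" ensemble ID, locations 4 and 5 each hold a "B" ensemble ID, and locations 6 and 7 remain empty.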
The input port's opportunities to transmit internal packets into the switch fabric are enumerated sequentially in binary form using a fixed number of bits starting from zero. To determine which ensemble an input port should send a packet from in a given transmission slot, the bits of the binary number of the transmission slot are reversed such that the least significant bit becomes the most significant bit, the second least significant bit becomes the second most significant bit, and so on. This new number is then used as an index to the assignment list and the ensemble IDs in this entry indicate the ensemble of the internal packet to be sent. The local scheduler then chooses a packet from the ensemble and sends it into the switch fabric during the transmission slot.
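The bit-reversed lookup can be sketched as follows. The sketch assumes that the frame length T is a power of two, so that the fixed number of bits is log2(T); the function names `reverse_bits` and `transmission_order` are hypothetical, and `None` is used here to mark an entry location that holds no ensemble ID.

```python
def reverse_bits(value, width):
    """Reverse the lowest `width` bits of `value`."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)  # move the current LSB to the top
        value >>= 1
    return result


def transmission_order(assignment_list):
    """Return the entry selected for each transmission slot of one frame."""
    slots = len(assignment_list)
    # ASSUMPTION: T is a power of two, so each slot number has log2(T) bits.
    if slots < 1 or slots & (slots - 1):
        raise ValueError("assignment list length must be a power of two")
    width = slots.bit_length() - 1
    return [assignment_list[reverse_bits(slot, width)] for slot in range(slots)]


# Assignment list for four "A" and two "B" ensemble IDs, assumed T = 8.
assignment_list = ["A", "A", "A", "A", "B", "B", None, None]
order = transmission_order(assignment_list)
# order == ["A", "B", "A", None, "A", "B", "A", None]
```

In this example, transmission slot 1 (binary 001) reverses to index 4 (binary 100) and therefore selects a "B" ensemble ID. Although the ensemble IDs were placed consecutively in the assignment list, the packets for each output end up spread across the frame rather than sent back to back.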