The present invention relates generally to data communications networks and more particularly relates to a method of distributing network resources fairly among users establishing calls in an ATM network.
Currently, there is a growing trend to make Asynchronous Transfer Mode (ATM) networking technology the base of future global communications. ATM has already been adopted as a standard for broadband communications by the International Telecommunication Union (ITU) and by the ATM Forum, a networking industry consortium.
ATM originated as a telecommunication concept defined by the Comité Consultatif International Télégraphique et Téléphonique (CCITT), now known as the ITU, and the American National Standards Institute (ANSI) for carrying user traffic on any User to Network Interface (UNI) and to facilitate multimedia networking between high speed devices at multi-megabit data rates. ATM is a method for transferring network traffic, including voice, video and data, at high speed. Using this connection oriented switched networking technology centered around a switch, a great number of virtual connections can be supported by multiple applications through the same physical connection. The switching technology enables bandwidth to be dedicated for each application, overcoming the problems that exist in shared media networking technologies such as Ethernet, Token Ring and Fiber Distributed Data Interface (FDDI). ATM allows different types of physical layer technology to share the same higher layer, the ATM layer.
ATM uses very short, fixed length packets called cells. The first five bytes, called the header, of each cell contain the information necessary to deliver the cell to its destination. The cell header also provides the network with the ability to implement congestion control and traffic management mechanisms. The fixed length cells offer smaller and more predictable switching delays as cell switching is less complex than variable length packet switching and can be accomplished in hardware for many cells in parallel. The cell format also allows for multi-protocol transmissions. Since ATM is protocol transparent, the various protocols can be transported at the same time. With ATM, phone, fax, video, data and other information can be transported simultaneously.
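The five byte header described above can be illustrated with a short sketch. The text does not give the field layout; the layout used here (GFC, VPI, VCI, PT, CLP, HEC) is the standard UNI cell header format, and the 53 byte total cell size is likewise taken from the ATM standard rather than from the text. The names `UniCellHeader` and `parse_uni_header` are hypothetical.

```python
from dataclasses import dataclass

CELL_SIZE = 53    # bytes in one ATM cell (standard value, not stated in the text)
HEADER_SIZE = 5   # bytes of header; the remaining 48 bytes carry payload


@dataclass
class UniCellHeader:
    gfc: int  # Generic Flow Control, 4 bits
    vpi: int  # Virtual Path Identifier, 8 bits at the UNI
    vci: int  # Virtual Channel Identifier, 16 bits
    pt: int   # Payload Type, 3 bits
    clp: int  # Cell Loss Priority, 1 bit
    hec: int  # Header Error Control, 8 bits


def parse_uni_header(cell: bytes) -> UniCellHeader:
    """Split the first five bytes of a cell into the UNI header fields."""
    if len(cell) != CELL_SIZE:
        raise ValueError("an ATM cell is exactly 53 bytes")
    b0, b1, b2, b3, b4 = cell[:HEADER_SIZE]
    return UniCellHeader(
        gfc=b0 >> 4,
        vpi=((b0 & 0x0F) << 4) | (b1 >> 4),
        vci=((b1 & 0x0F) << 12) | (b2 << 4) | (b3 >> 4),
        pt=(b3 >> 1) & 0x07,
        clp=b3 & 0x01,
        hec=b4,
    )
```

Because every cell has the same length and the same header layout, the parsing step is a fixed sequence of shifts and masks, which is one reason cell switching lends itself to hardware implementation.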
ATM is a connection oriented transport service. To access the ATM network, a station requests a virtual circuit between itself and other end stations, using the signaling protocol to the ATM switch. ATM provides the User Network Interface (UNI) which is typically used to interconnect an ATM user with an ATM switch that is managed as part of the same network.
The current standard solution for routing in a private ATM network is described in Private Network Node Interface (PNNI) Phase 0 and Phase 1 specifications published by the ATM Forum. The previous Phase 0 draft specification is referred to as Interim Inter-Switch Signaling Protocol (IISP). The goal of the PNNI specifications is to provide customers of ATM network equipment some level of multi-vendor interoperability.
As part of the ongoing enhancement to the ATM standard by work within the ATM Forum and other groups, the Private Network to Network Interface (PNNI) protocol Phase 1 has been developed for use between private ATM switches and between groups of private ATM switches. The PNNI specification includes two categories of protocols. The first protocol is defined for the distribution of topology information between switches and clusters of switches where the information is used to compute routing paths within the network. The main feature of the PNNI hierarchy mechanism is its ability to automatically configure itself within the networks in which the address structure reflects the topology. The PNNI topology and routing techniques are based on the well-known link state routing technique.
The second protocol is effective for signaling, i.e., the message flows used to establish point-to-point and point-to-multipoint connections across the ATM network. This protocol is based on the ATM Forum User to Network Interface (UNI) signaling with mechanisms added to support source routing, crankback and alternate routing of source SETUP requests in the case of bad connections.
With reference to the PNNI Phase 1 specifications, the PNNI hierarchy begins at the lowest level where the lowest level nodes are organized into peer groups. A logical node in the context of the lowest hierarchy level is the lowest level node. A logical node is typically denoted as simply a node. A peer group is a collection of logical nodes wherein each node within the group exchanges information with the other members of the group such that all members maintain an identical view of the group. When a logical link becomes operational, the nodes attached to it initiate and exchange information via a well known Virtual Channel Connection (VCC) used as a PNNI Routing Control Channel (RCC).
Hello messages are sent periodically by each node on this link. In this fashion the Hello protocol makes the two neighboring nodes known to each other. Each node exchanges Hello packets with its immediate neighbors to determine the local state information of its neighbor. The state information comprises the identity and peer group membership of the immediate neighbors of the node including a status of all its links to its neighbors. Each node then bundles its state information in one or more PNNI Topology State Elements (PTSEs) which are subsequently flooded throughout the peer group.
PTSEs are the smallest collection of PNNI routing information that is flooded as a unit among all logical nodes within a peer group. A node topology database consists of a collection of all PTSEs received, which represents the present view of the PNNI routing topology of a particular node. In particular, the topology database provides all the information required to compute a route from the given source node to any destination address reachable in or through that routing domain.
When neighboring nodes at either end of a logical link begin initializing through the exchange of Hellos, they may conclude that they are in the same peer group. If it is concluded that they are in the same peer group, they proceed to synchronize their topology databases. Database synchronization includes the exchange of information between neighboring nodes resulting in the two nodes having identical topology databases. A topology database includes detailed topology information about the peer group in which the logical node resides in addition to more abstract topology information representing the remainder of the PNNI routing domain.
During a topology database synchronization, the nodes in question first exchange PTSE header information, i.e., they advertise the presence of PTSEs in their respective topology databases. When a node receives PTSE header information that advertises a more recent PTSE version than the one it already has, or advertises a PTSE that it does not yet have, it requests the advertised PTSE and updates its topology database with the subsequently received PTSE. If a newly initialized node connects to a peer group, then the ensuing database synchronization reduces to a one way topology database copy. A link is advertised by a PTSE transmission only after the database synchronization between the respective neighboring nodes has successfully completed. In this fashion, the link state parameters are distributed to all topology databases in the peer group.
Flooding is the mechanism used for advertising links whereby PTSEs are reliably propagated node by node throughout a peer group. Flooding ensures that all nodes in a peer group maintain identical topology databases. A short description of the flooding procedure follows. PTSEs are encapsulated within PNNI Topology State Packets (PTSPs) for transmission. When a PTSP is received, its component PTSEs are examined. Each PTSE is acknowledged by encapsulating information from its PTSE header within the acknowledgment packet that is sent back to the sending neighbor.
If the PTSE is new or is more recent than the current copy in the node, the PTSE is installed in the topology database and flooded to all neighboring nodes except the one from which the PTSE was received. A PTSE sent to a neighbor is periodically retransmitted until acknowledged.
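The flooding rule described above can be sketched as follows. This is a simplified illustration, not the PNNI procedure itself: recency is reduced to a single sequence number, delivery is assumed reliable so that retransmission is omitted, and the names `Ptse`, `Node` and `flood` are hypothetical.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Ptse:
    origin: str        # ID of the originating node
    ptse_id: int       # identifier unique per originating node
    seq: int           # higher means more recent (simplifying assumption)
    payload: str = ""


@dataclass
class Node:
    name: str
    neighbors: list = field(default_factory=list)  # names of adjacent nodes
    database: dict = field(default_factory=dict)   # (origin, ptse_id) -> Ptse

    def receive(self, ptse, sender):
        """Handle one PTSE from `sender`; return (ack, nodes to forward to)."""
        key = (ptse.origin, ptse.ptse_id)
        ack = (ptse.origin, ptse.ptse_id, ptse.seq)  # header info echoed back
        current = self.database.get(key)
        if current is not None and current.seq >= ptse.seq:
            return ack, []                 # not newer: acknowledge only
        self.database[key] = ptse          # install the new or newer PTSE
        return ack, [n for n in self.neighbors if n != sender]


def flood(nodes, start, ptse):
    """Originate `ptse` at `start` and propagate it until no node forwards."""
    nodes[start].database[(ptse.origin, ptse.ptse_id)] = ptse
    queue = deque((start, n) for n in nodes[start].neighbors)
    while queue:
        sender, receiver = queue.popleft()
        _ack, forward_to = nodes[receiver].receive(ptse, sender)
        queue.extend((receiver, n) for n in forward_to)
```

Flooding terminates because a node forwards a given PTSE instance only the first time it installs it; later copies are acknowledged and dropped.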
Note that flooding is an ongoing activity wherein each node issues PTSPs with PTSEs that contain updated information. The PTSEs contained in the topology databases are subject to aging and are removed after a predefined duration if they are not refreshed by a new incoming PTSE. Only the node that originated a particular PTSE can re-originate that PTSE. PTSEs are reissued both periodically and on an event driven basis.
The database exchange process involves exchanging a sequence of database summary packets that contain the identifying information of all PTSEs in a node topology database. The database summary packets are exchanged using a lock step mechanism whereby one side sends a database summary packet and the other side responds with its own database summary packet, thus acknowledging the received packet.
When a node receives a database summary packet from its neighboring peer, it first examines its topology database for the presence of each PTSE described within the packet. If the particular PTSE is not found in its topology database or if the neighboring peer has a more recent version of the PTSE then the node requests the PTSE from the particular neighboring peer or optionally from another neighboring peer whose database summary indicates that it has the most recent version of the PTSE.
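The comparison described above can be sketched in a few lines. As a simplifying assumption, both the local topology database and the received summary are modeled as mappings from a PTSE key to a sequence number, with a higher number meaning a more recent instance; the function name `build_request_list` is hypothetical.

```python
def build_request_list(local_db, summary):
    """Return the PTSE keys that must be requested from the neighboring peer.

    local_db: {ptse_key: sequence_number} for PTSEs held locally
    summary:  {ptse_key: sequence_number} as listed in the peer's summary
    """
    return [
        key
        for key, peer_seq in summary.items()
        if key not in local_db or local_db[key] < peer_seq
    ]


# Example: one missing PTSE, one stale PTSE, one up to date PTSE.
local = {("A", 1): 3, ("B", 1): 2}
peer_summary = {("A", 1): 3, ("B", 1): 5, ("C", 1): 1}
requests = build_request_list(local, peer_summary)
```

In this example `requests` contains the keys for the stale PTSE from B and the missing PTSE from C, while the PTSE from A is skipped because the local copy is already current.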
A corresponding neighboring peer data structure is maintained by the nodes located on either side of the link. The neighboring peer data structure includes information required to maintain database synchronization and flooding to neighboring peers.
It is assumed that both nodes on either side of the link begin in the Neighboring Peer Down state. This is the initial state of the neighboring peer state machine. This state indicates that there are no active links to the neighboring peer. In this state, there are no adjacencies associated with the neighboring peer either. When the link reaches the point in the Hello protocol where both nodes are able to communicate with each other, the event AddPort is triggered in the corresponding neighboring peer state machine. Similarly, when a link falls out of communication between the two nodes, the event DropPort is triggered in the corresponding neighboring peer state machine. The database exchange process commences with the AddPort event, which is triggered only after the first link between the two neighboring peers is up. When the DropPort event for the last link between the neighboring peers occurs, the neighboring peer state machine internally generates the DropPortLast event, causing all state information for the neighboring peer to be cleared.
It is while in the Negotiating state that the first step is taken in creating an adjacency between two neighboring peer nodes. During this step it is decided which node is the master and which is the slave, and it is also in this state that an initial Database Summary (DS) sequence number is decided upon. Once the negotiation has been completed, the Exchanging state is entered. In this state the node describes its topology database to the neighboring peer by sending database summary packets to it.
After the peer processes the database summary packets, the missing or updated PTSEs can then be requested. In the case of logical group nodes, those portions of the topology database that were originated or received at the level of the logical group node or at higher levels are included in the database summary. The PTSP and PTSE header information of each such PTSE is listed in one of the node's database summary packets. PTSEs for which new instances are received after the Exchanging state has been entered may not be included in any database summary packet since they will be handled by the normal flooding procedures.
The incoming database summary packet on the receive side is associated with a neighboring peer via the interface over which it was received. Each database summary packet has a database summary sequence number that is implicitly acknowledged. For each PTSE listed, the node looks up the PTSE in its database to see whether it also has an instance of that particular PTSE. If it does not, or if the database copy is less recent, then the node either re-originates the newer instance of the PTSE or flushes the PTSE from the routing domain after installing it in the topology database with a remaining lifetime set accordingly.
Alternatively, if the listed PTSE has expired, the PTSP and PTSE header contents in the PTSE summary are accepted as a newer or updated PTSE with empty contents. If the listed PTSE is not found in the topology database in the node, the particular PTSE is put on the PTSE request list so it can be requested from a neighboring peer via one or more PTSE request packets.
If the PTSE request list from a node is empty, the database synchronization is considered complete and the node moves to the Full state.
However, if the PTSE request list is not empty, the Loading state is entered once the last database summary packet has been sent. At this point, the node knows which PTSEs need to be requested. The PTSE request list contains those PTSEs that need to be obtained in order to synchronize the topology database with the topology database of the neighboring peer. To request these PTSEs, the node sends a PTSE request packet which contains one or more entries from the PTSE request list. PTSE request packets are only sent during the Exchanging state and the Loading state. The node can send a PTSE request packet to a neighboring peer and optionally to any other neighboring peers that are also in either the Exchanging state or the Loading state and whose database summaries indicate that they have the missing PTSEs.
The received PTSE request packets specify a list of PTSEs that the neighboring peer wishes to receive. For each PTSE specified in the PTSE request packet, the node looks up the corresponding instance in its topology database. The requested PTSEs are subsequently bundled into PTSPs and transmitted to the neighboring peer. Once the last PTSE on the PTSE request list has been received, the node moves from the Loading state to the Full state. Once the Full state has been reached, the node has received all PTSEs known to be available from its neighboring peer and links to the neighboring peer can now be advertised within PTSEs.
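The progression through the neighboring peer states described above can be summarized as a small transition function. This is a reduced sketch: only the states named in the text are modeled, error events are omitted, and the event strings (`add_port`, `negotiation_done`, `summaries_done`, `loading_done`, `drop_port_last`) are illustrative labels chosen for this example rather than names taken from the specification.

```python
from enum import Enum


class State(Enum):
    NP_DOWN = "Neighboring Peer Down"
    NEGOTIATING = "Negotiating"
    EXCHANGING = "Exchanging"
    LOADING = "Loading"
    FULL = "Full"


def next_state(state, event, request_list_empty=True):
    """Return the state that follows `state` when `event` occurs."""
    if event == "drop_port_last":      # last link to the peer went down
        return State.NP_DOWN
    if state is State.NP_DOWN and event == "add_port":
        return State.NEGOTIATING
    if state is State.NEGOTIATING and event == "negotiation_done":
        return State.EXCHANGING
    if state is State.EXCHANGING and event == "summaries_done":
        # An empty PTSE request list means synchronization is complete.
        return State.FULL if request_list_empty else State.LOADING
    if state is State.LOADING and event == "loading_done":
        return State.FULL
    return state                       # event not relevant in this state
```

The only branch in the sequence is at the end of the Exchanging state, where the contents of the PTSE request list decide whether the Loading state is needed at all.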
A major feature of the PNNI specification is the routing algorithm used to determine a path for a call from a source user to a destination user. The routing algorithm of PNNI is a type of link state routing algorithm whereby each node is responsible for meeting its neighbors and learning their identities. Nodes learn about each other via the flooding of PTSEs described hereinabove. Each node computes routes to each destination user using the information received via the PTSEs to form a topology database representing a view of the network.
Using the Hello protocol and the related FSM of PNNI, neighboring nodes learn about each other by transmitting a special Hello message over the link. This is done on a continual periodic basis. When a node generates a new PTSE, the PTSE is flooded to the other nodes within its peer group. This permits each node to maintain an up to date view of the network.
Once the topology of the network is learned by all the nodes in the network, routes can be calculated from source to destination users. A routing algorithm that is commonly used to determine the optimum route from a source node to a destination node is the Dijkstra algorithm.
The Dijkstra algorithm is used to generate the Designated Transit List, which is the routing list used by each node in the path during the setup phase of the call. Used in the algorithm are the topology database (link state database), which includes the PTSEs received from each node, a Path List comprising a list of nodes for which the best path from the source node has been found, and a Tentative List comprising a list of nodes whose paths are only possibly the best. Once it is determined that a path is in fact the best possible, the corresponding node is moved from the Tentative List to the Path List.
The algorithm begins by using the source node (self) as the root of a tree followed by the placement of the source node ID onto the Path List. Next, for each node N that is placed in the Path List, the nearest neighbors of N are examined. For each neighbor M, add the cost of the path from the root to N to the cost of the link from N to M. If M is not already in the Path List or the Tentative List with a better path cost, add M to the Tentative List.
If the Tentative List is empty, terminate the algorithm. Otherwise, find the entry in the Tentative List with the minimum cost. Move that entry to the Path List and repeat the examination step described above.
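The Path List and Tentative List procedure described above can be sketched as follows. The sketch assumes the topology is available as a mapping of per link costs with a single additive cost; the function names `shortest_paths` and `route_to` are hypothetical, and the returned hop list only stands in for a Designated Transit List.

```python
def shortest_paths(links, source):
    """Dijkstra with explicit Path and Tentative lists.

    links: {node: {neighbor: link_cost}}
    Returns {node: (path_cost, previous_node)} for every reachable node.
    """
    path = {source: (0, None)}   # Path List: best path is known
    tentative = {}               # Tentative List: best path found so far
    current = source
    while True:
        cost_to_current = path[current][0]
        for neighbor, link_cost in links.get(current, {}).items():
            if neighbor in path:
                continue         # already finalized
            candidate = cost_to_current + link_cost
            if neighbor not in tentative or candidate < tentative[neighbor][0]:
                tentative[neighbor] = (candidate, current)
        if not tentative:
            return path          # nothing left to examine
        # Move the minimum cost entry from the Tentative List to the Path List.
        current = min(tentative, key=lambda n: tentative[n][0])
        path[current] = tentative.pop(current)


def route_to(path, destination):
    """Walk the previous-node pointers back to the source."""
    hops = []
    node = destination
    while node is not None:
        hops.append(node)
        node = path[node][1]
    return hops[::-1]


links = {
    "A": {"B": 1, "C": 4},
    "B": {"A": 1, "C": 2, "D": 5},
    "C": {"A": 4, "B": 2, "D": 1},
    "D": {"B": 5, "C": 1},
}
best = shortest_paths(links, "A")
route = route_to(best, "D")      # ['A', 'B', 'C', 'D'] with total cost 4
```

In the example the direct link from A to C costs 4, but the path through B costs 3, so the entry for C in the Tentative List is improved before C is moved to the Path List.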
The ATM PNNI specification provides for a topological hierarchy that can extend up to 10 levels. The hierarchy is built from the lowest upward with the lowest level representing the physical network. A node in the lowest level represents just itself and no other nodes. Nodes in the upper levels, i.e., two through ten, are represented by what are known as logical nodes. A logical node does not exist physically but is an abstraction of a node. A logical node represents an entire peer group but at a higher level in the hierarchy.
A complex node representation is used to represent the aggregation of nodes in a peer group at the level of the logical node. The metrics, attributes and/or parameters (hereinafter referred to simply as metrics) of the links and nodes within the peer group are represented in summarized form. This permits peer groups with large numbers of nodes and links to be represented in a simple fashion.
In actuality, one of the physical nodes making up a peer group is given the task of instantiating the logical group node. Normally, the physical node (located in the child peer group of the logical group node to be instantiated) assigned this task is the peer group leader (PGL). Thus, the node designated the PGL is required to commit network and computing resources to run the logical group node functions, maintain one or more SVCC-based RCCs, etc. in addition to providing computing resources to run the functions of a normal physical node, i.e., routing, signaling, Hello FSM protocol, etc.
In the majority of networks today, the concept of Client/Server is widely used to provide one or more services to a large number of users. In this scenario, the Server, which resides on one or more computers, provides one or more services to clients which may be located anywhere in the network. In most cases, there is no synchronization between the clients and the server with regard to the order in which the connections to the server are established. Therefore, the clients are connected to the server on a first come, first served basis, i.e., the clients that come first are the first to be connected.
In an ATM network using quality of service parameters and optimization metrics/attributes, the first clients to establish connections will have better optimized paths. The clients that establish connections later on will have connections that are worse in quality than those of the clients that arrived before them. No way currently exists to fairly distribute the available routes over time. As an example, consider delay as the optimization metric. In this case, all the initial clients will have bandwidth allocated to them on the lowest delay paths while the later arriving clients will establish paths that are not as optimized and lower in quality of service. Worse still, the clients that come first may remain connected for long lengths of time. Depending on the policy in effect for the network, this is an unfair situation whereby clients that arrive early establish connections with more optimum quality of service than later arriving clients simply because they were there first.
The present invention is a method of distributing network resources fairly in an ATM network. The invention is applicable in situations where a large number of clients establish connections to a single server. The method solves the problem that arises when early arriving clients establish optimum connections and snatch up network resources while leaving later arriving clients with less than optimum connections. The method provides a way to more evenly distribute network resources when large numbers of clients desire connections to a popular server.
A fair percentage value is assigned to a popular server and is advertised in a PTSE. The fair percentage value is associated with a reachable ATM destination address and is flooded throughout the network. When a source node desires to connect to this server, the route is chosen in accordance with the fair percentage value. Rather than always choosing the best route, the source node chooses a route at random from among a percentage of the best routes.
For example, if the fair percentage value is 70%, then the source node chooses a route at random from the best 70% of the routes calculated. In this way, the optimum routes are distributed randomly to clients requesting connections. Now, the first come clients do not necessarily receive the best connections.
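The selection rule in the example above can be sketched as follows. The sketch assumes the candidate routes have already been calculated and are supplied as (cost, route) pairs; rounding the pool size up and always keeping at least one route are assumptions, since the text does not say how a fractional count is handled. The function name `choose_fair_route` is hypothetical.

```python
import math
import random


def choose_fair_route(routes, fair_percentage, rng=random):
    """Pick one route at random from the best `fair_percentage` percent.

    routes: list of (cost, route) pairs, lower cost being better
    fair_percentage: value in the range (0, 100]
    rng: object providing choice(), e.g. the random module or random.Random
    """
    if not routes:
        raise ValueError("no routes to choose from")
    if not 0 < fair_percentage <= 100:
        raise ValueError("fair percentage must be in (0, 100]")
    ranked = sorted(routes, key=lambda r: r[0])
    # Assumption: round the pool size up and never let it drop below one.
    pool_size = max(1, math.ceil(len(ranked) * fair_percentage / 100))
    return rng.choice(ranked[:pool_size])


# Ten candidate routes with costs 1..10 and a fair percentage value of 70%:
candidates = [(cost, f"route-{cost}") for cost in range(1, 11)]
chosen = choose_fair_route(candidates, 70, random.Random(0))
```

With ten candidates and a value of 70%, the pool holds the seven lowest cost routes, so the three worst routes are never chosen while each of the seven best is equally likely. A value of 100% makes every route equally likely.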
The switches can be configured with the fair percentage value either statically via a network manager or dynamically under program control. In the former case, the network manager identifies popular servers and assigns a fair percentage value by hand. In the latter case, the software tracks the call rate to attached servers/hosts and when the rate exceeds a threshold having hysteresis, a fair percentage value is assigned and advertised via PTSE flooding. The hysteresis comprises an upper and lower threshold. When the upper threshold is exceeded, advertising is initiated. When the call rate decreases below a lower threshold, advertising may cease completely or a new fair percentage value may be determined in accordance with the current call rate.
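The dynamic case with hysteresis can be sketched as a small state holder. This illustration implements only the variant in which advertising ceases below the lower threshold; the alternative of recomputing the value from the current call rate is not shown. The class name `FairPercentageAdvertiser`, its parameters and the threshold values in the example are hypothetical.

```python
class FairPercentageAdvertiser:
    """Decide whether to advertise a fair percentage value for one address."""

    def __init__(self, upper, lower, fair_percentage):
        if lower >= upper:
            raise ValueError("lower threshold must be below upper threshold")
        self.upper = upper                      # calls per unit time
        self.lower = lower                      # calls per unit time
        self.fair_percentage = fair_percentage
        self.advertising = False

    def update(self, call_rate):
        """Return the value to advertise, or None when nothing is advertised."""
        if not self.advertising and call_rate > self.upper:
            self.advertising = True             # start advertising
        elif self.advertising and call_rate < self.lower:
            self.advertising = False            # cease advertising
        return self.fair_percentage if self.advertising else None


advertiser = FairPercentageAdvertiser(upper=100, lower=60, fair_percentage=70)
history = [advertiser.update(rate) for rate in [50, 90, 120, 80, 65, 59, 90]]
```

In the example a rate of 90 does not trigger advertising on its own, but once the rate has exceeded 100 the value stays advertised at rates of 80 and 65 and is withdrawn only when the rate falls below 60. The gap between the two thresholds keeps the advertisement from toggling when the call rate hovers near a single limit.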
There is provided in accordance with the present invention a method of fairly distributing network resources within a network when establishing calls between a plurality of clients and a server, the method comprising the steps of identifying a server to receive fair resource distribution services when establishing calls from a plurality of clients thereto, assigning, by a destination node connected to the server, a fair percentage value to a reachable address associated with the server, advertising the address along with the fair percentage value to nodes within the network, calculating all possible routes between a source node and a destination node, the destination node corresponding to the reachable address, choosing at random one route from among the best routes calculated, wherein the number of routes used to randomly select from is equal to the fair percentage value of all possible routes and establishing the chosen route.
The network may comprise an Asynchronous Transfer Mode (ATM) network or an Internet Protocol (IP) based network. The step of advertising the address comprises placing the address in a Type, Length, Value (TLV) field within a Private Network to Network Interface (PNNI) Topology State Element (PTSE) and flooding the PTSE using standard PNNI protocol. The fair percentage value is assigned manually and configured into a switch by a network manager.
There is also provided in accordance with the present invention a method of fairly distributing network resources within a network when establishing calls between a plurality of clients and a server, the method comprising the steps of monitoring, by a destination node, the number of calls received per unit time to a particular destination address, determining if the call rate to a particular destination address is within predetermined boundaries, assigning, by a destination node connected to the server, a fair percentage value to the address whose associated call rate is within predetermined boundaries, advertising the address along with the fair percentage value to nodes within the network, calculating all possible routes between a source node and a destination node, the destination node corresponding to the reachable address, choosing at random one route from among the best routes calculated, wherein the number of routes used to randomly select from is equal to the fair percentage value of all possible routes and establishing the chosen route.
The fair percentage value is assigned dynamically and configured into a switch without intervention by a network manager. The method further comprises the step of ceasing advertising of the address along with the fair percentage value to nodes within the network when the call rate is not within the predetermined boundaries. The method further comprises the step of redetermining the fair percentage value in accordance with the reduced call rate and advertising the address along with the redetermined fair percentage value to nodes within the network when the call rate is not within the predetermined boundaries. In addition, the predetermined boundaries comprise an upper threshold and a lower threshold, whereby advertising is initiated when the call rate exceeds the upper threshold; the upper threshold and the lower threshold combine to provide hysteresis.