The role of network design in providing Quality of Service (QoS) is often overlooked: a poorly designed network architecture will never, regardless of the sophistication of the bandwidth allocation strategy used, match the performance of a well designed network. Described below is a type of network that overcomes many of the problems associated with conventional network designs. These networks have bounded hop counts, relatively few links and an even distribution of routes across the network. Even route distribution allows path selection (i.e. routing) algorithms to load traffic evenly onto the network, preventing the network hotspots that degrade performance; an even load distribution also improves the network's response to failure. Consequently, these regular partially-meshed networks have exceptional performance. First, existing strategies for designing networks, and their problems, are analysed.
Problems of Network Design
Communications networks are very complex systems. They consist of many physical and logical layers, built from many different technologies. A network can be viewed as wholly physical (e.g. a Synchronous Digital Hierarchy (SDH) transport network built of duct, fibre and switching equipment), part physical and part logical (e.g. a set of Asynchronous Transfer Mode (ATM) or Internet Protocol (IP) switches built from logical links provided by a transmission/transport network) or wholly logical (e.g. the logical nodes and links of an ATM Private Network-Network Interface (PNNI) hierarchy). In this many-layered structure, each layer will need a different network design (which should ideally take into account the layers above and below it) to account for differing design objectives, constraints and technological limitations. A point-to-point topological link can be carried by various transmission arrangements. For example, such a link can be the whole or just part of a transmission system, several transmission systems in series, or a combination of such arrangements. A point-to-point topological link may itself also carry many circuits or packet streams which are multiplexed together.
In any real network traffic must enter and exit the network at various topological nodes. This should be assumed to apply for the various configurations, although it may not be specifically mentioned.
There may be cases where some constraint requires different topological structures to be combined to form a larger network, in which case only part of an overall network may consist of a regular partially-meshed network. Similarly, an overall network may contain more than one instance of a regular partially-meshed network.
The focus here is on the topological aspects of network design rather than on any particular implementation technology, though it is anticipated that the designs will find more application at the service layers of a network (IP/ATM/Public Switched Telephone Network (PSTN)) than at the transport layer. The network is therefore designed as a topological arrangement of nodes and links that provides connectivity between all pairs of nodes.
Ring and mesh networks are a good starting point for a discussion of network design, since they illustrate the trade-offs inherent in all networks. Suppose a network is required to carry a constant amount of traffic C between all pairs of N nodes. If every node is connected to every other node, the result is a fully-meshed network as shown in FIG. 1a, which has of order $N^2$ links (i.e. $N(N-1)/2$ links). Each of the $N(N-1)$ streams of traffic is switched twice, once at the originating node and once at the destination node. Each stream of traffic has one hop between source and destination (where the number of hops is defined as the number of links traversed). An alternative is to connect all nodes into a ring using N links as shown in FIG. 1b. Traffic between non-adjacent nodes has to be switched by intervening nodes, and the average number of hops per traffic stream is proportional to N. More traffic-carrying capacity has to be provided by the nodes and links to carry this transit traffic. The total transit traffic in a ring network scales as $N^3$: for N even the transit traffic is $CN(N/2-1)^2$; for N odd it is $CN((N+1)/2-1)((N+1)/2-2)$.
Which of these two is better? The fully-meshed network makes efficient use of its nodes: no capacity is used switching transit traffic. However, too many links are used to achieve this objective; the number of links grows so fast with N that meshes are only practical for small networks. If network ports are a limited resource, then the maximum number of nodes in a network (and hence its capacity) is limited by the number of ports. Is the ring network any better? In a ring network, network size is not limited by the number of ports but, as the size of the network increases, more and more capacity has to be devoted to carrying transit traffic. In both cases the designs do not scale well: a ring has too many hops; a fully-meshed network has too many links.
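The ring transit-traffic expressions can be checked numerically. The following is a minimal sketch, assuming one stream of traffic per ordered node pair, routed the shorter way round the ring, with each intermediate node counting once as transit; the function names are illustrative only.

```python
from itertools import permutations


def mesh_links(n):
    """Number of links in a full mesh of n nodes."""
    return n * (n - 1) // 2


def ring_transit_brute_force(n, c=1):
    """Sum the transit nodes crossed by every ordered node pair on a ring."""
    total = 0
    for src, dst in permutations(range(n), 2):
        clockwise = (dst - src) % n
        hops = min(clockwise, n - clockwise)  # shorter way round the ring
        total += c * (hops - 1)               # an h-hop route transits h-1 nodes
    return total


def ring_transit_closed_form(n, c=1):
    """Closed-form ring transit traffic for even and odd n."""
    if n % 2 == 0:
        return c * n * (n // 2 - 1) ** 2
    m = (n + 1) // 2
    return c * n * (m - 1) * (m - 2)


for n in (5, 6, 9, 16):
    assert ring_transit_brute_force(n) == ring_transit_closed_form(n)
    print(n, "nodes:", mesh_links(n), "mesh links,",
          ring_transit_closed_form(n), "units of ring transit traffic")
```

For nine nodes, for example, the full mesh needs 36 links, while the nine-link ring carries 108 units of transit traffic.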
Networks with better scaling properties can be designed by adding links selectively between nodes to form a ‘Random Partial Mesh’ as shown in FIG. 2 with nine nodes. Typically each node will be connected to at least two other nodes to ensure the network survives a link failure, and it is ensured that no single node failure can split the network into two pieces. As we shall see, a random partial-mesh represents, at least in some respects, a good compromise between ring and mesh networks. In particular performance and cost can be traded by altering the degree of meshing.
Designing networks from scratch to meet specific objectives, such as cost and performance, is intrinsically difficult: the optimisation problem is NP-complete. That is, no algorithm is known that finds an optimum solution in polynomial time. Many techniques, such as simulated annealing or genetic algorithms, can be used to find sub-optimal, but useful, solutions, and some design tools incorporate these algorithms or other heuristics. However, the major problem with these approaches is that the quality of the design can only ever be as good as the optimisation criteria used. Choosing practical optimisation criteria is in itself a difficult problem. Some constraints may find expression in a cost function, but adequate solutions may still not be found because of inadequacies in the search algorithms.
Analysing the performance of a randomly-meshed network is difficult due to its irregularity. It is a problem best tackled by a computer. Analysis of the routes in a randomly-meshed network (which routes are a function of the topology of the network) almost always shows that some nodes act as ‘hubs’, concentrating many short routes (of length two or three hops). In consequence, the hub nodes and surrounding links will, when loaded using almost any routing protocol, become more highly utilised, since every routing protocol will utilise shorter routes first. In other words the network will develop hot spots, not as a result of asymmetric traffic, but because of the network topology. This need not represent a problem, provided the network is dimensioned to support the asymmetric traffic. However, the consequences of node or link failure can be serious, particularly when a hub node or attached link fails.
Regular Partially-Meshed Networks
Scalability and hub problems can be overcome if partially meshed, regular networks (Regular Partial-Meshes) can be found. If the network is regular, the network looks the same from every node, hence no node can act as a hub. The scalability problem can be solved (or controlled, at least) by ensuring that there is a full set of two-hop routes between all nodes. Consider a partially connected network of N nodes, which contains N switching elements (which need not be the nodes themselves). If each node is connected to of order $N^{1/2}$ switching elements, then $N^{1/2}$ different destinations can be reached in one hop. If each switching element is itself connected to $N^{1/2}$ other nodes, then all N nodes can be reached in two hops. The network therefore has a total of order $N^{3/2}$ links, far fewer than the $N^2$ links of the fully connected network (if N=100, then $N^{3/2}$=1000 and $N^2$=10000). Since all traffic is switched one extra time (a total now of three times rather than two), only 50% extra switching capacity needs to be deployed.
Finding partially connected networks is not an easy task. Mathematically it consists of finding a connectivity matrix with specific properties. Let A denote a $\nu \times b$ connectivity matrix that enumerates the connections between ν network nodes and b other switching elements. The component $a_{ij}$ of matrix A is the number of links between the i-th node and the j-th switching element. Matrix A therefore describes the number of one-hop routes between each pair of nodes and switching elements. The number of routes with two hops between node i and node j is given by the number of routes between i and an intermediate switching element k, $a_{ik}$, times the number of routes from the switching element k to node j, $a_{jk}$ (the (k,j) entry of $A^T$), summed over all intermediate elements. We denote this matrix B and its elements $b_{ij}$. That is
$$b_{ij} = \sum_k a_{ik}\, a_{jk}.$$
This is just the product of matrix A with its transpose, so that $B = AA^T$. The two-hop property we want to enforce, along with regularity, is:
$$B = AA^T = \begin{pmatrix} r & \lambda & \cdots & \lambda \\ \lambda & r & \cdots & \lambda \\ \vdots & \vdots & \ddots & \vdots \\ \lambda & \lambda & \cdots & r \end{pmatrix} = (r-\lambda)I + \lambda J \qquad (1)$$
where I is the ν×ν identity matrix, and J is a ν×ν matrix of ones. Equation (1) is a particular representation of a Balanced Incomplete Block Design (BIBD), if there are exactly k ones in each column of A (i.e. each switching element is connected to exactly k nodes) [1,2]. BIBD's were first used in statistical experiment design by Ronald Fisher in the 1920's. They have since found many other uses, in tournament design, coding and cryptography, but not, until now, network design. In the combinatorics literature, BIBD's are often denoted BIBD(ν,b,r,k,λ), or, since b and r can be determined from the other parameters, as 2-(ν,k,λ) designs, or simply as (ν,k,λ) designs.
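The dependence of b and r on the other parameters can be made explicit. The block below uses the standard counting identities for a BIBD, which are not stated in the text itself; the (7,4,2) values are shown only as a worked example.

```latex
% Count (object, block) incidences in two ways, and count the
% (pair of objects, block) incidences that contain one fixed object:
\begin{align}
  b\,k &= \nu\, r, \\
  r\,(k-1) &= \lambda\,(\nu - 1).
\end{align}
% Hence
\begin{align}
  r &= \frac{\lambda(\nu-1)}{k-1}, &
  b &= \frac{\nu r}{k} = \frac{\lambda\,\nu(\nu-1)}{k(k-1)}.
\end{align}
% Worked example, (\nu, k, \lambda) = (7, 4, 2):
%   r = 2 \cdot 6 / 3 = 4, \qquad b = 7 \cdot 4 / 4 = 7 .
```

A parameter set for which either quotient is not an integer cannot correspond to a BIBD, which gives a quick first test when searching for candidate designs.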
There are many ways of applying BIBD's in communication networks. To make the network more resilient, λ is chosen to be greater than 1, so that there is more than one choice of route between any two nodes. Each of these λ choices is diverse (the routes share no intermediate node or link), since each route must traverse a different switching element. If one route becomes unavailable, for any reason, there is always at least one other route that can be used.
Two Tier Applications: Stars and Areas
The most general application of BIBD's is to consider the b switching elements as a separate switching layer. These switches do not constitute a conventional trunk or higher tier network, as they are not directly connected together. The terms ‘Star node’ and ‘Area node’ are used to distinguish the b switching elements that transit traffic from the ν nodes that sink and source traffic. All BIBD's can be used for this type of application, since it imposes no symmetry constraints on the BIBD. In particular, ν need not be equal to b, and the matrix A does not have to be symmetric, i.e. $a_{ij}$ need not be equal to $a_{ji}$.
An example is an asymmetric (7,4,2) design with ν=b. The matrix A describing the connectivity between area nodes 1-7 and star nodes A-G is shown in FIG. 3, together with a sketch of the network.
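A design with these parameters can be generated and checked against Equation (1) in a few lines. The sketch below is an assumption-laden illustration: it develops the difference set {0, 3, 5, 6} modulo 7, which is one known way to obtain a (7,4,2) design, and the resulting matrix is not necessarily the one drawn in FIG. 3. Rows stand for area nodes and columns for star nodes.

```python
V = 7
BASE_BLOCK = (0, 3, 5, 6)  # assumed (7,4,2) difference set modulo 7


def incidence_matrix(v, base_block):
    """a[i][j] = 1 if area node i is linked to star node j.

    Star node j serves the area nodes {d + j mod v : d in base_block}.
    """
    return [[1 if (i - j) % v in base_block else 0 for j in range(v)]
            for i in range(v)]


def two_hop_matrix(a):
    """B = A A^T: b[i][j] counts routes area node i -> star node -> area node j."""
    return [[sum(x * y for x, y in zip(row_i, row_j)) for row_j in a]
            for row_i in a]


A = incidence_matrix(V, BASE_BLOCK)
B = two_hop_matrix(A)
r, k, lam = 4, 4, 2

# Equation (1): r on the diagonal, lambda everywhere else.
assert all(B[i][j] == (r if i == j else lam)
           for i in range(V) for j in range(V))
# Every star node (column) is linked to exactly k area nodes.
assert all(sum(A[i][j] for i in range(V)) == k for j in range(V))
# The incidence matrix itself is not symmetric.
assert A != [list(col) for col in zip(*A)]

for row in A:
    print(row)
```

Every pair of distinct area nodes shares exactly two star nodes, so there are two diverse two-hop routes between them.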
There are many extensions of this concept: area nodes can be generalised to include many separate edge nodes using a common set of star nodes, etc, etc. Examples of this type of application are discussed in reference [3].
Single Tier Applications
When BIBD's are used in a single tier network, the connectivity matrix A must be symmetric, and ν=b. A symmetric matrix should not be confused with a symmetric BIBD, which has a different mathematical definition. The diagonal of A must be zero, otherwise nodes would contain non-trivial links to themselves. These constraints limit the number of suitable BIBD's. Taking a pragmatic view, one can allow many imperfections and still produce high quality networks. In particular, if the maxim is adopted that there are at least two choices of one- or two-hop routes between every pair of nodes,
$$a_{ij} + b_{ij} > 1,$$
many good network designs that are not BIBD's can be found. Of the many classes that have been found to fulfil this criterion, two are especially useful: (i) symmetric BIBD's with the diagonal removed (this makes the network slightly irregular, but does not decrease network performance appreciably) and (ii) Strongly Regular Graphs. A Strongly Regular Graph with parameters (ν,k,λ,μ) has ν nodes without loops (i.e. links from a node to itself; the diagonal is zero) or multiple links between nodes, and each node has k links to other nodes. The matrix B giving the all-important two-hop routes is
$$B = kI + \lambda A + \mu(J - I - A) \qquad (2)$$
where I is a ν×ν identity matrix, and J is a ν×ν matrix of ones, as before. Equation (2) says that there are μ choices of two-hop routes almost everywhere, except where there are direct links, where there are λ choices of two-hop routes (as well as one direct route). The amount of extra connectivity is small, since of the $\nu^2$ total two-hop routes, only νk of them will have more than two choices, and k is proportional to $\nu^{1/2}$.
FIG. 4 shows a (9,4,1,2) Strongly Regular Graph, denoted $3^2$ in [2]. The structure of the (9,4,1,2) graph, consisting of 3 groups of nodes, each group consisting of a full-mesh of 3 nodes, each node being additionally connected to its partner in the other two groups, suggests many obvious extensions. Repeating the pattern with groups of four nodes gives the Strongly Regular Graph (16,6,2,2), which is also a (16,6,2) BIBD. Likewise the pattern can be extended ad infinitum, with 5 groups of 5 nodes, etc., all of which are Strongly Regular Graphs. Forming patterns from 6 groups of 5 nodes, etc., yields graphs which are not strongly regular, but which nevertheless form excellent communications networks with diverse, mostly two-choice, two-hop routes.
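The group-and-partner pattern described above is easy to generate and inspect. The following sketch builds the adjacency matrix for a given number of groups and group size, then reports the node degrees and the number of two-hop routes between directly linked and unlinked pairs. The function names are illustrative, and a node's "partner" is assumed to be the node holding the same position in another group.

```python
def lattice_graph(groups, size):
    """Adjacency matrix: full mesh inside each group, plus a link between
    nodes holding the same position in different groups."""
    nodes = [(g, p) for g in range(groups) for p in range(size)]
    return [[1 if u != v and (u[0] == v[0] or u[1] == v[1]) else 0
             for v in nodes] for u in nodes]


def route_summary(a):
    """Return (degrees, two-hop counts of linked pairs, of unlinked pairs)."""
    n = len(a)
    b = [[sum(a[i][k] * a[k][j] for k in range(n)) for j in range(n)]
         for i in range(n)]
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    degrees = {sum(row) for row in a}
    linked = {b[i][j] for i, j in pairs if a[i][j]}
    unlinked = {b[i][j] for i, j in pairs if not a[i][j]}
    return degrees, linked, unlinked


for groups, size in [(3, 3), (4, 4), (5, 5), (6, 5)]:
    print(groups, "groups of", size, route_summary(lattice_graph(groups, size)))
```

A graph is strongly regular when each of the three sets holds a single value. That is the case for the 3x3, 4x4 and 5x5 patterns. With 6 groups of 5 nodes, linked pairs have either 3 or 4 two-hop routes, so the graph is not strongly regular, although every unlinked pair still has exactly two.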
Performance Comparisons
Since no one performance metric can characterise a network, the relative performance of a set of networks has been compared across a range of measures. Since comparison with rings and meshes would reveal nothing new, randomly-meshed networks with 9, 16, 25 and 36 nodes were compared to partially meshed networks with the same numbers of nodes [4]. The partially-meshed networks chosen were the $3^2$, $4^2$, $5^2$ and $6^2$ Strongly Regular Graphs from reference [2]. Randomly meshed networks were generated using a commercial simulated-annealing network planning tool set up to use minimum hop routing on an even traffic distribution [5]. All networks were designed so that (at least) two diverse paths exist for every traffic demand. The number of links was controlled by penalising transit traffic. With no transit traffic penalty, the networks become ring-like; upon increasing the transit traffic penalty the networks become more mesh-like.
When comparing any two networks, to reach any definite conclusions, assumptions must be made about traffic distribution and network behaviour (i.e. what sort of routing protocols are used). To make the conclusions as general as possible, it is assumed that traffic is routed using a minimum hop scheme, with routes of equal weight being equally utilised. This corresponds to OSPF equal-cost multi-path routing with link weights set to one, and is representative of many commonly used routing schemes [6]. The traffic was assumed to be uniformly distributed, with an arbitrary one unit of traffic demand between all node pairs. This simplifies analysis, but is also, when designing networks, the least biased traffic distribution to choose in the absence of any information about traffic distribution. The choice of traffic distribution does not affect the primary topological design issues considered.
Transit Traffic and its Distribution
Transit traffic is a simple measure of network efficiency. For a uniform traffic distribution, the transit traffic distribution is representative of the route distribution across the network. FIGS. 5a and 5b show the transit traffic on each node and the traffic on each link for the randomly meshed and partially connected 9 node, 18 link networks respectively.
In the randomly meshed network, two nodes act as hubs, carrying far more transit traffic than other nodes (14 and 10 units respectively).
FIG. 6 shows the mean transit traffic per node, and the sum of mean transit traffic and its standard deviation, plotted as a function of the number of links for all networks. The mean transit traffic is comparable in randomly meshed and partially connected networks when the numbers of links are equal. This shows that, by this measure of efficiency, the randomly meshed networks are as good as the partially meshed ones. However, the standard deviation of the transit traffic in the randomly meshed networks shows that the traffic is unevenly distributed. For the regular networks, the standard deviation is zero; the traffic is perfectly distributed. Uneven traffic distribution is not a problem for a functioning network, as individual links and nodes can be dimensioned accordingly, but it causes problems when links or nodes fail, since large amounts of transit traffic will need re-routing.
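The transit-traffic statistics can be reproduced for small examples. The sketch below assumes one unit of demand per ordered node pair, split equally over all minimum-hop routes, as in the routing model described earlier. It compares the 9-node regular graph with a hypothetical hub topology (eight rim nodes in a ring plus a central hub). The hub topology is purely illustrative: it is not the randomly meshed network of FIG. 5a and has 16 links rather than 18.

```python
from collections import deque
from statistics import mean, pstdev


def bfs_counts(adj, src):
    """Hop distance and number of minimum-hop routes from src to every node."""
    dist, count = {src: 0}, {src: 1}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:              # first time w is reached
                dist[w] = dist[u] + 1
                count[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:     # u precedes w on a shortest route
                count[w] += count[u]
    return dist, count


def transit_traffic(adj, demand=1.0):
    """Transit traffic per node; assumes the network is connected."""
    info = {s: bfs_counts(adj, s) for s in adj}
    transit = {v: 0.0 for v in adj}
    for s in adj:
        dist_s, count_s = info[s]
        for t in adj:
            if t == s:
                continue
            for v in adj:
                if v in (s, t):
                    continue
                dist_v, count_v = info[v]
                if dist_s[v] + dist_v[t] == dist_s[t]:
                    # share of the s -> t shortest routes that pass through v
                    transit[v] += demand * count_s[v] * count_v[t] / count_s[t]
    return transit


def lattice(n):
    """n groups of n nodes: mesh within a group, partners across groups."""
    nodes = [(g, p) for g in range(n) for p in range(n)]
    return {u: [v for v in nodes if v != u and (u[0] == v[0] or u[1] == v[1])]
            for u in nodes}


def wheel(rim):
    """Hypothetical hub topology: a ring of rim nodes, all linked to one hub."""
    adj = {i: [(i - 1) % rim, (i + 1) % rim, "hub"] for i in range(rim)}
    adj["hub"] = list(range(rim))
    return adj


for name, graph in (("regular 3x3", lattice(3)), ("hub example", wheel(8))):
    values = list(transit_traffic(graph).values())
    print(name, "mean", round(mean(values), 2), "std dev", round(pstdev(values), 2))
```

In the regular graph every node carries 4 units of transit traffic, so the standard deviation is zero. In the hub example the hub carries 32 units and each rim node only 1.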
Node and Link Failure
In any network, nodes and links can always fail. The ability of a network to function with failed links or nodes is a vital component of its design. Consider the failure of the busiest link (shown dotted) in the nine node networks shown in FIG. 7. Traffic has been re-routed around the failure and the new node and link occupancies calculated. FIG. 7 shows them expressed as a multiplier of the load in the unfailed state.
Less spare capacity needs to be provided in the regular network to carry traffic in the failed state, as much less traffic needs re-routing, since the transit traffic is uniformly distributed. Plotted in FIG. 8 is the worst case increase in capacity of nodes and links that ensues when all nodes or links are failed, one at a time.
FIG. 8 can be used to determine the worst case planning limit that should be used on nodes and links to ensure no network congestion in failure. Some links in the randomly meshed networks can only be run at a maximum of 30% occupancy, whereas in the regular networks, this figure varies from 75% in the 9 node network, to 83% in the 36 node network. In the randomly meshed networks, planning limits must be determined for every node and link; for the regular mesh networks, they are constant across all nodes and links. FIG. 9 shows the total node and link capacity required to support the given traffic load and survive all possible single point of failure scenarios, as a fraction of the working node and link capacity. Actual deployed capacity is lower for the regular mesh networks with comparable numbers of links, and even for some networks with fewer links: making networks too sparse can often be a false economy.
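The link planning limit for the 9-node regular network can be estimated with a short calculation. The sketch below is a simplified model, not the tool used for the figures: it considers single link failures only, assumes one unit of demand per ordered node pair split equally over minimum-hop routes, and assumes the network stays connected after each failure. All function names are illustrative.

```python
from collections import deque


def bfs_counts(adj, src):
    """Hop distance and number of minimum-hop routes from src to every node."""
    dist, count = {src: 0}, {src: 1}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                count[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                count[w] += count[u]
    return dist, count


def link_loads(adj):
    """Traffic on each undirected link, summed over both directions."""
    info = {s: bfs_counts(adj, s) for s in adj}
    loads = {}
    for s in adj:
        dist_s, count_s = info[s]
        for t in adj:
            if t == s:
                continue
            for u in adj:
                for w in adj[u]:
                    dist_w, count_w = info[w]
                    # hop u -> w lies on a minimum-hop route from s to t
                    if dist_s[u] + 1 + dist_w[t] == dist_s[t]:
                        share = count_s[u] * count_w[t] / count_s[t]
                        key = frozenset((u, w))
                        loads[key] = loads.get(key, 0.0) + share
    return loads


def without_link(adj, a, b):
    """Copy of the network with the link between a and b removed."""
    return {u: [w for w in nbrs if {u, w} != {a, b}] for u, nbrs in adj.items()}


def worst_failure_multiplier(adj):
    """Largest ratio of failed-state to unfailed load over all link failures.

    Every link carries at least its own end-to-end demand when unfailed,
    so the unfailed load used as the divisor is never zero.
    """
    before = link_loads(adj)
    worst = 1.0
    for link in before:
        a, b = tuple(link)
        after = link_loads(without_link(adj, a, b))
        for key, load in after.items():
            worst = max(worst, load / before[key])
    return worst


def lattice(n):
    """n groups of n nodes: mesh within a group, partners across groups."""
    nodes = [(g, p) for g in range(n) for p in range(n)]
    return {u: [v for v in nodes if v != u and (u[0] == v[0] or u[1] == v[1])]
            for u in nodes}


multiplier = worst_failure_multiplier(lattice(3))
print("worst multiplier", round(multiplier, 3),
      "planning limit", round(1 / multiplier, 2))
```

Under these assumptions each of the 18 links carries 6 units when unfailed and at most 8 units after any single link failure. The multiplier is therefore 4/3 and the planning limit 75%, consistent with the figure quoted above for the 9 node network.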
Load Balancing and Uneven Traffic Distributions
Two objections might be raised to the analysis presented above: that the assumption of a uniform traffic distribution invalidates the results and that load balancing algorithms can mitigate the effects of poor network design.
The key advantage of the regular partially connected network designs is that the one and two hop routes upon which traffic is routed (and which most routing algorithms would utilise first) are evenly distributed across the network. Therefore a regular network will, almost regardless of the route selection algorithm, spread any traffic distribution as evenly as is possible across the network. Uneven transit traffic distribution (and the problem of hub nodes) is a function of network topology, not traffic distribution.
Load-balancing algorithms can improve the traffic distribution in the network. A simple load-balancing algorithm was applied to both nodes and links: to break ties between equal-length routes, find the most highly utilised resource on each route, and choose the route for which that utilisation is lowest. This showed that traffic could be balanced across the most highly utilised nodes or links. In effect, the two biggest hubs in the network were balanced. This is not surprising, as many equal-length alternative routes traverse both hubs: these are the nodes or links that get balanced. For the regular networks, since node or link occupancies are more equal, node- or link-balancing algorithms tend to balance load across all nodes or links. Load-balancing algorithms can improve network balance, but tend to work better on regular networks. Algorithms that try to guarantee quality of service (QoS) would also tend to work better on a regular network. QoS algorithms typically select a short path, check that the required service can be supported, and route the traffic accordingly. Only if the requested QoS cannot be guaranteed on this path (say due to resource exhaustion) will an alternative (longer) path be selected. This longer path will consume more network resources than a shorter path. If the point at which QoS algorithms choose longer paths can be delayed, by loading traffic more evenly on the network, then the final capacity of the network to route traffic with a given QoS will be higher.
Network design always involves a compromise between cost and performance; between rings and meshes. Among the near-infinite choices between these extremes are mathematically perfect or near-perfect regular partial mesh networks derived from BIBD's and strongly regular graphs. Such networks have natural traffic-balancing properties that make them preferable to random meshes of similar connectivity. These designs are therefore advocated for their efficiency, robustness and regularity.
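The tie-breaking rule used for load balancing earlier in this section can be written in a few lines. This is one reading of that rule, in which each candidate route is scored by its most highly utilised resource and the route with the lowest score wins. The resource names and utilisation figures below are hypothetical.

```python
def pick_route(routes, utilisation):
    """Among equal-length routes, return the one whose busiest resource
    (node or link) has the lowest utilisation."""
    return min(routes, key=lambda route: max(utilisation[res] for res in route))


# Two hypothetical two-hop routes from node A to node B, via H1 or via H2.
# Each route lists the links and the transit node it would use.
routes = [
    ("A-H1", "H1", "H1-B"),
    ("A-H2", "H2", "H2-B"),
]
utilisation = {
    "A-H1": 0.40, "H1": 0.80, "H1-B": 0.35,
    "A-H2": 0.55, "H2": 0.60, "H2-B": 0.50,
}

print(pick_route(routes, utilisation))
```

The route via H2 is chosen, because its busiest resource (H2 at 0.60) is less heavily utilised than the busiest resource on the other route (H1 at 0.80), even though its links are individually busier.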
Following the analysis above, networks based on the suggested Balanced Incomplete Block Designs (BIBD's) and similar incidence matrices have many properties that make them especially suited for use as communications networks. The particular properties are:
1. All nodes are connected by routes of maximum length 2.
2. Multiple routes are provided that enhance load balancing and redundancy.
Networks and Topology
The topology of an arbitrary communication network can be represented as a Graph, which is an arrangement of Nodes (or Vertices) connected by Links (or Edges). A Node can represent a switching or routing element, or a logical aggregation of such elements. Links provide point-to-point connections between Nodes, and can represent physical connectivity (e.g. a fibre optic transmission system), logical connectivity (a virtual circuit, for instance) or a logical aggregate of such.
The topology of a network can also be represented by a connectivity matrix. If the Nodes in a network are labelled 1 . . . N, the connectivity matrix is an ordered array of numbers with N rows and N columns, with the entry in the i-th row and j-th column representing the number of Links between the i-th and j-th Node. We denote the entire matrix by C, and the entry in row i, column j by $c_{ij}$. An example is shown below.
$$C = \begin{pmatrix} 0 & 1 & 0 & 0 & 1 \\ 1 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 1 & 1 & 0 \end{pmatrix}$$
The transpose of matrix C is denoted $C^T$ and is defined by interchanging row and column indices:
$$c^T_{ij} = c_{ji}.$$
Multiplication of two matrices C and D to give E is denoted by
$$E = CD$$
and defined by the following operations on the components of each matrix:
$$e_{ij} = \sum_k c_{ik}\, d_{kj} = \sum_k c_{ik}\, d^T_{jk}$$
A Route across the network consists of a set of Nodes and Links traversed in order. The Length of the Route is defined to be the number of Links traversed. The number of Routes of Length 1 between Nodes i and j is given by the entry $c_{ij}$ in the connectivity matrix. The number of Routes of Length 2 between Nodes i and j is given by the number of Routes from Node i to any intermediate Node k multiplied by the number of Routes from Node k to Node j, summed over all intermediate Nodes k. This set of numbers can be written as a matrix, which is called B:
$$b_{ij} = \sum_k c_{ik}\, c_{kj}$$
For a symmetric connectivity matrix ($c_{kj} = c_{jk}$) this is $B = CC^T$.
Balanced Incomplete Block Designs
A Balanced Incomplete Block Design (BIBD) is a concept that originates in combinatorial analysis. BIBD's solve the problem of arranging objects into a given number of sets under a certain set of restrictions. A formal description, taken from “Combinatorial Theory”, Marshall Hall, (Blaisdell: Waltham Mass. 1967) is:
“A balanced incomplete block design is an arrangement of ν distinct objects into b blocks such that each block contains exactly k distinct objects, each object occurs in exactly r different blocks, and every pair of distinct objects $a_i$, $a_j$ occurs together in exactly λ blocks.”
A balanced incomplete block design can also be described by an incidence matrix. This is a matrix A with ν rows and b columns, where, if $a_1, \ldots, a_\nu$ are the objects and $B_1, \ldots, B_b$ are the blocks, then
$$a_{ij} = 1 \ \text{if } a_i \in B_j, \qquad a_{ij} = 0 \ \text{if } a_i \notin B_j.$$
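The three conditions in this definition can be tested directly on a list of blocks. The sketch below is illustrative only: the function name is hypothetical, and the example blocks are the seven lines of the well-known (7,3,1) design, generated by shifting {0, 1, 3} modulo 7. That design is used here only as a small test case and does not appear in the text.

```python
from itertools import combinations


def bibd_parameters(v, blocks):
    """Return (v, b, r, k, lam) if the blocks satisfy the definition on
    objects 0..v-1, otherwise None. (No check that k < v is made.)"""
    block_sizes = {len(set(blk)) for blk in blocks}
    replications = {sum(obj in blk for blk in blocks) for obj in range(v)}
    pair_counts = {sum(x in blk and y in blk for blk in blocks)
                   for x, y in combinations(range(v), 2)}
    # Each quantity must take a single value across the whole design.
    if len(block_sizes) == len(replications) == len(pair_counts) == 1:
        return (v, len(blocks), replications.pop(),
                block_sizes.pop(), pair_counts.pop())
    return None


# Seven blocks obtained by shifting the base block {0, 1, 3} modulo 7.
blocks = [{(d + shift) % 7 for d in (0, 1, 3)} for shift in range(7)]

print(bibd_parameters(7, blocks))        # a valid design
print(bibd_parameters(7, blocks[:-1]))   # one block removed: no longer balanced
```

Removing a single block breaks the replication condition, so the second call returns `None`.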
Then, a balanced incomplete block design will have the following properties:
$$AA^T = \begin{pmatrix} r & \lambda & \cdots & \lambda \\ \lambda & r & \cdots & \lambda \\ \vdots & \vdots & \ddots & \vdots \\ \lambda & \lambda & \cdots & r \end{pmatrix} = (r-\lambda)I_\nu + \lambda J_\nu$$
where $I_\nu$ is the ν×ν identity matrix, and $J_\nu$ is a ν×ν matrix of ones. An additional constraint is that there must be exactly k ones in each column of A.
Imperfect BIBD's
An Imperfect BIBD or Imperfect Network is defined as a BIBD wherein at least one Topological Node has a missing or extra Topological Link.
BIBD's and Networks
The incidence matrix A of a block design can be used to connect some or all of the Nodes in a Network. The key property is that a particular subset of connected Nodes has λ Routes of Length at most 2 between all distinct Nodes of that subset. The connected Nodes in this subset will have the following properties:
1. Connectivity: All Nodes are connected by λ Routes of Length 2. (They may also be connected by Routes of Length 1, and many longer Routes.)
2. Balancing: If λ>1, traffic may be balanced across the λ different Routes available.
3. Resilience: If a Node or Link on a Route fails, λ−1 equivalent Routes can be used to carry the traffic.
It is the balancing and resilience properties, together with short Routes, that make these connectivity patterns so useful as networks.