(1) Field of the Invention
This invention relates to a communications system and a method for communicating, and more particularly (although not exclusively) to a system and method of the kind which can operate over network links referred to as limited links and characterised by low bandwidth or high latency or both.
(2) Description of the Art
Communications systems operating over multiple network links are required to select routes to allow transmission and reception of messages in communications with other systems. These links include satellite links and Integrated Services Digital Network (ISDN). For message routing purposes, a communications system incorporates a control computer referred to as a router. A router is used to control routing of messages between communications systems by matching the destination of a message against a routing table which it maintains by exchanging routing messages with neighbouring routers to update the table. The method of transmission uses protocols referred to as Transmission Control Protocol (TCP) and Internet Protocol (IP) which together form the TCP/IP suite. For routing purposes, a router may be considered to be synonymous with the communications system of which it is a part.
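By way of illustration only, the matching of a message destination against a routing table may be sketched as follows. The sketch assumes longest-prefix matching, which is the usual practice in IP forwarding but is not stated above. The table entries, addresses and the function name lookup_next_hop are hypothetical.

```python
import ipaddress

# Hypothetical routing table of (destination prefix, next hop) pairs.
# All addresses are illustrative and do not come from the text above.
ROUTING_TABLE = [
    (ipaddress.ip_network("0.0.0.0/0"), "192.0.2.1"),    # default route
    (ipaddress.ip_network("10.0.0.0/8"), "192.0.2.2"),
    (ipaddress.ip_network("10.1.0.0/16"), "192.0.2.3"),
]


def lookup_next_hop(destination, table=ROUTING_TABLE):
    """Return the next hop for a destination, or None if nothing matches.

    Assumption: the most specific (longest) matching prefix wins.
    """
    address = ipaddress.ip_address(destination)
    matches = [(network, hop) for network, hop in table if address in network]
    if not matches:
        return None
    _, next_hop = max(matches, key=lambda entry: entry[0].prefixlen)
    return next_hop


if __name__ == "__main__":
    for destination in ("10.1.2.3", "10.2.0.1", "198.51.100.7"):
        print(destination, "->", lookup_next_hop(destination))
```

In a dynamic routing protocol the table itself would be updated as routing messages arrive from neighbouring routers; the sketch covers only the lookup step.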
Communications protocols are conceptually built up in layers or levels defined by software, with services offered by one (lower) layer being used by a subsequent layer above it to implement a richer protocol. A layer model known as the OSI 7-layer model consists of the following:
        7. Application Layer: Network process to application;
        6. Presentation Layer: Data representation and encryption;
        5. Session Layer: Interhost communication;
        4. Transport Layer: End-to-End connectivity (TCP);
        3. Network Layer: Path determination (IP);
        2. Data Link Layer or Link Layer: Physical addressing; and
        1. Physical Layer: media, signal and binary transmission.
The above seven layers may be treated as merged in groups to form three layers: i.e. layers 7, 6 and 5 may be treated as a combined application layer, layers 3 and 4 as a network layer, and layers 1 and 2 as a link layer.
Message packet sending may be unicast, multicast or broadcast: unicast means sending a packet to an individual IP address; multicast means sending a packet to a group or plurality of addresses, i.e. to some but not to all communications systems connected to a network; broadcast means sending a packet to a broadcast address of a network, so that all communications systems connected to the network receive the packet.
Communications message traffic may be routed over wireless links by routers using computer software referred to as OSPF which implements a known algorithm called “Open Shortest Path First”. Other examples of routing software are also known, called BGP and RIP. An open source implementation of such routing software is called “Quagga Routing Suite” or “Quagga”. Quagga contains executable implementations of OSPF, BGP and RIP routing software with a controlling system called Zebra, each of which runs as a daemon, i.e. in the background with no user interaction required.
U.S. Pat. No. 5,412,654 discloses a data communications system which operates at the data link layer or link layer. It selects routes on the basis of the fewest number of hops to a destination, a hop being defined as communication between an adjacent pair of communications systems linked by a communications path having no other intervening communications system in it. This means that a hop involving a high-speed, reliable link is treated as equivalent to a less reliable limited link with low bandwidth and/or high latency, whereas two hops over high-speed reliable links may be preferable to one hop over a slow limited link. U.S. Pat. No. 5,412,654 also discloses use of broadcast packet sending, which is not useful over Point-to-Point Protocol links, therefore it requires both ends of a communication route of one or more hops to be within the same network. There is no straightforward way to link directly two communications systems using different protocols both operating at the Link Layer.
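The difference between selecting routes by fewest hops and selecting them by a per-link cost may be illustrated by the following minimal sketch. The three-node topology, the cost values (1 for a fast link, 100 for a limited link) and the function name shortest_path are assumptions chosen for illustration; the search is a standard Dijkstra shortest-path search and is not taken from the cited patent.

```python
import heapq


def shortest_path(links, source, target, weight):
    """Dijkstra search over bidirectional links.

    links maps (node_a, node_b) to a link cost; weight converts a link
    cost into the value actually summed along a path.
    """
    graph = {}
    for (a, b), cost in links.items():
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))

    queue = [(0, source, [source])]
    visited = set()
    while queue:
        distance, node, path = heapq.heappop(queue)
        if node == target:
            return distance, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, cost in graph.get(node, []):
            if neighbour not in visited:
                heapq.heappush(
                    queue, (distance + weight(cost), neighbour, path + [neighbour])
                )
    return None


# Hypothetical topology: a slow direct link A-B and two fast links via C.
LINKS = {("A", "B"): 100, ("A", "C"): 1, ("C", "B"): 1}

by_hops = shortest_path(LINKS, "A", "B", weight=lambda cost: 1)
by_cost = shortest_path(LINKS, "A", "B", weight=lambda cost: cost)

if __name__ == "__main__":
    print("fewest hops:", by_hops)   # every link counts as 1
    print("lowest cost:", by_cost)   # link quality is taken into account
```

With every link counted as one hop the direct limited link is chosen, whereas with costs that reflect link quality the two-hop route over fast links is chosen.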
RIP routing software uses unicast packet sending, so a packet is sent to an individual remote communications system. Consequently a packet to be received by multiple remote communications systems is required to be sent out of a communications interface as many times as there are intended recipients. Like U.S. Pat. No. 5,412,654, RIP uses fewest number of hops as a basis for selecting best routes and treats all types of links as equivalent, whether limited or not. RIP also provides for a routing table to be requested periodically.
A router which implements OSPF may be referred to as an OSPF router. OSPF is widely supported by commercial-off-the-shelf (COTS) equipment. OSPF (version 2) is the modern interior gateway dynamic routing protocol of choice for IP routing, and is a de facto standard for IP routing via fixed links: here “interior” means a routing protocol which works within an autonomous system, i.e. a group of IP communications systems or routers in an area considered as an entity; the converse of interior, i.e. exterior, means a routing protocol which works between different autonomous systems; gateway means a router which serves as an entrance to or exit from an interior network, i.e. it is on the borders of two autonomous systems; “dynamic” means that routing is not fixed but can vary; and a fixed link is a link (e.g. a hard wired link) which does not change with environment (e.g. weather affecting a wireless link). A router maintains and updates a table of information (routing table) listing addressing details of other routers linked in a communications network and available for receiving message traffic: addressing details of routers may be added to or deleted from a routing table as they become or cease to be available for communication. If there is good connectivity over network links between routers then OSPF traffic will keep routing tables up to date.
In setting up routing of message traffic between OSPF routers in a communications network, two of the routers are automatically elected to be in charge of distributing topology information on a link by link basis regarding the routers linked together in the network: i.e. individual links are between respective pairs of routers. The elected routers are referred to as the designated router (DR) and the backup designated router (BDR). All routers exchange their routing information with the two elected routers, which then disseminate it to the other adjacent routers in the area: here two routers are “adjacent” to one another if they are connected via a single link so that a message can pass between them in a single hop. This is an optimisation built into the OSPF protocol: it substantially reduces the amount of routing message traffic necessary when compared to a full-mesh equivalent, i.e. every router exchanging information with every other router. OSPF is designed with high-speed, reliable link-layers in mind, notably Ethernet. It is, however, a large and complex protocol and places significant demands on the fixed link layers it traverses.
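The reduction in routing message traffic relative to a full mesh may be illustrated by counting router pairings. The following sketch rests on a simplifying assumption: on a single shared network of n routers, each router other than the DR and BDR exchanges routing information only with the DR and the BDR, and the DR and BDR exchange with one another. The function names are hypothetical.

```python
def full_mesh_pairings(n):
    """Number of router pairs when every router exchanges with every other."""
    return n * (n - 1) // 2


def dr_bdr_pairings(n):
    """Number of router pairs when exchanges go only via the DR and BDR.

    Simplified model: each of the (n - 2) other routers pairs with both
    elected routers, plus one pairing between the DR and the BDR.
    """
    if n < 2:
        return 0
    return 2 * (n - 2) + 1


if __name__ == "__main__":
    for n in (3, 10, 50):
        print(n, full_mesh_pairings(n), dr_bdr_pairings(n))
```

Under this model ten routers need 17 pairings instead of 45, and the saving grows with n because the full-mesh count rises quadratically while the DR and BDR count rises linearly.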
OSPF supports:
        (a) dynamic discovery of peers using multicast messages: a peer is another communications system with which linking might be required, represented by its router, and a multicast message is a message with unrestricted recipients;
        (b) dividing large networks into smaller “areas” of routers: an area is connected to another area in the large network via a “backbone” area;
        (c) link state routing with all routers in an area being informed: here link state means availability of links, since routers need information on which links can be used for message traffic;
        (d) aggregation of routes into summaries at area or subnet borders: aggregation relates to routes sharing a common link, i.e. one router being used as a stepping stone to other routers in an area so that multiple destinations in an area can be reached via a single link (e.g. a backbone) to a router;
        (e) rapid convergence via the OSPF shortest path algorithm: convergence means establishment of consistent routing tables between adjacent routers;
        (f) assignment of arbitrary route metrics to links unidirectionally, i.e. for a single pass along a link: a route metric is a number assigned to a link and expressing the desirability or otherwise of using the link;
        (g) well-defined interfaces with exterior routing algorithms, so that for example different autonomous systems can be connected together and an intranet can communicate with external networks such as the internet;
        (h) classless routing with variable length subnet masks: here class relates to one of three historical classes of network size;
        (i) election of designated routers to reduce traffic on a shared subnet; and
        (j) robust (i.e. fault tolerant) operation when data packets are lost from a message.
However, OSPF functions less well with communications links of the kind referred to herein as “limited links” and defined as having low bandwidth or high latency or both: e.g. a link with a bandwidth in the range of 2 Kbits/s (typical of HF radio) to 40 Kbits/s (typical of the UHF SNR system) is a low-bandwidth link; and a link for which the time taken for a message to be sent and a response received (round trip time) exceeds 2 seconds is a high-latency link.
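The example figures given above may be expressed as a simple classification, sketched below. The sketch assumes that any bandwidth at or below 40 Kbits/s counts as low bandwidth (the text gives only an example range) and that 1 Kbit/s is 1000 bits/s; the constant and function names are hypothetical.

```python
# Example thresholds taken from the figures in the text; treating them as
# hard cut-offs is an assumption made for this sketch.
LOW_BANDWIDTH_MAX_BPS = 40_000   # 40 Kbits/s
HIGH_LATENCY_MIN_RTT_S = 2.0     # round trip time in seconds


def classify_link(bandwidth_bps, round_trip_s):
    """Report whether a link is low bandwidth, high latency, and limited."""
    low_bandwidth = bandwidth_bps <= LOW_BANDWIDTH_MAX_BPS
    high_latency = round_trip_s > HIGH_LATENCY_MIN_RTT_S
    return {
        "low_bandwidth": low_bandwidth,
        "high_latency": high_latency,
        "limited": low_bandwidth or high_latency,
    }


if __name__ == "__main__":
    print(classify_link(2_000, 0.5))         # slow radio link, modest delay
    print(classify_link(10_000_000, 2.5))    # fast link with a long round trip
    print(classify_link(100_000_000, 0.001)) # fast, low-delay wired link
```

A link is treated as limited if either condition holds, matching the definition of a limited link as having low bandwidth or high latency or both.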
Routing instability can result from the designated router election protocol in OSPF because of intermittent connectivity, i.e. when routers join or leave a network and network links are consequently not constant. A router will join (become a new member of) the network by sending a message referred to as a ‘hello’ message. The hello message needs to be acknowledged in handshaking fashion before the new member can join the network, which consumes available bandwidth and can be a problem for a limited link. The sudden appearance of a ‘hello’ message can cause the new member to be elected as the designated router (DR) or the backup designated router (BDR). Upon such an election, existing OSPF peerings (pairings between the previous DR, BDR and adjacent routers) on the network are dropped, and new adjacencies are formed between the newly elected DR and/or BDR and other network routers by exchanging routing databases or tables: the exchange causes a temporary routing interruption. If the new member of the network, i.e. the newly elected DR or BDR, has poor (e.g. intermittent) connectivity to the rest of the network, this can cause a total loss of service which will partition the network and may take several minutes to resolve. It will interrupt connectivity between all members of the network, even those with good connectivity to members other than the new member.
Serious challenges for the useful operation of OSPF are posed by a network which uses limited links, i.e. low bandwidth and/or high latency links as aforesaid: this also applies to any situation where TCP/IP message traffic does not perform as expected. Across a limited link OSPF can take a very long time to synchronise, i.e. to produce consistent routing tables between neighbouring routers. The number of message send-receive round-trips required to establish a link coupled with high latencies and relatively small time-out values built into the OSPF protocol tend to cause delays in setting up a network and instability once the network is formed: here a time-out value is a predetermined time delay during which a message is awaited but not received. Over a limited link, the handshaking hello procedure in the OSPF protocol employed to establish network links either fails or requires an excessive number of retries caused by message packets being lost or seriously delayed. Long delays in synchronisation and long-term instability are serious failings: they result in a communications service which is in theory up and running with capacity available, but which is not actually available to users for sending message traffic in a timely manner.
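The interaction between latency, bandwidth and a fixed time-out may be illustrated with a crude model, sketched below. The model assumes that the time to obtain a response is the round trip time plus the time needed to clock the message onto the link, and it ignores queuing, retransmission and protocol overhead. The message size, the 4 second time-out and the function names are hypothetical values chosen for illustration and are not OSPF parameters taken from the text.

```python
def response_time_s(message_bytes, bandwidth_bps, round_trip_s):
    """Crude estimate: round trip time plus serialisation time of the message."""
    return round_trip_s + (message_bytes * 8) / bandwidth_bps


def times_out(message_bytes, bandwidth_bps, round_trip_s, timeout_s):
    """True if the estimated response time exceeds the time-out."""
    return response_time_s(message_bytes, bandwidth_bps, round_trip_s) > timeout_s


if __name__ == "__main__":
    timeout_s = 4.0  # hypothetical time-out, for illustration only
    # 500-byte message over a 2 Kbits/s link with a 2.5 s round trip
    print(response_time_s(500, 2_000, 2.5), times_out(500, 2_000, 2.5, timeout_s))
    # The same message over a 10 Mbits/s link with a 1 ms round trip
    print(times_out(500, 10_000_000, 0.001, timeout_s))
```

In this model the limited link exceeds the time-out on every exchange, so each step of a multi-step handshake would be retried, whereas the fast link completes well within it.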