The OSPF standard, RFC 2328, defines several timers that govern the behavior of OSPF as a routing protocol. One of these is the retransmission timer, which is crucial to the reliable flooding of link state advertisements (LSAs) across the network. An LSA carries information about the node that generated it, such as its links. An LSA is flooded hop by hop. Each node that sends an LSA to a neighbor (i.e., another node directly connected to it) awaits an acknowledgment that the LSA has been received. If the acknowledgment is not received within the retransmission timer interval, the LSA is retransmitted. This cycle repeats until the LSA is acknowledged.
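The send-and-wait cycle described above can be sketched as a small simulation. This is a minimal illustration, not code from any OSPF implementation: the function name `flood_to_neighbor`, its parameters, and the time values are hypothetical, and the model assumes a single neighbor and whole-second time steps.

```python
def flood_to_neighbor(rxmt_interval, ack_arrival_time, max_time=60):
    """Return the times (in seconds) at which the LSA is sent to one neighbor.

    rxmt_interval: retransmission interval in seconds (illustrative value).
    ack_arrival_time: time at which the acknowledgment reaches the sender,
        or None if no acknowledgment arrives before max_time.
    """
    send_times = []
    t = 0
    while t < max_time:
        # Stop as soon as the acknowledgment has arrived.
        if ack_arrival_time is not None and ack_arrival_time <= t:
            break
        # Otherwise send (t == 0) or retransmit (t > 0) the LSA.
        send_times.append(t)
        t += rxmt_interval
    return send_times


if __name__ == "__main__":
    print(flood_to_neighbor(5, 3))    # acknowledged before the timer expires
    print(flood_to_neighbor(5, 12))   # acknowledgment is late
```

With an interval of 5 seconds, an acknowledgment arriving at 3 seconds means the LSA is sent once, while an acknowledgment arriving at 12 seconds means it is sent at 0, 5, and 10 seconds.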
When a node comes under heavy load, its CPU and memory resources become heavily utilized, which can prevent an LSA from being processed in time, if at all. For instance, when the number of events a CPU has to process exceeds its processing capacity, a work backlog builds up: events are no longer processed as they arrive but must wait until the CPU can get to them. In the case of LSAs, if the waiting time exceeds the retransmission interval set at the sender, a retransmission is triggered and the heavily utilized node has one more copy of the LSA to process. The additional copies add no useful information, yet they consume CPU cycles at both the transmitter and the receiver.
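The backlog effect can be illustrated with a simple single-CPU queue model. This is a sketch under stated assumptions: the function names, the fixed per-LSA processing time, and the numbers are hypothetical, and the model deliberately ignores the extra work created by the retransmitted copies themselves, which would make the backlog worse.

```python
def waits_in_backlog(arrival_times, service_time):
    """Return, for each LSA, the time from arrival until processing completes.

    Assumes one CPU that handles LSAs in arrival order, each taking
    service_time seconds (a hypothetical, fixed cost).
    """
    waits = []
    cpu_free_at = 0
    for arrival in arrival_times:
        start = max(arrival, cpu_free_at)   # wait if the CPU is still busy
        cpu_free_at = start + service_time
        waits.append(cpu_free_at - arrival)
    return waits


def triggers_retransmission(arrival_times, service_time, rxmt_interval):
    """Flag each LSA whose wait exceeds the sender's retransmission interval."""
    waits = waits_in_backlog(arrival_times, service_time)
    return [wait > rxmt_interval for wait in waits]


if __name__ == "__main__":
    burst = [0] * 10               # ten LSAs arriving at the same instant
    light = [0, 2, 4, 6, 8]        # LSAs spaced wider than the service time
    print(sum(triggers_retransmission(burst, 1, 5)))
    print(sum(triggers_retransmission(light, 1, 5)))
```

In this model, a burst of ten LSAs with a one-second processing cost leaves the last five waiting longer than a five-second interval, so each of those is retransmitted even though none was lost. The lightly loaded case triggers no retransmissions.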
A further problem arises if buffer resources at the receiving end are exhausted. In this case, LSAs may be dropped before the CPU ever sees them. Retransmission under such conditions can be useful, since it is what ensures reliable flooding. However, if it is performed in an uncontrolled fashion, most of the retransmissions may be wasted work by the sender, because the receiver keeps dropping them.
In all known implementations of OSPF, the retransmission timer is fixed at system initialization. The timer therefore does not respond to dynamic load changes, particularly heavy loads, which leads to a buildup of CPU load at both the transmitter and the receiver. The shorter the retransmission timer, the more retransmissions are likely to occur as the system becomes loaded, which in turn adds further load. The longer the retransmission timer, the longer it takes for an LSA to be flooded across the network if it is dropped somewhere along the way, and the longer it takes for routing in the network to converge. Thus, a given retransmission timer value may work well in some situations and poorly in others.
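The trade-off between short and long timers can be made concrete with a small calculation. The sketch below is illustrative only: the function name, the two scenarios, and the numbers are assumptions chosen for the example. It compares, for one fixed timer value, the redundant copies sent to a slow but working receiver against the delay before a dropped LSA is first resent.

```python
import math


def timer_tradeoff(rxmt_interval, queue_wait):
    """Return (redundant_copies, recovery_delay) for a fixed timer value.

    redundant_copies: extra copies sent while the receiver takes queue_wait
        seconds to process and acknowledge the first copy (nothing is lost).
    recovery_delay: seconds until the first retransmission if the first
        copy is dropped instead, which is simply the timer value.
    """
    copies_sent = max(1, math.ceil(queue_wait / rxmt_interval))
    redundant_copies = copies_sent - 1
    recovery_delay = rxmt_interval
    return redundant_copies, recovery_delay


if __name__ == "__main__":
    for interval in (1, 5, 20):
        redundant, recovery = timer_tradeoff(interval, queue_wait=12)
        print(f"interval={interval:>2}s  redundant copies={redundant:>2}  "
              f"recovery delay={recovery:>2}s")
```

With a receiver that needs 12 seconds to respond, a 1-second timer produces 11 redundant copies but recovers from a drop in 1 second, whereas a 20-second timer produces none but waits 20 seconds before resending a dropped LSA.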