In the field of data transmission, an increasing number of application systems use message notifications as their mode of interaction. During transmission, a message may be lost, and thus fail to be delivered timely and reliably, because of a failure in the transmission links or systems. To solve this problem, various message resending mechanisms have been developed for resending unsuccessfully delivered messages. In these mechanisms, further attempts are made to send a message if the first delivery fails. This resolves the problem of undeliverable messages to a large extent. However, for medium-term and long-term communication failures, too many resending attempts may be made, which not only wastes time but also consumes system resources and lowers delivery efficiency.
In the design of network protocols, many retransmission backoff algorithms targeting data transmission failures have been developed. The most popular is the Binary Exponential Backoff (BEB) algorithm. In the following, the Ethernet 802.3 protocol is used as an example to illustrate this algorithm in detail.
The 802.3 protocol uses the CSMA/CD (Carrier Sense Multiple Access with Collision Detection) algorithm to solve the problem of data collision on a shared channel. A sender listens to the channel until the channel becomes idle. Once the channel is idle, the data is sent immediately. If a collision occurs, sending stops immediately, and a resending attempt is made after a retry period. The retry period is calculated using the Binary Exponential Backoff algorithm. The algorithm first divides time into slices, each of length t (51.2 microseconds). At the i-th collision, the retry period is set to a randomly chosen integral multiple of t between 0 and (2^i − 1)*t.
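The retry period calculation described above can be illustrated with a minimal sketch. The function names backoff_slots and retry_period and the parameter names are hypothetical and chosen only for illustration. The slice length t is passed in as a parameter, and the result is in the same unit. The default cap on the exponent (max_exponent=10) reflects the truncation used by 802.3, which is not described in the text above; it can be disabled by passing None.

```python
import random


def backoff_slots(collision_count, rng=random, max_exponent=10):
    """Pick a random number of time slices to wait after the i-th collision.

    Returns an integer drawn uniformly from 0 .. 2**i - 1, where i is
    collision_count (optionally capped at max_exponent).
    """
    if collision_count < 1:
        raise ValueError("collision_count must be at least 1")
    exponent = collision_count
    if max_exponent is not None:
        # Assumption: cap the exponent, as truncated backoff in 802.3 does.
        exponent = min(exponent, max_exponent)
    # randint is inclusive on both ends, so the upper bound is 2**i - 1.
    return rng.randint(0, 2 ** exponent - 1)


def retry_period(collision_count, slot_time, rng=random, max_exponent=10):
    """Return the retry period as an integral multiple of slot_time (t)."""
    return backoff_slots(collision_count, rng, max_exponent) * slot_time


if __name__ == "__main__":
    t = 51.2  # slice length; the unit of the result follows this value
    for i in range(1, 6):
        upper = (2 ** i - 1) * t
        print(f"collision {i}: wait {retry_period(i, t):.1f} (max {upper:.1f})")
```

Because the upper bound doubles with each collision, the expected waiting time grows exponentially with the number of consecutive failures, while the random choice keeps competing senders from retrying at the same instant.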
The Binary Exponential Backoff algorithm allows the retry time interval (period) to increase exponentially when a failure occurs in data transmission. However, such algorithms, which calculate the retry time intervals based on a random failure pattern, are not suitable for the regular failure patterns of Internet intersystem message notifications, and often lead to a longer failure recovery time. Usually, a failure in Internet intersystem message notification is caused by a transmission timeout due to a busy network, a failure in the network system, a failure in the application system, or a scheduled system shutdown for maintenance. Although these failures occur randomly, the recoveries do not; they follow certain patterns because they depend on human detection and maintenance. Moreover, the time for failure recovery over the Internet is normally measured in hours or days. During that period, a large number of unsent messages may accumulate. Therefore, a highly efficient and flexible algorithm and system for message storage, management, and retransmission scheduling are needed. It is desirable to develop a method and a system that are suitable for message retransmission over the Internet while achieving a flexible balance between the occupancy of system resources and the timely recovery of message notification.