The present invention relates to the use of multicasting and in particular to a method and system for distributing reliable in-order events using multicast.
Computers may be connected to form a variety of networks. A network is generally a configuration of computers, software, and communication devices connected for information interchange. Devices connected to the network may be workstations, servers, bridges, routers, and various other devices. Typically, computer application programs use a variety of network protocols to communicate with each other over networks. Such network protocols may include, for example, asynchronous transfer mode (ATM) protocol or transmission control protocol/internet protocol (TCP/IP).
A networking system may be divided into a number of links. A link may be a local area network (LAN) with each LAN capable of supporting several computers. In addition, components of a networking system may be connected remotely to form wide area networks (WAN).
Typically, a network protocol consists of up to seven layers: a physical layer, a datalink layer, a network layer, a transport layer, a session layer, a presentation layer, and an application layer. Information packets are formed in the higher layers and are passed down through the lower layers until they reach the network layer. In the network layer, headers are added to the information packets and may include, for example, destination addresses to which the packets are to be sent. The packets are passed to the physical layer, which transmits the packets onto the bus. The system forwards the packets link-to-link until they are retrieved at their destination according to the destination address contained in the header. Depending on the protocol used, the packets may be frames or cells.
Typically, in an information processing subsystem of the networking system, multiple hardware modules are used to distribute information. One or more of these modules may be responsible for critical functions which need to be available at all times for the proper functioning of the networking system. An example of such a module may be a central processing module that segments information into frames or cells and/or reassembles frames or cells into information. Typically, the information is contained within a packet payload. An example of a central processing module may be an ATM processing module. ATM takes any received information, whether it be data, voice, or image, and at any rate, segments the information into predetermined fixed-length packets (i.e., payloads), and attaches a header to each payload so that each payload may be sent to its destination in the networking system. In the distribution mode, the ATM central processing module processes information into cells and the cells are sent to the various distribution modules to be transmitted over the networking system. A failure of the ATM processing module generally results in the failure of the link for that portion of the networking system.
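The segmentation and reassembly described above may be illustrated with a minimal sketch. It assumes the standard ATM cell layout of a 5-byte header followed by a 48-byte payload. The header contents are a simplified placeholder rather than the real ATM header fields, and the names `segment`, `reassemble`, and `channel_id` are hypothetical.

```python
from typing import List

CELL_PAYLOAD_SIZE = 48  # bytes of payload in a standard ATM cell
HEADER_SIZE = 5         # bytes of header in a standard ATM cell


def segment(data: bytes, channel_id: int) -> List[bytes]:
    """Split data into fixed-length payloads and attach a header to each."""
    cells = []
    for offset in range(0, len(data), CELL_PAYLOAD_SIZE):
        payload = data[offset:offset + CELL_PAYLOAD_SIZE]
        # Pad the final payload so that every cell has the same length.
        payload = payload.ljust(CELL_PAYLOAD_SIZE, b"\x00")
        # Placeholder header (assumption): 4-byte channel id plus one zero byte.
        header = channel_id.to_bytes(4, "big") + b"\x00"
        cells.append(header + payload)
    return cells


def reassemble(cells: List[bytes], original_length: int) -> bytes:
    """Strip the headers, join the payloads, and drop the padding."""
    data = b"".join(cell[HEADER_SIZE:] for cell in cells)
    return data[:original_length]
```

In this sketch a 100-byte message yields three 53-byte cells, the last of which carries 44 bytes of padding.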
To prevent such failure, hardware modules with critical functions, termed server modules, may be supported by one or more hardware redundant modules with similar functionality. Typically, one of the modules, termed a primary server module, is chosen to actively provide the critical functions in a networking system. One or more additional modules with similar functions, termed secondary modules, are present as backup for the primary module. If the primary module fails, the secondary module detects the failure condition and takes over to become the primary module.
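The takeover by a secondary module may be sketched as follows. The text above does not state how the failure condition is detected, so the heartbeat check and the `max_missed` threshold are assumptions, and all class and method names are hypothetical.

```python
class ServerModule:
    def __init__(self, name: str, role: str):
        self.name = name
        self.role = role    # "primary", "secondary", or "failed"
        self.alive = True   # stands in for a real health check


class RedundantPair:
    def __init__(self, primary: ServerModule, secondary: ServerModule,
                 max_missed: int = 3):
        self.primary = primary
        self.secondary = secondary
        self.max_missed = max_missed  # assumed failure-detection threshold
        self.missed = 0

    def heartbeat_tick(self) -> ServerModule:
        """Run once per heartbeat interval; return the current primary."""
        if self.primary.alive:
            self.missed = 0
            return self.primary
        self.missed += 1
        if self.missed >= self.max_missed and self.secondary is not None:
            # The secondary detects the failure and takes over as primary.
            self.primary.role = "failed"
            self.secondary.role = "primary"
            self.primary, self.secondary = self.secondary, None
            self.missed = 0
        return self.primary
```

Requiring several consecutive missed heartbeats before switching is one way to avoid a takeover on a single lost heartbeat; the threshold of three is illustrative only.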
In distributed real-time networking systems, events taking place on the network need to be delivered to all the connected modules (clients) in the system that need to be aware of or use the event. Typically, there are several clients for a particular networking event. Such events may include, for example, an indication that a module has become connected to the system or is no longer connected to the system, an indication that a server is switching from primary to secondary status or from secondary to primary status, an indication that a change in a system configuration is required (for example, resetting the time of day), and an indication that a device connected to the network is partially available. In addition, in a fault tolerant network system, the networking events need to be delivered reliably and, in most cases, in the order of occurrence of the events.
In many networking systems, reliability may be achieved using retransmission and pair-wise unicast between a server and the clients. These systems typically use an acknowledgment routine to indicate that the event has been sent and received properly. In a system with n clients, the total number of messages needed to distribute an event is therefore 2n (one message to each client and one acknowledgment back from each client). The number of messages exceeds 2n if retransmissions are needed (due to loss of messages). However, retransmission may not satisfy the real-time needs of event delivery. Real-time needs may be achieved using a multicast protocol. Multicast protocol or multicasting is a one-to-many transmission that provides a method for a server to send packets to a group of client modules within the networking system. By using multicasting, a single message is sufficient to transmit information (an event) to all clients. However, in order to assure reliability, the number of messages in the best case is still n+1 (one outbound message and n return messages) in an n-client system.
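The message counts given above may be checked with a short calculation. The function names are hypothetical, and the `extra_messages` parameter is an assumption that simply stands for any additional messages caused by retransmission.

```python
def unicast_message_count(n_clients: int, extra_messages: int = 0) -> int:
    """Pair-wise unicast: one event message and one acknowledgment per client."""
    return 2 * n_clients + extra_messages


def multicast_message_count(n_clients: int) -> int:
    """Acknowledged multicast, best case: one outbound message, n acknowledgments."""
    return 1 + n_clients


for n in (1, 10, 100):
    print(n, unicast_message_count(n), multicast_message_count(n))
```

For n = 100, acknowledged unicast needs 200 messages and acknowledged multicast needs 101, so the acknowledgments account for nearly all of the remaining multicast traffic.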
In Internet Protocol Multicast (IPM), a best effort delivery service is provided. IPM assumes that all clients can handle the multicast packets being sent. However, IPM does not provide a reliable real-time system: it does not guarantee that all sites receive the packets, and performance degrades as the number of clients grows. In addition, IPM does not guarantee that packets do not overrun slow clients. Within a multicast group, some clients may be slow to receive and process events while others are fast. The server must therefore find a transmission rate that accommodates all of the clients during a multicast transmission so that slow clients reliably receive events in order. Further, IPM does not provide a mechanism for the detection of lost or corrupted event packets.
A method and system for maintaining reliable in-order distribution using multicast are described. A trigger event is sensed and, in response to the trigger event, a multicast payload containing a plurality of queued events and a packet sequence number is created. Further, a current multicast packet containing the multicast payload is transmitted to at least one receiving device. The receiving device receives the multicast packet from the transmitting device and examines a transmission count of each of the plurality of queued events. In addition, a plurality of current queued events and a plurality of missed transmission queued events of the plurality of queued events are processed.
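One possible reading of this summary may be sketched as follows. The sketch assumes that each queued event is carried in several consecutive packets, that its transmission count records how many times it has been sent, and that the receiver uses the gap in packet sequence numbers to decide which events it has not yet seen. These rules, the constant `MAX_TRANSMISSIONS`, and all class and method names are assumptions for illustration and are not taken from the description.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

MAX_TRANSMISSIONS = 3  # assumed number of packets that carry each event

Packet = Tuple[int, List[Tuple[str, int]]]  # (sequence number, [(event, count)])


@dataclass
class QueuedEvent:
    name: str
    transmission_count: int = 0


@dataclass
class Sender:
    queue: List[QueuedEvent] = field(default_factory=list)
    sequence_number: int = 0

    def post(self, name: str) -> None:
        self.queue.append(QueuedEvent(name))

    def on_trigger(self) -> Packet:
        """Build a multicast packet from the queued events and a sequence number."""
        self.sequence_number += 1
        for event in self.queue:
            event.transmission_count += 1
        payload = [(e.name, e.transmission_count) for e in self.queue]
        # Retire events that have now been sent the maximum number of times.
        self.queue = [e for e in self.queue
                      if e.transmission_count < MAX_TRANSMISSIONS]
        return (self.sequence_number, payload)


@dataclass
class Receiver:
    last_sequence: int = 0
    delivered: List[str] = field(default_factory=list)

    def on_packet(self, packet: Packet) -> None:
        sequence, payload = packet
        if sequence <= self.last_sequence:
            return  # duplicate or out-of-date packet
        missed = sequence - self.last_sequence - 1  # packets lost since last one
        # The payload is ordered oldest event first, so delivery stays in order.
        for name, count in payload:
            # count == 1: current event; 2..missed+1: event from a missed packet.
            if count <= missed + 1:
                self.delivered.append(name)
        self.last_sequence = sequence
```

Under these assumptions a receiver that loses one packet recovers the missed event from the next packet without asking for a retransmission. If more than `MAX_TRANSMISSIONS - 1` consecutive packets are lost, some events can no longer be recovered this way, and the sketch does not handle that case.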