A reliable peer-to-peer communication system is a communication system in a distributed data processing environment that provides reliable data communication services. Distributed applications, such as instances of a distributable application executing on different nodes or machines of a data processing environment, utilize these data communication services for providing their functionality. Client systems, such as an application that is a client of a distributed application instance, also inherently depend on these data communication services to accomplish their desired functions with the distributed applications.
For example, reliable data communication services include functions commonly required by distributed applications, such as reliable message delivery to all members of a domain, global in-order delivery of messages or sequences of messages, and synchronization barriers. Such services are used by currently available distributed applications.
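As an illustration only, in-order delivery at a single receiving node can be sketched as follows. The sketch assumes that some component, such as a hypothetical sequencer, has already assigned each message a consecutive sequence number starting at zero; the class name InOrderReceiver and its methods are illustrative and are not taken from any particular system. Achieving a global order across all members of a domain would additionally require agreement on those sequence numbers, which the sketch does not address.

```python
class InOrderReceiver:
    """Delivers messages in sequence-number order, buffering early arrivals."""

    def __init__(self):
        self.next_seq = 0    # sequence number expected next
        self.pending = {}    # out-of-order messages held back, keyed by seq
        self.delivered = []  # payloads handed to the application, in order

    def receive(self, seq, payload):
        # Ignore duplicates: already delivered or already buffered.
        if seq < self.next_seq or seq in self.pending:
            return
        self.pending[seq] = payload
        # Deliver every message that is now contiguous with the last one.
        while self.next_seq in self.pending:
            self.delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1


if __name__ == "__main__":
    receiver = InOrderReceiver()
    # Messages arrive out of order, and one arrives twice.
    for seq, payload in [(2, "c"), (0, "a"), (1, "b"), (1, "b")]:
        receiver.receive(seq, payload)
    print(receiver.delivered)
```

The receiver holds back message 2 until messages 0 and 1 have arrived, and it discards the repeated copy of message 1, so the application observes each message once and in sequence order.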
A domain is a collection or a set of data processing systems connected by a network, data bus and/or shared memory or storage that participate in a given distributed data processing environment. For example, a data processing environment may include five computers, three of which may host instances of a distributed application. The three computers, also known as hosts or nodes, which host the distributed application instances, form a domain that has to provide the aforementioned reliable data communication services.
A barrier is a type of synchronization method. A barrier for a group of threads or processes is a stopping point at which each thread or process subject to the barrier must stop executing until the other threads or processes in the group catch up, or synchronize, at the barrier. Only then can the threads or processes resume executing. Various nodes in a domain, and the distributed application instances executing thereon, have to remain synchronized with each other to provide their functions in a consistent manner. In some cases, additional functions, such as multi-phase protocols with global barriers, zoning (creation of sub-domains), and distributed locking, may be offered by data communication services in a distributed data processing environment to satisfy the synchronization needs of the distributed applications.
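As an illustration only, the barrier behavior described above can be sketched for threads within a single process using the threading.Barrier class of the Python standard library. The function name run_barrier_demo and the event log are assumptions made for this example; a barrier spanning the nodes of a domain would instead be implemented over the data communication services and is not shown here.

```python
import threading


def run_barrier_demo(num_workers=3):
    """Run num_workers threads that all stop at one barrier, then resume."""
    barrier = threading.Barrier(num_workers)
    events = []              # ordered log of ("before" | "after", worker_id)
    lock = threading.Lock()  # protects the shared event log

    def worker(worker_id):
        with lock:
            events.append(("before", worker_id))
        # Block here until all num_workers threads have reached the barrier.
        barrier.wait()
        with lock:
            events.append(("after", worker_id))

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return events


if __name__ == "__main__":
    for phase, worker_id in run_barrier_demo():
        print(phase, worker_id)
```

Because no thread passes the barrier until every thread has reached it, every "before" entry in the log precedes every "after" entry, although the order of the workers within each phase may vary from run to run.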
Reliable peer-to-peer communication is a type of distributed data communication service in a distributed data processing environment that seeks to provide a threshold level of reliability in message delivery between the peer nodes in the distributed data processing environment. Many distributed applications use reliable peer-to-peer communication to provide a particular level of performance, functionality, stability, or security.
For example, distributed transaction systems require reliable peer-to-peer communication to ensure transaction integrity. As another example, distributed databases and distributed file systems require reliable peer-to-peer communication to ensure data consistency across the various data instances or partitions. Clusters of data processing systems in high availability (HA) data processing environments rely on such peer-to-peer communication to maintain the desired level of system availability, load balancing, and system performance. Logistics, telecommunication, and industrial control systems are some examples of types of distributed applications that require reliable peer-to-peer communication services to deliver their respective functions reliably.