The present invention relates to a computer implemented method, data processing system, and computer program product for communicating between local logical partitions or with at least one remote logical partition, and more specifically to establishing, maintaining, and switching among non-redundant and redundant communications between nodes.
InfiniBand®, Remote Direct Memory Access (RDMA) and RDMA over Converged Ethernet (RoCE) are technologies for high speed connectivity between hosts and servers. InfiniBand is a registered trademark of the InfiniBand Trade Association.
There is a large existing base of servers, applications, and clients that are coded to the Transmission Control Protocol/Internet Protocol (TCP/IP) sockets interface for communication. TCP/IP sockets communication can impose excessive processing overhead, particularly in virtualized environments, where techniques can be applied to remove that overhead when passing data among logical partitions. However, errors can occur in connections that are formed using such alternate connection methods, and some form of response or reaction is necessary to compensate for them.
In the current art, Remote Direct Memory Access communication between nodes in a network lacks transparent “end to end” redundancy, load balancing, and dynamic bandwidth rightsizing. When RDMA communication paths fail, they must be recovered either manually, by the operator or system administrator, or programmatically, by the endpoint applications. The latter method requires rewriting existing applications to include RDMA recovery semantics, which is burdensome. Transparent recovery from failures and/or errors in the RDMA fabric could be helpful.