Field
The described technology relates to optimizing network performance and, more particularly, to a system and method for optimizing network performance using data transmission/reception software in a computing environment that uses distributed processing.
Description of the Related Art
In a high-performance computing environment, which is typically based on distributed computing, data must be transmitted and received between distributed computing nodes, and the efficiency of this data transmission and reception is therefore an important factor in distributed computing.
A representative method by which each computing node transmits and receives data in the distributed computing environment is MPI (Message Passing Interface), and a variety of open-source software libraries exist for implementing MPI.
Representative MPI libraries include MVAPICH and MPICH, and most MPI libraries use an eager protocol in combination with a rendezvous protocol as data transmission/reception protocols.
The eager protocol is suitable for rapidly transmitting relatively small data because it requires no handshake process at the time of data transmission/reception. The rendezvous protocol, by contrast, performs a handshake process before the data is transferred and is therefore suitable for accurately transmitting relatively large data.
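As a non-limiting illustration, the difference between the two protocols may be sketched by listing the messages each one exchanges. The function names and the message labels RTS (request to send) and CTS (clear to send) are assumptions chosen for this sketch; they are not taken from any particular MPI library, and actual implementations may exchange additional control messages.

```python
def eager_messages(payload_size: int) -> list[str]:
    """Eager: the sender pushes the data immediately, with no handshake."""
    return [f"sender -> receiver: DATA ({payload_size} bytes)"]


def rendezvous_messages(payload_size: int) -> list[str]:
    """Rendezvous: a handshake precedes the data transfer.

    RTS and CTS are illustrative labels for the handshake messages.
    """
    return [
        f"sender -> receiver: RTS (announces {payload_size} bytes)",
        "receiver -> sender: CTS (receive buffer is ready)",
        f"sender -> receiver: DATA ({payload_size} bytes)",
    ]


if __name__ == "__main__":
    for line in eager_messages(512):
        print(line)
    for line in rendezvous_messages(4 * 1024 * 1024):
        print(line)
```

In this sketch the eager path costs one message, while the rendezvous path costs two additional control messages before any data moves, which is why the handshake overhead matters most for small data.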
Existing MPI libraries have used a fixed data size as the criterion for switching between the eager protocol and the rendezvous protocol. That is, the eager protocol has been used to exchange data smaller than the fixed data size, and the rendezvous protocol has been used to exchange data larger than the fixed data size.
FIG. 1 is a flowchart illustrating a method for setting a protocol in order to use eager/rendezvous protocols in a complementary manner in such a general MPI library.
Whenever there is a data transmission request, the size of the data to be transmitted may be compared with a preset fixed threshold value, so that the eager protocol may be used when the size of the data is smaller than the threshold value, and the rendezvous protocol may be used when the size of the data is larger than the threshold value. The threshold value used in the comparison may be a fixed value that does not take into account the characteristics of the network or its topology.
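The comparison described above may be sketched as follows. This is a minimal sketch under stated assumptions: the function name, the constant names, and the 16 KiB threshold are hypothetical and do not reflect the default of any specific MPI library. The text above does not state which protocol is used when the size equals the threshold; the sketch assumes the rendezvous protocol in that case.

```python
EAGER = "eager"
RENDEZVOUS = "rendezvous"

# Hypothetical fixed threshold; it is set once and ignores network
# characteristics and topology.
FIXED_THRESHOLD_BYTES = 16 * 1024


def select_protocol(data_size: int, threshold: int = FIXED_THRESHOLD_BYTES) -> str:
    """Choose a protocol for one transmission request by data size alone."""
    if data_size < threshold:
        return EAGER
    # Assumption: a size equal to the threshold is treated as "large".
    return RENDEZVOUS


if __name__ == "__main__":
    for size in (256, 16 * 1024, 1024 * 1024):
        print(size, select_protocol(size))
```

Because the threshold is a constant, the same decision is made for a given data size on every network, regardless of its latency, bandwidth, or topology.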
However, in the high-performance computing environment, the optimal protocol switching criterion should change depending on the network technology that connects the distributed nodes and on the network topology that is constituted.
Nevertheless, the existing MPI libraries have a fixed protocol switching criterion that does not consider network characteristics and that depends on manual setting. Therefore, an optimal switching criterion that follows changes in network performance cannot be set.
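To illustrate why a single fixed value cannot suit every network, a simplified cost model may be considered. The model is an assumption introduced only for this sketch and is not part of the described libraries: the eager protocol is assumed to cost one network latency plus the wire time plus one extra buffer copy on the receiving side, and the rendezvous protocol is assumed to cost three network latencies (two handshake messages and the data message) plus the wire time, with no extra copy. All function and parameter names are hypothetical.

```python
def eager_time(size: float, latency: float, bandwidth: float, copy_rate: float) -> float:
    """Assumed eager cost: one latency, wire time, and one extra copy."""
    return latency + size / bandwidth + size / copy_rate


def rendezvous_time(size: float, latency: float, bandwidth: float) -> float:
    """Assumed rendezvous cost: three latencies and wire time, no extra copy."""
    return 3 * latency + size / bandwidth


def crossover_size(latency: float, copy_rate: float) -> float:
    """Size at which both assumed costs are equal.

    Setting eager_time == rendezvous_time gives
    size / copy_rate == 2 * latency, so size == 2 * latency * copy_rate.
    """
    return 2 * latency * copy_rate


if __name__ == "__main__":
    copy_rate = 10e9  # bytes per second, hypothetical
    for name, latency in (("low-latency network", 1e-6), ("high-latency network", 50e-6)):
        print(name, crossover_size(latency, copy_rate))
```

Under these assumptions the break-even size scales with the network latency, so a threshold tuned by hand for one interconnect would be far from the break-even point on another.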