Demands by individuals, researchers, and enterprises for increased compute performance and storage capacity of computing devices have resulted in various computing technologies developed to address those demands. For example, compute intensive applications, such as enterprise cloud-based applications (e.g., software as a service (SaaS) applications), data mining applications, data-driven modeling applications, scientific computation problem solving applications, etc., typically rely on complex, large-scale computing environments (e.g., high-performance computing (HPC) environments, cloud computing environments, etc.) to execute the compute intensive applications, as well as store voluminous amounts of data. Such large-scale computing environments can include tens of hundreds (e.g., enterprise systems) to tens of thousands (e.g., HPC systems) of multi-processor/multi-core network nodes connected via high-speed interconnects (e.g., fabric interconnects in a unified fabric).
To carry out such processor intensive computations, various computing technologies have been implemented to distribute workloads across different network computing devices, such as parallel computing, distributed computing, etc. In support of such distributed workload operations, multiprocessor hardware architecture (e.g., multiple multi-core processors that share memory) has been developed to facilitate multiprocessing (i.e., coordinated, simultaneous processing by more than one processor) across local and remote shared memory systems using various parallel computer memory design architectures, such as non-uniform memory access (NUMA), and other distributed memory architectures.
As a result of the distributed computing architectures, information for a given application can be stored across multiple interconnected computing nodes. As such, retrieving the distributed information is often performed by broadcasting request messages via multicast techniques (e.g., one-to-many or many-to-many message distribution) capable of sending messages addressed to a group of target computing devices simultaneously. However, as distributed systems grow in size and scale, bandwidth and hardware (e.g., memory, processing power, etc.) availability can become strained.