Due to the rapid growth of data, “big data” problems such as graph traversal are becoming increasingly important. The scale of these problems makes it infeasible to fit the complete application data into a single computing node. Instead, the massive application data is partitioned over many computing nodes, such that each computing node owns a portion of the total application data and is responsible for processing it.
To execute the application successfully, the computing nodes must exchange messages with one another; however, many of these messages are duplicates. These duplicate messages consume computing resources, including bandwidth on the network connecting the computing nodes. Nevertheless, the use of multiple computing nodes to execute a single application with massive application data remains popular.
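By way of illustration only, the following minimal sketch simulates a breadth-first graph traversal over partitioned data and counts the messages that would cross between computing nodes. It rests on assumptions that are not taken from the text above: vertices are assigned to computing nodes by a simple modulo rule (the hypothetical `owner` function), one message is sent for every edge whose endpoints are owned by different computing nodes, and a message is counted as a duplicate when the same computing node has already sent a message for the same target vertex in the same traversal level. The names `owner` and `bfs_message_counts` are likewise hypothetical.

```python
from collections import defaultdict


def owner(vertex, num_nodes):
    """Assumed partitioning rule: vertex v is owned by computing node v % num_nodes."""
    return vertex % num_nodes


def bfs_message_counts(edges, source, num_nodes):
    """Simulate a level-synchronous BFS over a partitioned undirected graph.

    Returns (total_messages, duplicate_messages), where a message is one
    cross-node edge visit, and a duplicate is a message whose (sending node,
    target vertex) pair was already sent during the same level.
    """
    # Build an undirected adjacency list.
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    visited = {source}
    frontier = [source]
    total = 0
    duplicates = 0

    while frontier:
        sent = set()  # (sending node, target vertex) pairs seen this level
        next_frontier = []
        for u in frontier:
            for v in adj[u]:
                src, dst = owner(u, num_nodes), owner(v, num_nodes)
                if src != dst:
                    # The edge crosses a partition boundary: one message.
                    total += 1
                    if (src, v) in sent:
                        duplicates += 1
                    else:
                        sent.add((src, v))
                if v not in visited:
                    visited.add(v)
                    next_frontier.append(v)
        frontier = next_frontier

    return total, duplicates
```

For example, with two computing nodes and the edges `[(0, 2), (0, 4), (2, 1), (4, 1)]`, vertices 0, 2, and 4 fall on one computing node and vertex 1 on the other. Calling `bfs_message_counts([(0, 2), (0, 4), (2, 1), (4, 1)], 0, 2)` returns `(4, 1)`: vertices 2 and 4 each send a message for vertex 1 in the same level, and the second of those messages is a duplicate under the definition assumed here.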