In recent years, machine learning has been increasingly used to solve a wide range of problems, and the amount of data collected for use in solving such problems has likewise increased. As the amount of such data increases, it can become difficult to store an entire dataset at a single location. As a result, no single computing device may have direct access to the entire dataset needed to solve a problem. Conventional training methods for solving machine learning problems in such environments can include collecting a plurality of training data examples at a centralized location (e.g., a server device), where the data examples can be shuffled and redistributed evenly among the computing devices.
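
By way of illustration only, the conventional collect, shuffle, and redistribute approach described above might be sketched as follows. The function name `shuffle_and_redistribute`, the `seed` parameter, and the sample shards are hypothetical and are not taken from any particular system; the sketch assumes that each device's data fits in an in-memory list and that "evenly" means shard sizes differing by at most one example.

```python
import random

def shuffle_and_redistribute(device_shards, seed=0):
    """Pool examples from every device, shuffle them centrally, and deal them back out evenly."""
    # Step 1: collect all examples at the (hypothetical) centralized location.
    pooled = [example for shard in device_shards for example in shard]
    # Step 2: shuffle the pooled examples with a seeded generator for repeatability.
    rng = random.Random(seed)
    rng.shuffle(pooled)
    # Step 3: deal the examples round-robin so shard sizes differ by at most one.
    num_devices = len(device_shards)
    return [pooled[i::num_devices] for i in range(num_devices)]

# Hypothetical, unevenly distributed data held by three devices.
shards = [[1, 2, 3, 4, 5, 6], [7], [8, 9]]
balanced = shuffle_and_redistribute(shards)
print([len(s) for s in balanced])  # shard sizes: [3, 3, 3]
```

In this sketch every example is first gathered into the single `pooled` list, which reflects the requirement that the centralized location be able to hold the entire dataset.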