Processing and learning from large data sets, such as documents, text, images, and/or other scientific data, for example, have applications in various scientific and engineering disciplines. The scale of these data sets, however, often demand high, and sometimes prohibitive, computational cost. Therefore, multiple processors may be used to employ learning methods on such large data sets. While large clusters of central processing units (CPUs) are commonly used for processing large data sets, graphics processing units (GPUs) provide an alternate, and often more powerful, platform for developing machine learning methods. However, for large corpora, it still may take days, or even months, for one or more GPUs to train a particular model.