Many computer-implemented applications are computationally intensive and require parallel processing. For example, in machine learning or supervised learning, deep neural network (DNN) training techniques typically involve repeated and intensive multiply-accumulate operations with high-latency memory calls. Resistive Processing Units (RPU) are arrays of memory elements that combine processing and non-volatile memory enabling multiply-accumulate operations to be done efficiently in parallel and without extensive data movement between the processor and memory.