A basic pipeline usage model applies when a compute element is required to receive a message, modify some portion of it, and then send the message to another target (e.g., a node) within the systolic array. The overhead of message passing burdens the processor, which must interact with hardware direct memory access (DMA) engines or message passing interfaces, leaving fewer cycles available for other work. In highly optimized pipelines, delays in message passing may significantly reduce performance.
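The receive, modify, and forward pattern described above can be sketched in software as follows. This is a minimal illustration only, not the hardware mechanism itself: the `Message` type, the `pipeline_stage` and `useful_fraction` helpers, the queue-based inbox and outbox, and the cycle counts are all hypothetical names and values chosen for the example.

```python
from collections import deque
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Message:
    target: int     # hypothetical: id of the node that should receive the message
    payload: tuple  # hypothetical: message body as a tuple of integers


def pipeline_stage(inbox, outbox, next_target):
    """Receive one message, modify a portion of it, and forward it."""
    msg = inbox.popleft()                                  # receive
    new_payload = (msg.payload[0] + 1,) + msg.payload[1:]  # modify one field only
    # send: re-address the message to the next target in the array
    outbox.append(replace(msg, target=next_target, payload=new_payload))


def useful_fraction(compute_cycles, overhead_cycles):
    """Fraction of processor cycles left for work other than message handling."""
    return compute_cycles / (compute_cycles + overhead_cycles)


# One message passes through a single stage and is forwarded to node 2.
inbox = deque([Message(target=1, payload=(10, 20, 30))])
outbox = deque()
pipeline_stage(inbox, outbox, next_target=2)
```

Under these assumed numbers, a stage that spends 30 cycles on computation and 10 cycles on message handling per message keeps only `useful_fraction(30, 10)` of its cycles for other work, which illustrates why per-message overhead matters in a tightly tuned pipeline.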