The Internet is becoming a fundamental tool used daily in our personal and professional lives. As such, the bandwidth demands placed on the network elements that underpin the Internet are rapidly increasing. To feed this seemingly insatiable hunger for bandwidth, parallel processing techniques have been developed to scale compute power in a cost-effective manner. Effective parallel processing techniques must be capable of scaling in order to keep up with ever-increasing network line rates.
However, scalable processing techniques often introduce a variety of complexities related to the effective sharing of limited resources. One such complexity is ensuring that all available compute resources in the distributed compute environment are efficiently shared and effectively deployed. Ensuring efficient sharing of distributed resources requires scheduling workloads amongst those resources in an intelligent manner, so as to avoid situations where some resources are overburdened while others lie idle. A common problem to which parallel processing techniques and distributed compute environments fall victim is head-of-line blockage. Head-of-line blockages occur when an upstream compute component is overburdened, resulting in a compute blockage or bottleneck, while downstream compute components remain underutilized or even idle, waiting for their turn in a processing pipeline. An effective parallel processing architecture should seek to deliver adequate compute resources in a scalable manner while avoiding head-of-line blockages.
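The head-of-line effect described above can be illustrated with a small timing model. This is a minimal sketch under stated assumptions, not a description of any particular architecture: the function name `pipeline_finish_times`, the cost values, and the simplification that the downstream stage always has a free unit (so it never queues) are all hypothetical and chosen only for illustration.

```python
import heapq


def pipeline_finish_times(upstream_costs, downstream_cost, upstream_units=1):
    """Return the time at which each work item leaves a two-stage pipeline.

    Assumptions (illustrative only):
    - Items enter the upstream stage in FIFO order.
    - Each upstream unit processes one item at a time.
    - The downstream stage always has an idle unit available, so an item
      starts downstream processing the moment it leaves the upstream stage.
    """
    # Min-heap of the times at which each upstream unit next becomes free.
    free_at = [0] * upstream_units
    heapq.heapify(free_at)

    finish_times = []
    for cost in upstream_costs:
        start = heapq.heappop(free_at)      # earliest available upstream unit
        upstream_done = start + cost
        heapq.heappush(free_at, upstream_done)
        finish_times.append(upstream_done + downstream_cost)
    return finish_times


# One expensive item at the head of the queue, followed by three cheap ones.
costs = [10, 1, 1, 1]

# Single upstream unit: the cheap items wait behind the expensive one.
print(pipeline_finish_times(costs, downstream_cost=1, upstream_units=1))
# [11, 12, 13, 14]

# Two upstream units: the cheap items flow past the expensive one.
print(pipeline_finish_times(costs, downstream_cost=1, upstream_units=2))
# [11, 2, 3, 4]
```

With a single upstream unit, the three cheap items cannot reach the idle downstream stage until time 11 or later, even though each needs only one unit of upstream work. Adding a second upstream unit lets them complete by time 4, which is the kind of scheduling outcome an architecture that avoids head-of-line blockages would aim for.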