The execution time of an application flow is governed by several parameters, including packaging cost (at the end host machines), transmission time, and time spent on switches (per-packet overhead). The time spent on switches commonly accounts for a significant share of the total, and can include actions such as look-up, moving buffers from an input port to an output port, and delays between receiving a data packet and transmitting that data packet. Additionally, the throughput of a switch is directly proportional to the amount of data transferred per packet and to the number of packets processed. Accordingly, a need exists to reduce per-flow end-to-end latency via management of switching devices.
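
The relationship among these parameters can be illustrated with a rough sketch. It assumes a simple additive model in which the three components do not overlap, and every function name, parameter name, and numeric value is hypothetical and chosen only for illustration; none of them comes from a measured system.

```python
import math


def flow_time_seconds(flow_bytes, payload_bytes, link_bps, hops,
                      per_packet_switch_s, packaging_s):
    """Estimate flow execution time under an assumed additive model.

    All parameters are illustrative assumptions:
      flow_bytes          total application data in the flow
      payload_bytes       data carried per packet
      link_bps            link rate in bits per second
      hops                number of switches on the path
      per_packet_switch_s per-packet overhead at one switch
                          (look-up, buffer move, receive-to-transmit delay)
      packaging_s         packaging cost at the end host machines
    """
    packets = math.ceil(flow_bytes / payload_bytes)
    transmission_s = flow_bytes * 8 / link_bps
    # Switch time grows with the packet count, not with the byte count.
    switching_s = packets * hops * per_packet_switch_s
    return packaging_s + transmission_s + switching_s


def switch_throughput_bps(payload_bytes, packets_per_second):
    """Throughput as the product of data per packet and packets processed."""
    return payload_bytes * 8 * packets_per_second


if __name__ == "__main__":
    # Same flow and path, two different payload sizes (hypothetical values).
    for payload in (500, 1000):
        t = flow_time_seconds(1_000_000, payload, 1e9, 3, 1e-6, 0.0)
        print(payload, t)
```

Under this assumed model, halving the payload size doubles the number of packets and therefore doubles the switch component, while the transmission component is unchanged. This is why per-packet overhead at the switches is a natural target for reducing per-flow latency.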