In general-purpose operating systems, low-latency media streaming is difficult to achieve, despite some recent progress. What is critical is not the average latency, but the latency within which some high percentage (>>99%) of the material is rendered. An underlying issue is that streaming media applications, as typically implemented today, require multiple, separately clocked processes.
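The difference between average latency and high-percentile latency can be made concrete with a small sketch. The latency values below are hypothetical, chosen only for illustration, and the nearest-rank percentile helper is one common definition among several.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical data: 990 frames rendered after 20 ms, 10 stragglers after 200 ms.
latencies_ms = [20] * 990 + [200] * 10

mean_ms = sum(latencies_ms) / len(latencies_ms)
p99_ms = percentile(latencies_ms, 99)
p995_ms = percentile(latencies_ms, 99.5)

print(f"mean={mean_ms} ms, p99={p99_ms} ms, p99.5={p995_ms} ms")
```

In this made-up data set the mean stays close to 20 msec, while the 99.5th percentile is ten times higher. A system dimensioned for the average would therefore glitch on the stragglers.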
Consider Voice over Internet Protocol (VoIP) speech communication. Typically, a soundcard running on its own clock consumes sound samples from a rendering buffer. Separately, the soundcard clock or, alternatively, an Operating System (OS) clock triggers a jitter management routine at a fixed period, n (typically, n = 20 msec). This routine consumes data from a jitter buffer and places the resulting sound samples in the rendering buffer. The rendering buffer therefore needs to be a swapped double-buffer, to ensure that the soundcard is not locked out of reading samples while the jitter manager is writing samples, and vice versa. Separately, a network interface controller (NIC) receives an incoming speech packet at some unknown time. This arrival triggers a copy of the packet from the NIC into the previously mentioned jitter buffer. Alternatively, in some operating systems, a NIC buffer is instead copied by a separate process into the jitter buffer. The OS may or may not perform a real copy, and the buffers may or may not introduce double-buffer latency. (Typically, there is actually another clock in the NIC, listening to the physical layer on the wire and decoding material into an internal NIC buffer. We can ignore this clock and buffer because they introduce little latency and are unavoidable.)
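The pipeline described above can be sketched as a small deterministic simulation. This is a simplification under stated assumptions, not an implementation of any particular OS or soundcard driver: a single simulated clock stands in for the separately clocked processes, one packet carries exactly one 20 msec frame, and the names `DoubleBuffer`, `simulate`, and the `"concealment"` placeholder are hypothetical.

```python
from collections import deque

FRAME_MS = 20  # period n of the jitter management routine (the typical value from the text)

class DoubleBuffer:
    """Swapped double-buffer: the writer fills `back` while the reader drains `front`."""
    def __init__(self):
        self.front = None
        self.back = None

    def write(self, frame):
        self.back = frame

    def swap(self):
        self.front, self.back = self.back, None

def simulate(arrivals_ms, num_frames):
    """arrivals_ms maps packet sequence number -> arrival time at the NIC in ms.
    Packet k is assumed (for illustration) to be due at tick k + 1."""
    pending = deque(sorted(arrivals_ms.items(), key=lambda kv: kv[1]))
    jitter_buffer = {}
    render = DoubleBuffer()
    played = []
    for tick in range(1, num_frames + 2):
        now_ms = tick * FRAME_MS
        # Soundcard: swap buffers and play what was prepared on the previous tick.
        render.swap()
        if render.front is not None:
            played.append((now_ms, render.front))
        # NIC: copy every packet that has arrived by now into the jitter buffer.
        while pending and pending[0][1] <= now_ms:
            seq, _ = pending.popleft()
            jitter_buffer[seq] = f"frame{seq}"
        # Jitter manager: fetch the frame that is due, or conceal the loss.
        seq_due = tick - 1
        if seq_due < num_frames:
            render.write(jitter_buffer.pop(seq_due, "concealment"))
    return played

# Hypothetical arrivals: packet 2 reaches the NIC at 70 ms, after its 60 ms tick.
played = simulate({0: 5, 1: 25, 2: 70, 3: 62}, num_frames=4)
print(played)
```

In this sketch, packet 2 misses its tick by 10 msec and is replaced by concealment, even though it arrives before the concealed frame is actually played at 80 msec. The extra period between writing and playing a frame is the double-buffer latency mentioned in the text.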
Calling a jitter management routine frequently is problematic because of the inherent tension in the technique. One wishes to minimize latency, so for that reason, the jitter management routine should be called as late as possible, to allow for all last-moment packet arrivals. However, one also wishes to maximize the smoothness of playback, so for that reason, the jitter management routine should be called as early as possible. If the routine is called too early, material that has arrived in time for rendering will be considered late. If the routine is called too late, every glitch in the clocking of routines will result in audible artifacts. Unfortunately, the OS scheduler is responsible for reacting within this narrow time slice. What is needed is a novel method for avoiding the last-moment callback that is otherwise needed to give a packet as much of a chance as possible to arrive “in time,” while also minimizing the overall latency.
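The tension between calling the routine early and calling it late can be illustrated with a toy model. Everything here is an assumption made for illustration: `margin_ms` is how long before the render deadline the routine is scheduled, packets are judged against the scheduled call time, scheduler delays are treated as independent of packet arrivals, and the sample numbers are invented.

```python
def evaluate_margin(margin_ms, arrival_slack_ms, scheduler_delay_ms):
    """Count the two failure modes for a given scheduling margin.

    arrival_slack_ms: per packet, how long before its render deadline it arrived.
    scheduler_delay_ms: per call, how late the OS actually ran the routine.
    """
    # Called too early: packets that arrive after the call are treated as late.
    late_packets = sum(1 for slack in arrival_slack_ms if slack < margin_ms)
    # Called too late: a scheduler delay larger than the margin misses the deadline.
    glitches = sum(1 for delay in scheduler_delay_ms if delay > margin_ms)
    return late_packets, glitches

# Hypothetical measurements, in ms.
arrival_slack_ms = [30, 12, 8, 3, 1]
scheduler_delay_ms = [0, 1, 2, 6, 0]

results = {m: evaluate_margin(m, arrival_slack_ms, scheduler_delay_ms) for m in (1, 5, 10)}
for margin, (late, glitches) in results.items():
    print(f"margin={margin} ms: late packets={late}, glitches={glitches}")
```

With these invented numbers, a 1 msec margin loses no packets but glitches twice, while a 10 msec margin never glitches but discards three packets that would have been in time for rendering. No single margin removes both failure modes.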