Technical Field
The present invention relates generally to data processing and, in particular, to low latency scheduling on simultaneous multi-threading cores.
Description of the Related Art
Several critical processing jobs require low latency scheduling; that is, they need to be dispatched as soon as they become runnable (for instance, when they wake up from a sleep or are interrupted on input/output (I/O) completion). Examples include heartbeat daemon threads, real-time streaming, and threads that manage devices with small data buffers. If such threads are not dispatched fast enough, the consequences can range from a system outage or a node going down to loss of data.
The general mechanism to support low latency scheduling is to increase the Unix priority of a thread so that the thread gets dispatched preferentially over other threads in the system, to put the thread on a global run queue serviced by multiple central processing units (CPUs), and so forth. Also, the thread usually pins all its text and memory to avoid page faulting. Despite these measures, there are several issues that can be, and often are, encountered in low latency scheduling.
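By way of illustration only, the general mechanism described above (raising the priority of a thread and pinning its memory) can be sketched as follows for a Linux system. The function names raise_priority and pin_memory are hypothetical, the choice of the SCHED_FIFO policy is an assumption rather than something the mechanism requires, and the flag values are the Linux values for mlockall; other platforms may differ.

```python
import ctypes
import ctypes.util
import os

# Linux values of the mlockall(2) flags from <sys/mman.h> (assumption: Linux).
MCL_CURRENT = 1  # lock pages that are currently mapped
MCL_FUTURE = 2   # also lock pages mapped later


def raise_priority() -> bool:
    """Try to give the calling thread the highest SCHED_FIFO priority.

    Returns False when the platform lacks the call or the caller is not
    permitted to change its scheduling policy.
    """
    if not (hasattr(os, "sched_setscheduler") and hasattr(os, "SCHED_FIFO")):
        return False
    try:
        top = os.sched_get_priority_max(os.SCHED_FIFO)
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(top))
        return True
    except OSError:
        return False


def pin_memory() -> bool:
    """Try to lock all current and future pages to avoid page faults."""
    libc_name = ctypes.util.find_library("c")
    if libc_name is None:
        return False
    libc = ctypes.CDLL(libc_name, use_errno=True)
    if not hasattr(libc, "mlockall"):
        return False
    return libc.mlockall(MCL_CURRENT | MCL_FUTURE) == 0
```

On Linux, both calls typically succeed only for a sufficiently privileged caller (for example, one holding CAP_SYS_NICE for the policy change, and CAP_IPC_LOCK or an adequate RLIMIT_MEMLOCK for the memory lock), which is why the sketch reports failure rather than raising.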
One such issue is that if no CPU is available, it can take up to 10 milliseconds, that is, until the next time slice interval, for a CPU to notice the new high-priority runnable job.
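The 10 millisecond bound can be illustrated with a small calculation. The sketch below assumes, purely for illustration, that busy CPUs check for new runnable work only at fixed time slice boundaries and that a thread waking exactly on a boundary has just missed it; the function name wait_until_next_tick is hypothetical.

```python
import math

TICK_MS = 10.0  # time slice interval quoted in the text


def wait_until_next_tick(wake_ms: float, tick_ms: float = TICK_MS) -> float:
    """Delay from wake-up until the next time slice boundary after it."""
    # Assumption: a wake-up landing exactly on a boundary waits a full slice.
    next_tick = (math.floor(wake_ms / tick_ms) + 1) * tick_ms
    return next_tick - wake_ms


print(wait_until_next_tick(12.5))  # 7.5
print(wait_until_next_tick(0.0))   # 10.0, the worst case
```

Under these assumptions the delay is always greater than zero and never exceeds one full time slice.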
Another such issue arises when an inter-processor interrupt (IPI) is sent to one or more CPUs. Sending an IPI to all CPUs each time a low latency thread becomes runnable can be expensive and can potentially cause scaling problems. Moreover, even after an IPI is sent, the receiving CPU(s) may still take time to respond, because each could be stuck in an interrupt-disabled critical section or under an interrupt storm and thus may not be immediately available for dispatching the low latency thread.
Yet another such issue, which is even worse than the preceding issues, arises when the low latency thread is picked by a CPU for dispatching, but the CPU then receives an interrupt and takes a while to resume the thread. Disabling interrupts on the current CPU each time the CPU wakes up is expensive (the real work takes longer to execute), and the thread may still get interrupted before it makes the call to disable interrupts.
Still another such issue is that it is not clear how to manage or choose between multiple low latency threads when they become runnable at the same time.
Thus, there is a need for improved low latency scheduling on simultaneous multi-threading (SMT) cores.