Processors, particularly those employed in embedded applications, often multi-task among a number of cooperative software tasks that interact with specialized hardware operating in parallel. Because the results of one task or hardware unit enable processing in other tasks or hardware units, effective scheduling to deliver the best throughput and latency has to take into account the real-time progress that each software task or simultaneously operating hardware unit is making. A key to delivering this effective scheduling is to dynamically track the status of each task and to use that information to continuously reassess the priority of each of the pending tasks. In relation to the foregoing and for purposes of describing the present invention, the term “thread” is intended to mean a program that can execute independently of other programs within the system.
Referring now to FIG. 1, there are shown three threads, T1, T2, and T3. T1 produces data that is fed into thread T2. A queue, Q12, is used to store the output data from T1, so that the threads T1 and T2 need not operate in exact lock step. Similarly, the thread T2 produces data consumed by thread T3, and this data is stored temporarily in queue Q23. In the example shown in FIG. 1, it is assumed that the data in each queue takes up space that is only freed when the data is consumed. An effective thread scheduler may base its scheduling decisions on the amount of data in each queue. When a predetermined amount of data is held in a particular queue, it is time to schedule the consumer of that data. Conversely, when there is little data left in a queue, it is time to schedule the producer of data for that particular queue.
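The queue-driven decision described above can be illustrated with a minimal sketch. The thresholds HIGH_WATER and LOW_WATER, the function name pick_thread, and the order in which the conditions are tested are all assumptions made for illustration; the text specifies only "a predetermined amount" and "little data".

```python
from collections import deque

# Hypothetical thresholds (not taken from the source text).
HIGH_WATER = 6   # at or above this many items, schedule the queue's consumer
LOW_WATER = 2    # at or below this many items, schedule the queue's producer


def pick_thread(q12, q23):
    """Return the name of the thread to run next, based on queue occupancy."""
    # Assumed policy: drain the downstream queue first so upstream output has room.
    if len(q23) >= HIGH_WATER:
        return "T3"   # T3 consumes Q23
    if len(q12) >= HIGH_WATER:
        return "T2"   # T2 consumes Q12 and produces into Q23
    if len(q12) <= LOW_WATER:
        return "T1"   # T1 produces into Q12
    if len(q23) <= LOW_WATER:
        return "T2"   # Q23 is running low and Q12 has data for T2 to process
    return "T1"       # nothing urgent: default to the head of the pipeline
```

For example, with Q12 holding one item and Q23 holding seven, this sketch selects T3, since Q23 has reached the assumed high-water mark.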
In a typical processor, a large number of threads will be present, with some of these threads actually working together with specialized, simultaneously operating hardware to implement various functions. In the example shown in FIG. 1, the thread T3 may be the device driver for an Ethernet device that consumes data from the queue Q23. Additionally, the thread T2 may be enabled by the completion of DMA performed by specialized DMA hardware that is programmed by thread T1.
This embodiment is shown in FIG. 2, with the elements T1, T2, T3, Q12, and Q23 having the same meaning as in FIG. 1. Accordingly, there is parallelism in the overall system even while software threads share processor power in a multi-tasking fashion. In order to obtain the maximal throughput, the software threads should be scheduled in a way that ensures that the parallel hardware devices are continually fed work.
Traditional scheduling, using fixed prioritized interrupts of the main processor and software based thread scheduling, is unable to achieve this flexibility. Hardware interrupt scheduling is typically performed using fixed priority as determined by the interrupt source. In the example shown in FIG. 1 and FIG. 2, however, whether an interrupt triggered by the DMA hardware or an interrupt triggered by the Ethernet device should have higher priority may depend, in any given instance, on the status of the queues Q12 and Q23.
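To make the contrast concrete, the following sketch compares a fixed, source-based priority with one that depends on queue status. The priority values, the function names, and the urgency rule (DMA urgency tracks how full Q12 is, Ethernet urgency tracks how full Q23 is) are hypothetical choices for illustration only, not a scheme stated in the text.

```python
# Fixed scheme: priority is tied to the interrupt source and never changes.
FIXED_PRIORITY = {"dma": 2, "ethernet": 1}   # hypothetical values, higher wins


def fixed_winner(pending):
    """Pick the pending interrupt with the highest fixed priority."""
    return max(pending, key=FIXED_PRIORITY.__getitem__)


def dynamic_winner(pending, q12_len, q23_len, capacity):
    """Pick the pending interrupt whose associated queue is fullest.

    Assumed mapping: the DMA interrupt enables T2 (consumer of Q12) and the
    Ethernet interrupt enables T3 (consumer of Q23).
    """
    urgency = {
        "dma": q12_len / capacity,
        "ethernet": q23_len / capacity,
    }
    return max(pending, key=urgency.__getitem__)
```

With both interrupts pending, fixed_winner always returns "dma" under these assumed values, whereas dynamic_winner returns "ethernet" when Q23 is nearly full and Q12 is nearly empty.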
A typical prior art software thread scheduler, by contrast, while capable of more sophisticated scheduling decisions, runs the risk of incurring high overhead in evaluating the desired scheduling functions. In the interest of efficiency, most software thread schedulers employ several prioritized thread/task queues, and decide at the time a thread/task is suspended which queue it is placed into. Note that placing a thread into a scheduling queue is conventional terminology for putting data that identifies the thread into the queue. The entity that decides what thread to run will examine the queues, and use the information stored therein to cause the relevant thread to run.
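A minimal sketch of such a multi-queue software scheduler follows. The class name MultiQueueScheduler and its methods are hypothetical, and real schedulers differ in many details; the sketch shows only that a thread identifier is placed into a prioritized queue when the thread is suspended and is later picked by scanning the queues in priority order.

```python
from collections import deque


class MultiQueueScheduler:
    """Hypothetical scheduler with several prioritized ready queues."""

    def __init__(self, levels):
        # Index 0 is the highest priority level.
        self.queues = [deque() for _ in range(levels)]

    def suspend(self, thread_id, level):
        # The queue is chosen once, at the time the thread is suspended.
        # Only data identifying the thread is stored, not the thread itself.
        self.queues[level].append(thread_id)

    def next_thread(self):
        # Scan from highest to lowest priority; first-in, first-out within a level.
        for queue in self.queues:
            if queue:
                return queue.popleft()
        return None   # no runnable thread
```

Because the level is fixed when suspend is called, a later change in system status is not reflected until the thread is suspended again.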
The typical software thread scheduler does not deliver all of the functions desired; in particular, it is unable to dynamically prioritize interrupts in order to decide whether to interrupt the current thread. The only way to “fake” this is to periodically interrupt the executing thread and re-evaluate the scheduling decision. When the quantum of work scheduled is small relative to the cost of an interrupt, this is not a feasible solution.
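The cost argument can be illustrated with simple arithmetic. The sketch below assumes a simplified model in which every quantum of useful work is followed by one periodic interrupt of fixed cost; the function name and the cycle counts in the usage note are hypothetical and are not measurements.

```python
def overhead_fraction(interrupt_cost, quantum):
    """Fraction of processor time spent on periodic interrupts.

    Assumed model: each quantum of useful work (in cycles) is followed by
    one interrupt costing interrupt_cost cycles.
    """
    return interrupt_cost / (interrupt_cost + quantum)
```

Under this model, an interrupt cost of 10 cycles against a quantum of 90 cycles loses one tenth of the processor time, while an interrupt cost of 200 cycles against a quantum of 50 cycles loses four fifths of it.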