The parallel execution of different jobs (or tasks) by a plurality of processors, generally provided with corresponding First-In First-Out (FIFO) queues, is a technique widely used to improve the performance of a computing system (for example, a network switch). Generally, the jobs are computing sub-activities that may be executed independently to implement larger, more complex computing activities (for example, processing of data frames of network messages); therefore, the jobs may be executed at any time by the processors (for example, cores of a microprocessor). Accordingly, the jobs may be submitted for execution to the processors so that they are executed simultaneously as far as possible. Typically, the jobs are distributed among the processors in an attempt to optimize use of the processors (for example, in terms of idle times and load balancing).
However, the parallel execution of the jobs does not allow control over the execution order of the jobs. Indeed, since the processors execute the jobs independently, a job that has been submitted after another job may nevertheless be executed before the other job (for example, when the processor to which the later job has been submitted is less busy).
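The reordering described above may be illustrated with a minimal sketch. It assumes, purely for illustration, two processors with FIFO queues and jobs that each take one time step to execute; the job names and the backlog are hypothetical.

```python
from collections import deque

# Hypothetical state: processor 0 already has a backlog, processor 1 is idle.
queues = [deque(["a", "b", "c"]), deque()]

queues[0].append("X")  # job X is submitted first, to the busy processor
queues[1].append("Y")  # job Y is submitted second, to the idle processor

# Assumption: every job takes exactly one time step, and in each step every
# processor executes the job at the head of its own FIFO queue.
completion_order = []
while any(queues):
    for q in queues:
        if q:
            completion_order.append(q.popleft())

print(completion_order)
```

In this sketch job Y completes before job X, although it has been submitted later, because each queue is served independently of the other.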
Classes of jobs (grouping jobs that may have execution dependencies among them) may be associated statically with the processors. However, this may cause a (random) load imbalance among the processors, since the jobs of each class are always submitted to the same corresponding processor, even if that processor is busy while other processors are idle.
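The imbalance caused by a static association may be sketched as follows. The number of processors, the modulo mapping and the sequence of job classes are all assumptions chosen for illustration only.

```python
from collections import Counter

NUM_PROCESSORS = 4  # hypothetical number of processors


def static_processor(job_class: int) -> int:
    """Fixed class-to-processor mapping (modulo is an illustrative choice)."""
    return job_class % NUM_PROCESSORS


# Hypothetical burst of jobs: classes 0 and 4 dominate, and both map to processor 0.
job_classes = [0, 4, 0, 4, 0, 1, 4, 0]

# Count how many jobs end up in the queue of each processor.
queue_lengths = Counter(static_processor(c) for c in job_classes)
for p in range(NUM_PROCESSORS):
    print(f"processor {p}: {queue_lengths[p]} queued jobs")
```

With this input, processor 0 receives seven of the eight jobs while processors 2 and 3 remain idle, since the mapping never takes the current workload into account.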
Conversely, it is possible to have more classes than processors and to assign the classes to the processors dynamically according to the current workload of the processors. For this purpose, a dispatcher (controlling the submission of the jobs to the processors) has to track the execution of the jobs so as to know, at any moment, the number of pending jobs of each class, i.e., the jobs that have been submitted for execution but are still waiting to be executed. In this way, when the number of pending jobs of a class is zero, each new job of that class may be submitted to any processor (to optimize load balancing); otherwise, each new job of the class has to be submitted to the same processor on which the pending jobs of the class are still waiting to be executed (to respect possible execution dependencies).
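The dynamic assignment with execution tracking described above may be sketched as follows. The class `Dispatcher`, its methods `submit` and `notify_completion`, and the choice of the shortest queue as the load-balancing criterion are hypothetical and serve only as an illustration under these assumptions.

```python
from collections import defaultdict


class Dispatcher:
    """Illustrative dispatcher that tracks the pending jobs of each class."""

    def __init__(self, num_processors: int):
        self.queue_len = [0] * num_processors  # jobs waiting on each processor
        self.pending = defaultdict(int)        # class -> number of pending jobs
        self.assigned = {}                     # class -> processor holding its pending jobs

    def submit(self, job_class) -> int:
        if self.pending[job_class] == 0:
            # No pending jobs of this class: any processor may be chosen
            # (assumption: pick the one with the shortest queue).
            proc = min(range(len(self.queue_len)), key=lambda p: self.queue_len[p])
            self.assigned[job_class] = proc
        else:
            # Pending jobs exist: keep the class on the same processor.
            proc = self.assigned[job_class]
        self.pending[job_class] += 1
        self.queue_len[proc] += 1
        return proc

    def notify_completion(self, job_class) -> None:
        # Called when a processor reports that a job of this class has completed.
        if self.pending[job_class] == 0:
            raise ValueError("no pending job for this class")
        self.pending[job_class] -= 1
        self.queue_len[self.assigned[job_class]] -= 1


dispatcher = Dispatcher(num_processors=2)
placements = [dispatcher.submit(c) for c in ("A", "A", "B")]
dispatcher.notify_completion("A")
dispatcher.notify_completion("A")
placements += [dispatcher.submit(c) for c in ("C", "C", "A")]
print(placements)
```

In this run the first two jobs of class A stay on the same processor; once both have completed, the next job of class A is free to go to the other, less loaded processor. Note that the sketch relies on `notify_completion` being called for every job.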
However, this execution tracking of the jobs may not be possible when the processors are unable to notify the dispatcher of the completion of the execution of the jobs (for example, when the queues of the processors are inherently unidirectional, such as when the jobs are executed by sending data frames onto a network).
In any case, even when the execution tracking is possible, the notifications of the completion of the execution of each job, sent by each processor to the dispatcher, generate a heavy exchange of information (between the processors and the dispatcher), which may result in significant inefficiency. Particularly, these notifications may cause a contention bottleneck at the dispatcher, with detrimental effects on its performance (and therefore on the performance of the whole computing system).
Moreover, the execution tracking of the jobs by the dispatcher requires the allocation of a corresponding storage area; this storage area may become relatively large as the number of classes and/or processors increases.