1. Technical Field
The present invention relates to the scheduling of jobs in a multitasking operating system.
2. Description of Related Art
Over the last decade, microarchitecture design innovations such as speculation and out-of-order superscalar execution have dramatically improved microprocessor efficiency and, accordingly, performance. One such innovation is the multi-core/multi-threaded processor, which integrates multiple cores on a single chip with different levels of sharing of chip resources such as the on-chip cache memory subsystem. In multi-core/multi-threaded processors, an operating system (OS) schedules multiple jobs/threads to run concurrently on the underlying platform of multiple logical processors, each job/thread on its own logical processor. To share the machine resources (specifically the cache system) efficiently among multiple jobs/threads, the OS job scheduler should be optimized.
Jobs/threads may share machine resources either constructively or destructively, depending on many factors. Choosing the best set of jobs to share chip resources is vital to machine performance and hardware resource utilization. Naive job scheduling results in inefficient resource allocation, poorly utilized machine resources and, as a result, sub-optimal machine throughput. The dominant concern in all proposed job scheduling schemes is to schedule the best candidate applications for using the underlying hardware. The measure of success for a job scheduler should take into account not only machine throughput but also the avoidance of job starvation. In reality, applications running concurrently are not equally important. Some applications, such as real-time applications, must finish their tasks by hard deadlines. OS designers have given users many priority levels with which to define the importance of their running jobs. In single-threaded processors, achieving the job/thread priorities is simple and can be done by time quantum sharing of the microprocessor. Achieving the same goal in multi-core/multi-threaded machines, on the other hand, is not trivial, given the significant level of sharing of machine resources.
One solution to this problem is described in Patent Application EP08172360.3 (Applicant's reference FR920080216), which takes an approach comparable to the one disclosed here but under different conditions and at a finer level of granularity. In that application, sharing of the fetch bandwidth of simultaneous multithreading (SMT) processors is controlled using a thread selection and scheduling method. However, that application assumes that only a limited number of jobs/threads are running on the computer system: the number must be less than or equal to the number of logical processors (hardware contexts) that the processor can execute simultaneously.
Modern OS kernels are built with time-sharing capabilities so that more jobs can be scheduled on the system than there are hardware contexts. The OS kernel is responsible for deciding when to context switch jobs/threads so that the hardware resources are shared among all the jobs/threads running on the system. Controlling fetch bandwidth utilization alone may therefore not always be enough to control sharing of the machine resources, since the OS decisions about which jobs/threads to schedule concurrently directly affect machine throughput (given the high level of resource sharing in multi-core/multi-threaded processors). Accordingly, OS job scheduling should use both machine performance/throughput measurements and software priority during its scheduling process.
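As a minimal illustration of time sharing when there are more jobs than hardware contexts, the following Python sketch rotates jobs through the available contexts in round-robin order. The function name `round_robin_schedule` and its parameters are hypothetical and are not taken from any particular OS kernel; a real scheduler would also weigh priorities and measured performance.

```python
from collections import deque


def round_robin_schedule(jobs, hardware_contexts, quanta):
    """Return, for each time quantum, the list of jobs that run together.

    Hypothetical illustration only: jobs are rotated in strict round-robin
    order and every job is assumed to be runnable at all times.
    """
    if hardware_contexts < 1:
        raise ValueError("at least one hardware context is required")
    queue = deque(jobs)
    schedule = []
    for _ in range(quanta):
        # Fill as many hardware contexts as there are waiting jobs.
        count = min(hardware_contexts, len(queue))
        running = [queue.popleft() for _ in range(count)]
        schedule.append(running)
        # Context switch: the jobs that just ran rejoin the end of the queue.
        queue.extend(running)
    return schedule


if __name__ == "__main__":
    for quantum, running in enumerate(
        round_robin_schedule(["A", "B", "C", "D", "E"], 2, 3)
    ):
        print(quantum, running)
```

With five jobs and two hardware contexts, the set of jobs that run together changes from one quantum to the next, so the co-scheduling decision itself determines which jobs compete for the shared resources.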
In the article entitled "Symbiotic Job scheduling with Priorities for a Simultaneous Multithreading Processor", Sigmetrics 2002, by Allan Snavely, Dean M. Tullsen, and Geoff Voelker [Allan], SMT processors can run in either single-threaded or multithreaded (SMT) mode. The time quantum for each application, defined by its priority, is divided into two parts: one in which the application runs with others in SMT mode, and a second in which the application runs alone to make up its time quantum. This method has at least two drawbacks. First, running an application in single-threaded mode on an SMT machine for a full context switch period may waste a significant share of the SMT machine resources that could be used by other threads/applications. A finer-grained mechanism is required to achieve both efficient machine utilization and application priority. Second, the mechanism described to reduce the waste of machine resources relies on SOS (Sample, Optimize, Symbios) job scheduling, which has scaling limitations.
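The first drawback can be made concrete with a simplified model, which is an assumption made here for illustration and is not the formulation used in the cited article. Suppose a job must achieve a fraction `target_share` of its stand-alone progress over a period, and that it progresses at a fraction `smt_rate` of its stand-alone speed when co-scheduled in SMT mode. The hypothetical helper below computes how the period would be split between SMT mode and single-threaded mode, and how much hardware context time sits idle during the single-threaded part.

```python
def split_quantum(period, target_share, smt_rate):
    """Split a period into (smt_time, solo_time) under a simplified model.

    Assumed model (illustrative only): progress = smt_time * smt_rate
    + solo_time, and the job must reach target_share * period.
    """
    if not 0.0 < smt_rate <= 1.0:
        raise ValueError("smt_rate must be in (0, 1]")
    if not 0.0 <= target_share <= 1.0:
        raise ValueError("target_share must be in [0, 1]")
    if target_share <= smt_rate:
        # SMT mode alone already meets (or exceeds) the target.
        return period, 0.0
    # Here smt_rate < target_share <= 1, so the denominator is positive.
    solo_time = period * (target_share - smt_rate) / (1.0 - smt_rate)
    return period - solo_time, solo_time


def idle_context_time(solo_time, hardware_contexts):
    """Context time left unused while one job runs alone."""
    return solo_time * (hardware_contexts - 1)


if __name__ == "__main__":
    smt_time, solo_time = split_quantum(100.0, 0.75, 0.5)
    print(smt_time, solo_time, idle_context_time(solo_time, 2))
```

Under these assumed numbers (a period of 100 time units, a target of 0.75 and an SMT-mode rate of 0.5 on a two-context machine), half of the period is spent in single-threaded mode, and the second hardware context is idle for that entire half.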
Another solution to the thread priority problem in SMT processors is presented in U.S. Pat. No. 6,658,447. In the referenced disclosure, thread hardware execution heuristics are scaled based on the OS priority of each application. The success of the technique depends heavily on finding an appropriate scaling function, and on that function being simple enough to implement in hardware so as to provide cycle-by-cycle feedback information, which is not guaranteed. Another drawback is that scaling may completely invert the hardware priority in favour of the OS priority, or vice versa. It could be argued that dynamically changing the scaling functions could compensate for that effect, but this would make the technique even harder to implement and run on a cycle-by-cycle basis. There is also no guarantee that high-priority threads will actually achieve their assigned time quanta for hard-deadline applications. In US2006/0184946, a method is implemented to strictly achieve the OS thread priority for two threads on a cycle-by-cycle basis, without any consideration of SMT machine performance or throughput.
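The sensitivity to the choice of scaling function can be sketched as follows. This is a hypothetical illustration and does not reproduce the mechanism of the cited patent: the per-thread hardware scores, the OS priority values, and the two scaling functions are all assumed for the example. Each thread's hardware score is multiplied by a function of its OS priority, and the thread with the highest scaled score is selected.

```python
def pick_thread(hw_scores, os_priorities, scale):
    """Select the thread with the highest scaled score.

    hw_scores: hypothetical hardware heuristic per thread (higher is better).
    os_priorities: OS priority per thread (higher is more important).
    scale: function mapping an OS priority to a multiplier.
    """
    scaled = {
        thread: hw_scores[thread] * scale(os_priorities[thread])
        for thread in hw_scores
    }
    return max(scaled, key=scaled.get)


def strong_scale(priority):
    # Multiplier grows linearly with the OS priority.
    return float(priority)


def weak_scale(priority):
    # Multiplier barely changes with the OS priority.
    return 1.0 + 0.1 * priority


if __name__ == "__main__":
    hw_scores = {"A": 0.9, "B": 0.3}   # hardware heuristic favours A
    os_priorities = {"A": 1, "B": 4}   # OS priority favours B
    print(pick_thread(hw_scores, os_priorities, strong_scale))
    print(pick_thread(hw_scores, os_priorities, weak_scale))
```

With the strong scaling function the OS priority overrides the hardware heuristic and thread B is selected; with the weak one the hardware heuristic dominates and thread A is selected regardless of its lower OS priority. Neither outcome balances the two criteria.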