The invention relates to a host system, especially a host system having a real-time extension. Furthermore, the invention relates to a method for operating a host system. In addition, the invention relates to a program element and to a computer-readable medium.
In processor and computer hardware development, there is a general trend towards ever greater computing power. In the past, this was achieved in particular by increasing processor clock frequencies. Today, however, there are limits to raising the computing power of a processor through the clock frequency, especially because of the sharp increase in electrical power dissipation (heat generation). Increasing computing power by means of parallel processing (multicore systems) is becoming more economical. This applies not only to service systems but also to all other computer applications, e.g. industrial automation systems.
The demand for computing power increases continuously. Integrating motion control, programmable logic controllers (PLC) and human-machine interfaces (HMI) in one device is a suitable task for a multicore architecture. Virtualization technology provides a further impetus for using such systems.
In practice, the theoretical multiple of the computing power (n-fold in the case of n cores) can never be achieved, because when software, especially a real-time kernel, is moved from a single-core to a multi-core system, the old rules governing the performance of the overall system still apply:
The familiar rule “MIPS=k*memory bandwidth” ultimately means that, with a higher number of cores, the high memory bandwidth required calls for local L2 caches.
Local caches bring Amdahl's Law into play, which, in one formulation, states: if additional processors are used, the advantages (more operating cycles) increase at most linearly, whereas the costs (access conflicts, serialization, etc.) increase quadratically. The performance of the system thus behaves as C(n) = a×n − b×n², where n is the number of processors or physical cores; local caches unavoidably mean a larger factor b. This applies especially to the distribution of a real-time solution over a number of cores, as is necessary in real-time extensions. When central shared data such as lists and queues (especially doubly linked queues, e.g. for thread or timer management) are distributed, a cost must be reckoned with even when no access conflicts occur (the non-contention case), and this in two respects.
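The behavior of C(n) = a×n − b×n² can be illustrated with a short sketch. The function names and the sample values for a and b are assumptions chosen for illustration only; they do not come from any measured system.

```c
/* Modelled throughput C(n) = a*n - b*n^2 for n cores.
 * a: gain per additional core, b: contention cost factor.
 * Both parameters are hypothetical illustration values. */
double model_throughput(double a, double b, int n)
{
    return a * n - b * (double)n * (double)n;
}

/* Scan 1..max_cores and return the core count with the highest
 * modelled throughput (the smallest such count wins on ties). */
int best_core_count(double a, double b, int max_cores)
{
    int best_n = 1;
    double best = model_throughput(a, b, 1);

    for (int n = 2; n <= max_cores; ++n) {
        double c = model_throughput(a, b, n);
        if (c > best) {
            best = c;
            best_n = n;
        }
    }
    return best_n;
}
```

With the assumed values a = 1 and b = 0.05, the model peaks at n = a/(2b) = 10 cores and drops to zero at n = a/b = 20 cores, so a larger factor b moves the point of diminishing returns towards fewer cores.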
On the one hand, a spin lock (process synchronization, protects jointly used resources against modifying access) must be used as protection, which ultimately leads to an atomic RMW (read-modify-write) command. This is a very expensive operation with respect to the performance of the system because the cache must be locked. If an RFO (read for ownership) cycle is added as well, the negative influence on performance becomes even greater.
On the other hand, changes in the delay invalidate the corresponding information in all other L2 caches and thus ultimately lead to cache misses, which have a very strong influence on performance.
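The cost described above can be made concrete with a minimal test-and-set spin lock, sketched here with C11 atomics. The type and function names (rt_spinlock, rt_spin_lock and so on) are hypothetical and are not taken from Xenomai, RTX or any other real-time extension. Every acquisition attempt, including the uncontended one, executes one atomic read-modify-write on the lock word.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Minimal test-and-set spin lock (illustrative names, not a real API). */
typedef struct {
    atomic_flag locked;
} rt_spinlock;

#define RT_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* One acquisition attempt: the test-and-set is a single atomic
 * read-modify-write, paid even when nobody else holds the lock. */
bool rt_spin_trylock(rt_spinlock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->locked,
                                              memory_order_acquire);
}

/* Busy-wait until the lock is acquired. */
void rt_spin_lock(rt_spinlock *l)
{
    while (!rt_spin_trylock(l)) {
        /* spin: each retry is another atomic RMW on the same cache line */
    }
}

/* Release the lock so that another core can acquire it. */
void rt_spin_unlock(rt_spinlock *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

Because the lock word is written on every acquisition and release, the cache line holding it moves between the cores that use the lock, which is the second cost named above.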
Various solutions exist for real-time extensions. One known solution is Xenomai, a real-time extension for Linux; another is IntervalZero RTX®, a real-time extension for Windows.
Although Xenomai, for example, enables real-time processing to be distributed over a number of cores, this solution is subject to the points described above, which lead to non-optimal performance. Such a real-time extension is described diagrammatically below with reference to FIG. 2, using the example of a standard kernel in a multicore environment.
FIG. 2 shows a host system or real-time system 200 which has a plurality of physical cores 201 and 202. Between the cores 201 and 202, the possibility of cross-core notifications is indicated diagrammatically by means of a double arrow 203. These notifications are used for providing signaling paths in which the transmitter and receiver entities are located in different physical cores or processors. For this purpose, an inter-processor interrupt (IPI) is sent from one core to the other so that the function to be executed is executed by the other core by proxy.
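The proxy execution via cross-core notification can be sketched as a per-core mailbox. This is a single-threaded simulation under stated assumptions: ipi_post, ipi_handle, the mailbox layout and the example function increment_counter are all hypothetical, and a real kernel would additionally raise the hardware IPI and synchronize access to the mailbox.

```c
#include <stddef.h>

#define NUM_CORES     2
#define MAILBOX_SLOTS 8

typedef void (*remote_fn)(void *arg);

struct ipi_request {
    remote_fn fn;   /* function to run on the target core */
    void *arg;      /* argument passed to fn */
};

struct core_mailbox {
    struct ipi_request slots[MAILBOX_SLOTS];
    size_t count;
};

static struct core_mailbox mailboxes[NUM_CORES];

/* Sending side: queue a call for the target core. In a real system the
 * sender would now trigger the IPI. Returns 0 on success, -1 on error. */
int ipi_post(int target_core, remote_fn fn, void *arg)
{
    if (target_core < 0 || target_core >= NUM_CORES || fn == NULL)
        return -1;

    struct core_mailbox *mb = &mailboxes[target_core];
    if (mb->count == MAILBOX_SLOTS)
        return -1;

    mb->slots[mb->count].fn = fn;
    mb->slots[mb->count].arg = arg;
    mb->count++;
    return 0;
}

/* Receiving side: stands in for the IPI handler of `core` and runs all
 * queued calls in FIFO order. Returns the number of calls executed. */
size_t ipi_handle(int core)
{
    if (core < 0 || core >= NUM_CORES)
        return 0;

    struct core_mailbox *mb = &mailboxes[core];
    size_t executed = 0;

    for (size_t i = 0; i < mb->count; ++i) {
        mb->slots[i].fn(mb->slots[i].arg);
        ++executed;
    }
    mb->count = 0;
    return executed;
}

/* Example payload: increments the int that arg points to. */
void increment_counter(void *arg)
{
    ++*(int *)arg;
}
```

Calling ipi_post(1, increment_counter, &x) leaves x unchanged until ipi_handle(1) runs, which mirrors the fact that the requested function is executed by the receiving core rather than by the sender.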
Block 204 shows diagrammatically an operating system which provides a standard kernel having a real-time extension 205 integrated therein. The integrated real-time extension manages the global resources of the real-time system by means of a central accounting system. In this context, access to the internal global data structures is synchronized using spin locks, which are shown diagrammatically as global lock 206 in FIG. 2.
Furthermore, the operating system 204 provides real-time timers 207 and 208 for a plurality of real-time threads 209, 210, 211 and 212 which belong to a real-time application 213.
In order to meet fundamental real-time requirements, known multicore systems do not use a central timer chip. The minimum is core-specific timer management, as is also implemented in Xenomai. Furthermore, it can be ensured, as for example in Xenomai, that when a thread migrates, a pending timer request migrates with it. As a result, when a timer is triggered, the real-time thread to be woken runs on the same core that processed the timer interrupt.
To date, however, no solutions are known which, for a real-time extension, ensure through an optimum distribution of the software that a real-time solution distributed over a number of cores achieves maximum performance gain and minimum latency.
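Core-specific timer management with timer migration can be sketched as one timer list per core. All structure and function names below are hypothetical and do not reflect Xenomai's actual implementation; the sketch also omits locking and validation of core numbers.

```c
#include <stddef.h>

#define NUM_CORES 2

struct rt_thread;

struct rt_timer {
    unsigned long expiry;       /* absolute expiry tick */
    struct rt_thread *owner;    /* thread to wake on expiry */
    struct rt_timer *next;      /* link in the per-core list */
};

struct rt_thread {
    int core;                   /* core the thread currently runs on */
    int runnable;               /* 1 once the thread has been woken */
    int timer_armed;            /* 1 while the timer is queued */
    struct rt_timer timer;
};

static struct rt_timer *core_timers[NUM_CORES];   /* one list per core */

/* Remove t from the timer list of `core`, if it is queued there. */
static void timer_unlink(int core, struct rt_timer *t)
{
    struct rt_timer **pp = &core_timers[core];

    while (*pp != NULL && *pp != t)
        pp = &(*pp)->next;
    if (*pp != NULL)
        *pp = t->next;
    t->next = NULL;
}

/* Arm the thread's timer on the core the thread currently runs on. */
void thread_arm_timer(struct rt_thread *th, unsigned long expiry)
{
    if (th->timer_armed)
        timer_unlink(th->core, &th->timer);
    th->timer.expiry = expiry;
    th->timer.owner = th;
    th->timer.next = core_timers[th->core];
    core_timers[th->core] = &th->timer;
    th->timer_armed = 1;
    th->runnable = 0;
}

/* Migrate the thread; a pending timer request moves with it. */
void thread_migrate(struct rt_thread *th, int new_core)
{
    if (th->timer_armed) {
        timer_unlink(th->core, &th->timer);
        th->timer.next = core_timers[new_core];
        core_timers[new_core] = &th->timer;
    }
    th->core = new_core;
}

/* Timer interrupt on `core` at tick `now`: wake the owners of all
 * expired timers on that core. Returns the number of threads woken. */
int core_timer_interrupt(int core, unsigned long now)
{
    int woken = 0;
    struct rt_timer **pp = &core_timers[core];

    while (*pp != NULL) {
        struct rt_timer *t = *pp;
        if (t->expiry <= now) {
            *pp = t->next;
            t->next = NULL;
            t->owner->timer_armed = 0;
            t->owner->runnable = 1;
            ++woken;
        } else {
            pp = &t->next;
        }
    }
    return woken;
}
```

After a thread arms a timer on core 0 and is migrated to core 1, only the timer interrupt on core 1 wakes it, so the woken thread and the interrupt handler share a core.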