Computer operating systems, in particular real-time operating systems (RTOS), are known which provide task-based data processing. A task is a work element of a computer processing job. A task can represent a single activity to be processed by a processor of a computer or computer system, or it can be a process comprising a plurality of sub-processes to be processed in the computer or computer system. A task can therefore also be called a thread or a process.
In order to be processed, tasks are scheduled by the operating system. The term ‘scheduling a task’ as used in the following comprises the meaning of ‘calling for direct processing’ as well as ‘planning a task for processing’, e.g. by means of a queue or a task list. In the latter case the task might have to wait for processing until other tasks that are scheduled for earlier processing, e.g. due to their higher priority, have been processed. In the latter case it is furthermore possible to re-schedule a task, i.e. to arrange an already scheduled task for earlier or later processing.
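The planning variant of scheduling, including re-scheduling, can be illustrated by a minimal sketch. The class name TaskQueue, its methods, and the convention that a lower number means a higher priority are assumptions made for illustration only; they are not taken from any particular operating system.

```python
import heapq
import itertools


class TaskQueue:
    """Toy task list. Assumption: a lower number means a higher priority."""

    def __init__(self):
        self._heap = []                    # entries: [priority, sequence, name]
        self._entries = {}                 # name -> live heap entry
        self._counter = itertools.count()  # keeps insertion order for equal priorities

    def schedule(self, name, priority):
        """Plan a task for processing; re-schedules it if already planned."""
        if name in self._entries:
            # Re-scheduling: mark the old entry as cancelled instead of
            # searching the heap for it.
            self._entries[name][2] = None
        entry = [priority, next(self._counter), name]
        self._entries[name] = entry
        heapq.heappush(self._heap, entry)

    def next_task(self):
        """Return the task to be processed next, or None if nothing is planned."""
        while self._heap:
            _, _, name = heapq.heappop(self._heap)
            if name is not None:           # skip cancelled entries
                del self._entries[name]
                return name
        return None
```

In this sketch, calling schedule() a second time for the same task name moves the task to an earlier or later position in the task list, which corresponds to the re-scheduling described above.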
Real-time systems in particular provide for task supervision, i.e. a monitoring of the scheduled tasks, in order to detect failures. In existing systems, a known task supervision is performed by means of a highly prioritised supervising task. Messages, called supervision messages, are sent from the supervised tasks to the supervising task. The supervising task monitors dedicated timeouts for expected supervision messages. In the case of a timeout, the supervising task knows that something has gone wrong and can take appropriate actions.
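The timeout-based supervision described above can be sketched as follows. The class Supervisor and its method names are hypothetical, and time is passed in explicitly as a number so that the example stays deterministic; a real supervising task would typically read a system clock instead.

```python
class Supervisor:
    """Toy supervising task that tracks one timeout per supervised task."""

    def __init__(self):
        self._timeouts = {}   # task name -> maximum allowed silence
        self._last_seen = {}  # task name -> time of last supervision message

    def register(self, task, timeout, now):
        """Start supervising a task with a dedicated timeout."""
        self._timeouts[task] = timeout
        self._last_seen[task] = now

    def supervision_message(self, task, now):
        """Record a supervision message received from a supervised task."""
        self._last_seen[task] = now

    def check(self, now):
        """Return the tasks whose supervision message is overdue."""
        return sorted(
            task
            for task, timeout in self._timeouts.items()
            if now - self._last_seen[task] > timeout
        )
```

Note that check() only reports which tasks have been silent for too long. It gives no indication of why the supervision message is missing.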
Common supervisor hierarchies are illustrated in FIG. 1 and FIG. 2. FIG. 1 shows a central supervisor SV 100 monitoring the tasks ‘task 1’ 110, ‘task 2’ 120 to ‘task n’ 130. FIG. 2 shows a supervision tree comprising at the root a supervisor SV 200 connected to medium-layered supervisors SV1 210, SV2 220 and SV3 230. The tasks ‘task 1’ 240, ‘task 2’ 250 and ‘task 3’ 260 send their supervision messages to the supervisor SV1 210. Supervisor SV2 220 receives supervision messages from ‘task 4’ 270 and ‘task 5’, and supervisor SV3 230 gets supervision messages from ‘task 6’ 290 to ‘task n’ 295. Each medium-layered supervisor 210, 220, 230 sends supervision messages to the root supervisor SV 200.
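The supervision tree of FIG. 2 can be written down as a simple parent table. The following sketch is illustrative only: the names PARENT and report_path are hypothetical, and ‘task 6’ stands in for the whole range ‘task 6’ to ‘task n’.

```python
# Each node maps to the supervisor that receives its supervision messages.
PARENT = {
    "SV1": "SV", "SV2": "SV", "SV3": "SV",
    "task 1": "SV1", "task 2": "SV1", "task 3": "SV1",
    "task 4": "SV2", "task 5": "SV2",
    "task 6": "SV3",
}


def report_path(node):
    """Return the chain of supervisors from a node up to the root."""
    path = [node]
    while node in PARENT:
        node = PARENT[node]
        path.append(node)
    return path
```

For example, report_path("task 4") yields the chain from ‘task 4’ via SV2 to the root supervisor SV. The central arrangement of FIG. 1 corresponds to a table in which every task maps directly to SV.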
Tasks can have different priorities. In real-time systems, the scheduling is often done according to priorities only, in order to meet the real-time requirements.
Usually, monitoring of the task scheduling is necessary because several problems can appear: A task can become unavailable to the system due to a crash of this task. A task can become unavailable to the system due to an endless loop in it. A task can become unavailable to the system due to a deadlock, i.e. a blocking situation that can appear, e.g., when tasks compete for a limited system resource. Tasks can become unavailable to the system due to a livelock. A livelock is either an endless loop of one task, or a faulty, never-ending communication between two or more tasks. As a result, other tasks are unable to get processed by the processor (CPU), or the faulty tasks fail to trigger other tasks, so that these other tasks are no longer executed.
Low prioritised tasks can become unavailable to the system due to overload caused by one or several higher prioritised tasks. This might actually happen due to shortcomings in the design of the task processing system, because in a proper design a task will not prevent other important, but lower prioritised, tasks from being executed, even in overload situations.
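The effect of overload under scheduling by priorities only can be shown with a small simulation. The function run_priority_only, its tick-based model and the convention that a lower number means a higher priority are assumptions of this sketch and do not describe any specific scheduler.

```python
def run_priority_only(tasks, ticks):
    """Simulate scheduling by priority only.

    tasks maps a task name to (priority, is_ready), where is_ready(tick)
    tells whether the task wants the processor in that tick.
    Assumption: a lower number means a higher priority.
    Returns how many ticks each task was actually processed.
    """
    runs = {name: 0 for name in tasks}
    for tick in range(ticks):
        ready = [
            (priority, name)
            for name, (priority, is_ready) in tasks.items()
            if is_ready(tick)
        ]
        if ready:
            _, name = min(ready)  # the best prioritised ready task wins
            runs[name] += 1
    return runs
```

If a higher prioritised task is ready in every tick, a lower prioritised task that is also always ready is never processed in this model. If the higher prioritised task is ready only in every second tick, the lower prioritised task gets the remaining ticks.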
Usually not all tasks in a system are part of the application. Instead, some are part of a system's platform. Usually, the source code of such platform tasks cannot be changed.
With supervision messages it is hard to find out what exactly went wrong in the case of a failure, i.e. which task or tasks failed. If tasks become unavailable due to a livelock or overload as described, a supervisor monitoring the task scheduling by means of supervision messages can detect only that one of the lower prioritised tasks is dead, but cannot determine that there is a livelock or overload. It is, however, important for the supervisor to detect what went wrong, in order to start appropriate actions to solve the problem.
As a further disadvantage, it is impossible to include platform tasks in a supervision hierarchy as described with reference to FIGS. 1 and 2, unless the platform offers certain interfaces, which might cause extra costs or which might be unwanted, in particular in security-related systems.
Takashi et al. (“Trace visualization and analysis tool for supervisory control systems”, Systems, Man and Cybernetics, 2000 IEEE International Conference, ISBN 0-7803-6583-6, pp. 1198-1203) refer to a task-based visualisation and analysis tool for supervisory control systems. Event patterns that appear repeatedly in traces, such as periodic events, are identified. An identified event pattern is represented in a visualised trace diagram by means of an assigned phrase instead of the plurality of events that build the pattern. A corresponding visualisation of traces reduces the amount of information displayed to a supervising user.
Therefore, it is an object of the invention to provide a method, device and computer program for improved task supervision in a task-based data processing system.