Processes of an operating system can communicate with one another by passing messages. In such a message passing system, processes send one another messages that contain commands and data. Often, a message takes the form of a command, i.e. an operation that needs to be executed by the operating system. Exemplary commands include “read(filename)” and “write(filename).” When a message includes a command to read a file having a particular file name, the operating system gathers resources, such as access to memory and access to disk drives, to perform the read operation. This resource gathering is called a “load” phase. Then, after the load phase, a “modify” phase completes the read operation associated with the message. For the read operation, the modify phase completes the operation either by providing the data to the requestor or by returning an indication that the data could not be found.
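The two phases described above can be illustrated with a minimal sketch. It is not taken from any particular operating system: the names `Message`, `ResourceUnavailable`, `FAKE_DISK`, `load`, `modify`, and `process` are hypothetical, and a dictionary stands in for a disk drive.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class Message:
    command: str   # e.g. "read"
    filename: str


class ResourceUnavailable(Exception):
    """Raised when the load phase cannot gather a needed resource."""


# Hypothetical in-memory stand-in for a disk drive.
FAKE_DISK: Dict[str, str] = {"report.txt": "quarterly numbers"}


def load(message: Message, disk_available: bool = True) -> Dict[str, str]:
    """Load phase: gather the resources the operation needs."""
    if not disk_available:
        raise ResourceUnavailable(message.filename)
    return FAKE_DISK


def modify(message: Message, disk: Dict[str, str]) -> Optional[str]:
    """Modify phase: complete the operation with the gathered resources."""
    if message.command == "read":
        # None indicates that the data could not be found.
        return disk.get(message.filename)
    raise ValueError(f"unsupported command: {message.command}")


def process(message: Message) -> Optional[str]:
    """Run the load phase, then the modify phase, for one message."""
    disk = load(message)
    return modify(message, disk)
```

In this sketch, a read of an existing file returns its data, a read of a missing file returns `None`, and a failure to gather resources surfaces as an exception during the load phase, before the modify phase is ever reached.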
However, during the load phase, the resources used to complete the operation associated with a message may not be available for some undesirable amount of time. Resource unavailability can occur because of resource contention, such as when other messages are being processed, or because of hardware failure. For example, if a needed resource is access to memory, then a memory hardware problem may cause resource unavailability. Further, if a message is waiting for resources and cannot proceed to the modify phase, then the overall performance of the operating system may decrease. For example, if a requestor requests read access to a file that is never provided because resources are not available during the load phase, then the lack of a response by the operating system to the requestor may cause dissatisfaction with the operating system. In yet another example, messages that require responses within a fixed time period during the execution of a time critical application, such as a stock trading application, but that do not receive responses within that period, may cause dissatisfaction among the users of the stock trading application.
One solution for determining whether messages are waiting for resources is to provide a list of waiting messages to a requestor, e.g. an administrative user. For example, the operating system can provide the list of waiting messages when queried by the administrative user. However, such queries are not efficient during the real-time operation of the operating system because messages are rapidly processed. Specifically, a message may be waiting for a resource for five seconds, but the message may receive access to the resource in the sixth second. Thus, having the administrative user check the list of waiting messages every second is inefficient. Another solution is for the administrative user to take the operating system offline and to use a debugger to track waiting messages from inception to completion. However, this requires skill and knowledge in tracking messages to determine the cause of waiting messages, and it also causes operating system downtime.
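The inefficiency of repeated queries can be seen in a small simulation, offered only as a sketch. The names `WaitingMessage`, `waiting_at`, and `poll` are hypothetical, and time is a simulated integer clock in seconds rather than a real system clock.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class WaitingMessage:
    message_id: int
    wait_started: int  # simulated second at which waiting began
    wait_ended: int    # simulated second at which the resource was granted


def waiting_at(messages: List[WaitingMessage], now: int) -> List[int]:
    """Return the ids of messages still waiting for a resource at `now`."""
    return [m.message_id for m in messages
            if m.wait_started <= now < m.wait_ended]


def poll(messages: List[WaitingMessage], start: int, stop: int,
         interval: int) -> Dict[int, List[int]]:
    """Query the waiting list at fixed intervals, as an administrator might."""
    snapshots = {}
    for t in range(start, stop + 1, interval):
        snapshots[t] = waiting_at(messages, t)
    return snapshots


# One message waits for five seconds and is granted its resource afterwards.
snapshots = poll([WaitingMessage(1, wait_started=0, wait_ended=5)],
                 start=0, stop=6, interval=1)
```

Here the first five queries each report the same waiting message, and the later queries report nothing, so most of the manual queries yield no new information about a wait that resolves on its own.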
Accordingly, what is needed is an automated system to proactively determine which messages are waiting for resources for undesirable amounts of time during the real-time operation of the operating system, while providing a method of mitigating delays caused by the waiting messages.