The present invention relates generally to computer programming and more particularly to a system and method for tracking events handled by an operating system kernel.
As operating systems have become more sophisticated, and as the environments with which operating systems operate have become more complex, it has become important to monitor the actions taken by an operating system in response to events the operating system is tasked to handle. In particular, visibility into the internal workings of an operating system kernel can be important to activities including, but not limited to, developing, debugging, and/or diagnosing an operating system. For example, tracking the number and type of events that an operating system handles can be important to diagnosing system problems, bottlenecks, malfunctioning equipment and/or component, and capacity problems.
Conventionally, monitoring operating system events has been difficult to achieve. Even if possible to log events handled by an operating system, such logging negatively impacted the ability of the operating system to handle events. For example, when an operating system experienced a problem, support personnel may have considered monitoring the events handled by the operating system to diagnose the problem. But the processing required to log an event may have taken more time to perform than handling the event, and thus the operating system could not be monitored under conditions that produced the problem, thus reducing the relevance of the monitoring. Similarly, when testing an operating system, test conditions that stress the operating system to a point where bugs appear can overwhelm a conventional monitor, and events may not be logged or requests may not be handled and thus the problem may remain undiagnosed. Further, contemporaneously logging interrupt events and non-interrupt events frequently lead to interrupt data interfering with non-interrupt data, and thus accurate monitoring was difficult. Thus, the monitoring may not have been performed and operating system improvements may not have been achieved.
Even if operating system monitoring is performed, it can be difficult to understand complex interactions between entities including, but not limited to, threads, processes, mutexs, locks, events, device drivers and applications. The value of operating system monitoring can be limited if the data produced during such monitoring is difficult to understand.
Thus there remains a need for a minimally intrusive system and method to accurately monitor events handled by an operating system, where the data produced by such monitoring is viewable and facilitates understanding the running of the operating system.
The following presents a simplified summary of the invention in order to provide a basic understanding of some aspects of the invention. This summary is not an extensive overview of the invention. It is not intended to identify key/critical elements of the invention or to delineate the scope of the invention. Its sole purpose is to present some concepts of the invention in a simplified form as a prelude to the more detailed description that is presented later.
The present invention relates to a system and method for logging events processed by an operating system. The events can include both interrupt and non-interrupt data. When an event is logged, the present invention stores data associated with interrupts in an interrupt buffer and stores data associated with non-interrupt events in a non-interrupt buffer. By writing small amounts of data to separate memories and by not attempting to transport data concerning the event to other processes (e.g. data viewer, disk log, network) during the event handling process, the logging of the events is less intrusive, and less likely to corrupt nearly simultaneous events, thus mitigating problems associated with conventional monitoring systems.
Both the interrupt buffer and the non-interrupt buffer can be flushed to a secondary buffer during non-event handling time. The interrupt buffer can be flushed upon completion of the interrupt handling process and the non-interrupt buffer can be flushed upon completion of the execution of the event-handler and/or upon the non-interrupt buffer becoming substantially full. One or more processes (e.g. a viewer) can access the secondary buffer. The secondary buffer can also be flushed to one or more devices and/or processes (e.g. a viewer, data communication devices, disk) when the secondary buffer becomes substantially full and/or when a timeout condition occurs.
In one aspect of the present invention, the interrupt buffer, the non-interrupt buffer and the secondary buffer are located in memory associated with the computer running the operating system being monitored. In an alternative aspect of the present invention, the interrupt buffer, the non-interrupt buffer and the secondary buffer are located in memory associated with a separate hardware probe. When the buffers are located in memory associated with computer running the operating system being monitored, flushing the buffers may be accomplished under software control. But when the buffers are located in memory associated with a separate hardware probe, then flushing the buffers may be accomplished under hardware and/or software control.
By logging operating system kernel events, visibility into the internals of the operating system kernel is achieved. Such visibility facilitates actions including, but not limited to, debugging, programming and maintaining operating systems, hardware components, software components and/or applications. By employing the triple buffering method described above, acquiring such visibility is less intrusive to the operating system being monitored. Initially storing the data associated with interrupt events and the data associated with non-interrupt events in separate buffers mitigates problems associated with interrupt data corrupting non-interrupt data. Subsequently merging the interrupt data and the non-interrupt data during non-event handling time reduces the impact of the monitor on the operating system event handling performance and facilitates chronologically ordering the interrupt data and non-interrupt data in the secondary buffer. To facilitate such chronological ordering, when the flushing of the buffers is under software control, a time stamp can be associated with events as they are written to the buffers. The time stamp can then be used to arrange events in chronological order, and to ascertain the amount of time spent processing one or more events.
Monitoring an operating system may be undertaken for many reasons. For example, an operating system can be monitored to determine the effect of adding a new piece of hardware, the effect of adding a new software component, and/or to determine the amount of operating system resources being allocated to handle certain events. Thus, a plurality of events can be logged by the present invention to facilitate customizing monitoring an operating system. A user interface provides means for an administrator to select events that are to be logged by the monitor. An Application Programming Interface (API) similarly provides means for processes to select events that are to be logged by the monitor. Since users may be interested in events not defined by the present invention, users may define events that are to be logged, and thus the present invention provides means for defining such events, which can then be selected via the UI or API.
The value of monitoring an operating system can be increased if the data produced by such monitoring is viewable in a manner that facilitates understanding the running of the operating system. Thus, the present invention includes a component and method for runtime display of monitoring data. Such a runtime display of monitoring data facilitates visualizing time based thread interactions, and system events associated with the interacting threads. The runtime display facilitates viewing items including, but not limited to, process states, thread states, process events, thread events, process switching, thread switching, events, changing states of semaphores, changing states of mutexs, entering/leaving a critical section, and the occurrence of user defined events. Since numerous items can be viewed, the present invention provides for filtering the events that can be viewed. For example, an administrator may desire to view events associated with entering/leaving a critical section, and may not desire to view other events. The ability to view such events facilitates diagnosing problems associated with programs including, but not limited to, operating systems, device drivers and/or applications. Problems whose diagnosis can be facilitated include, but are not limited to, deadlocks, and missed real time deadlines. The ability to filter events to be viewed facilitates viewing a smaller and/or more focused set of data, which can improve the ability to diagnose such problems.
Since an operating system can be monitored for different reasons, in an exemplary aspect of the present invention, the component and method for runtime display of monitoring data includes a User Interface (UI) to facilitate searching the monitoring data for items including, but not limited to, processes, threads, process states, thread states, process events, thread events and categories of events. For example, a first person examining monitoring data may desire to see a category of events, (e.g. interrupt events) associated with all processes, while a second person examining monitoring data may desire to see data associated with events associated with a particular process.
An operating system can be monitored for different reasons at different times. Thus, in an exemplary aspect of the present invention, the component and method for runtime display of monitoring can be run in one of at least three different modes: a real time mode, a system log of a previous run mode and a limited buffer mode. The different modes facilitate monitoring an operating system for different reasons at different times.
In accordance with an aspect of the present invention, a system for logging operating system events is provided, the system comprising: a first memory for storing first data associated with an interrupt event; a second memory for storing second data associated with a non-interrupt event; one or more components for storing data in the first memory and second memory; one or more components for moving the first data from the first memory to a third memory and the second data from the second memory to the third memory; and a first flushing component for flushing data from the third memory upon at least one of the third memory becoming substantially full and a timeout condition occurring.
Another aspect of the present invention provides a system for logging operating system events is provided, the system comprising: a first memory for storing first data associated with an interrupt event; a second memory for storing second data associated with a non-interrupt event; one or more components for storing data in the first memory and second memory; one or more components for moving the first data from the first memory to a third memory and the second data from the second memory to the third memory; and a first flushing component for flushing data from the third memory upon at least one of the third memory becoming substantially full and a timeout condition occurring, wherein at least one of the first data in the first memory and the second data in the second memory includes a time stamp associated with when the event occurred.
Yet another aspect of the present invention provides a system for logging operating system events is provided, the system comprising: a first memory for storing first data associated with an interrupt event; a second memory for storing second data associated with a non-interrupt event; one or more components for storing data in the first memory and second memory; one or more components for moving the first data from the first memory to a third memory and the second data from the second memory to the third memory; and a first flushing component for flushing data from the third memory upon at least one of the third memory becoming substantially full and a timeout condition occurring, wherein the non-interrupt events include user-defined events.
Yet another aspect of the present invention provides a user interface, the user interface operable to facilitate selecting one or more events to be logged.
Still yet another aspect of the present invention provides an Application Programming Interface (API) operable to facilitate selecting one or more events to be logged.
Another aspect of the present invention provides a method for logging operating system kernel events, comprising: storing interrupt data associated with interrupt events in an interrupt buffer; storing non-interrupt data associated with non-interrupt events in a non-interrupt buffer; flushing the interrupt data to a secondary buffer; and flushing the non-interrupt data to the secondary buffer.
Yet another aspect of the present invention provides a method for logging operating system kernel events, comprising: storing interrupt data associated with interrupt events in an interrupt buffer; storing non-interrupt data associated with non-interrupt events in a non-interrupt buffer; flushing the interrupt data to a secondary buffer; and flushing the non-interrupt data to the secondary buffer, wherein the interrupt and non-interrupt events to log are determined by at least one of choices made in response to a User Interface (UI) and one or more calls made to an Application Programming Interface (API).
Yet another aspect of the present invention provides a method for logging operating system kernel events, comprising: storing interrupt data associated with interrupt events in an interrupt buffer; storing non-interrupt data associated with non-interrupt events in a non-interrupt buffer; flushing the interrupt data to a secondary buffer; and flushing the non-interrupt data to the secondary buffer, wherein flushing the non-interrupt data to the secondary buffer further comprises re-flushing the non-interrupt data to the secondary buffer upon a determination that the flushing did not complete.
Still yet another aspect of the present invention provides a computer readable medium storing computer executable components of a system for triple buffering information associated with events handled by an operating system kernel, comprising: one or more computer executable components for storing first data associated with interrupt events in a first memory; one or more computer executable components for storing second data associated with non-interrupt events in a second memory; one or more components for moving the first data from the first memory to the third memory and the second data from the second memory to a third memory; and a flushing component for flushing data from the third memory upon at least one of the third memory becoming substantially full and a timeout condition occurring, the flushing not occurring during interrupt handling time.
Still yet another aspect of the present invention provides a computer readable medium storing computer executable components operable to execute a method for logging operating system kernel events, the method comprising: storing interrupt data associated with interrupt events in an interrupt buffer; storing non-interrupt data associated with non-interrupt events in a non-interrupt buffer; flushing the interrupt data to a secondary buffer, the flushing not occurring during interrupt handling time; and flushing the non-interrupt data to the secondary buffer, the flushing not occurring during interrupt handling time.
Another aspect of the present invention provides a data packet adapted to be transmitted from a first system to a second system, the data packet comprising data used in logging operating system events.
Another aspect of the present invention provides a system for logging operating system kernel events, comprising: means for logging interrupt events to an interrupt buffer; means for logging non-interrupt events to a non-interrupt buffer; means for copying data in the interrupt buffer to a secondary buffer; and means for copying data in the non-interrupt buffer to the secondary buffer.
To the accomplishment of the foregoing and related ends, certain illustrative aspects of the invention are described herein in connection with the following description and the annexed drawings. These aspects are indicative, however, of but a few of the various ways in which the principles of the invention may be employed and the present invention is intended to include all such aspects and their equivalents. Other advantages and novel features of the invention may become apparent from the following detailed description of the invention when considered in conjunction with the drawings.