1. Field of the Invention
The present invention relates to distributed message passing systems in client-server networks implemented across a local area network or a wide area network.
2. Description of the Related Art
One of the uses for network management applications is to asynchronously monitor the performance of network operations, including support of client-server applications across a local area network or a wide area network. The network management applications provide information related to network performance by sending notifications or messages in response to detecting prescribed events that may affect a network resource. For example, a network management application monitoring router performance may monitor the CPU utilization in a router; if the CPU utilization reaches a certain threshold, for example, 80%, the network management application may send a message to a monitoring application to notify it that the router CPU is busy, indicating a potential problem. Hence, network management applications are important in maintaining network stability and robustness.
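The threshold example above can be sketched as follows. This is a minimal illustration only; the names `Notification` and `check_cpu` are hypothetical and do not appear in the specification, and the 80% figure is simply the example value given above.

```python
from dataclasses import dataclass
from typing import Optional

# Example threshold taken from the text; a real deployment would configure this.
CPU_THRESHOLD_PERCENT = 80.0


@dataclass
class Notification:
    """Hypothetical message sent to a monitoring application."""
    resource: str
    text: str


def check_cpu(router: str, cpu_percent: float,
              threshold: float = CPU_THRESHOLD_PERCENT) -> Optional[Notification]:
    """Return a notification when utilization reaches the threshold, else None."""
    if cpu_percent >= threshold:
        return Notification(router, f"CPU busy: {cpu_percent:.0f}% utilization")
    return None
```

A caller would forward any non-`None` result to the monitoring application and ignore readings below the threshold.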
Client-server computing is becoming more advanced and sophisticated, with improved hardware infrastructure, high-speed, low-cost, wide-area bandwidth and distribution of computer applications across multiple computers connected by local area and wide area networks. These advanced networks use distributed objects for processing of distributed transactions across the wide-area network. An exemplary implementation of a distributed object architecture is the Common Object Request Broker Architecture (CORBA) 2.0 specification adopted in December 1994 by the Object Management Group (OMG). The CORBA specification creates interface specifications, written in a neutral Interface Definition Language (IDL), that define a component's boundaries and enable objects to interoperate in heterogeneous client-server environments.
However, as systems become more complex, it becomes increasingly difficult to recognize and trace the progress of specific events, messages, and the like, as they flow through a system. For example, assume that a developer of a network application wishes to monitor the occurrence of events detected at a source application, also referred to as a source process. The source process registers with an event distribution system that distributes messages received from the source process, including messages indicating the occurrence of the event. Consumer applications, also referred to as consumer processes, that are interested in obtaining information about a certain event register their interest with the event distribution system by specifying a prescribed filter condition. Hence, the event distribution system, upon receiving a message from a source process, executes the filter process to determine whether the consumer process should receive the message corresponding to the occurrence of the event from the source process. The event distribution system selectively passes the message to the registered consumer process based on the message satisfying the prescribed filter condition. Alternatively, the consumer process may periodically poll the event distribution system for events.
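The registration and selective delivery described above can be sketched as follows. This is an illustrative in-process model under stated assumptions, not the system's actual interface: the class `EventDistributor`, its `register` and `publish` methods, and the dictionary message format are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Message = Dict[str, object]
Filter = Callable[[Message], bool]      # prescribed filter condition
Consumer = Callable[[Message], None]    # consumer process callback


class EventDistributor:
    """Hypothetical event distribution system with per-consumer filters."""

    def __init__(self) -> None:
        self._registrations: List[Tuple[Filter, Consumer]] = []

    def register(self, condition: Filter, consumer: Consumer) -> None:
        """A consumer registers its interest by supplying a filter condition."""
        self._registrations.append((condition, consumer))

    def publish(self, message: Message) -> int:
        """Pass the message to each consumer whose condition it satisfies.

        Returns the number of consumers that received the message.
        """
        delivered = 0
        for condition, consumer in self._registrations:
            if condition(message):
                consumer(message)
                delivered += 1
        return delivered


# Usage: one consumer interested only in "cpu_busy" events.
received: List[Message] = []
distributor = EventDistributor()
distributor.register(lambda m: m.get("event") == "cpu_busy", received.append)
distributor.publish({"event": "cpu_busy", "source": "router-1"})
distributor.publish({"event": "link_up", "source": "router-1"})
```

Note that a message that fails every filter condition is silently discarded in this sketch, which is the behavior that makes dropped messages hard to diagnose.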
However, the monitoring of messages may be substantially more complex if the event distribution system is divided into a cascaded group of multiple processes, where the message corresponding to an occurrence of an event passes through multiple filters as the message flows through the system. In particular, at any one point, the message may not be passed by the filter, but instead may be dropped or rejected by the filter. If the message is dropped due to an error in the filter, then substantial debugging efforts may be necessary in order to correct the faulty filter.
Prior attempts at tracing the progress of a message along its path have had limited effectiveness. For example, a trace route type function might be used to determine the path, or flow from process to process. However, multiple messages may be flowing from hundreds or thousands of different source processes throughout the system, creating a time-consuming problem in attempting to identify a specific message or a specific filter. Hence, use of a trace route type function to determine the path or flow of a generic message from process to process may result in a tedious and laborious debugging process.
An alternative approach involves setting a specific trace debug option on each process and examining the trace output to determine how incoming messages are processed. For example, certain systems trace each and every event processed by each and every process within a distributed system, with each process having its own associated log for storing the result of the operation performed by that process on the message. In this case, however, a programmer would need to combine all the logs of the different processes within the distributed system and correlate the logs to attempt to identify which process handled which message, and at what step in the message flow. Hence, substantial efforts would be necessary to identify the processes operating on a given message, locate the log entry for that message in the corresponding log, compile all the log entries from the different processes, and determine the appropriate order of the log entries relative to the path of the message.
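The manual correlation effort described above can be illustrated with a short sketch. All names here (`LogEntry`, `correlate`, the field layout) are hypothetical, and the sketch assumes each log entry carries a message identifier and a timestamp from synchronized clocks, which real per-process logs may not provide.

```python
from typing import Dict, List, NamedTuple


class LogEntry(NamedTuple):
    """Hypothetical per-process log record."""
    timestamp: float
    process: str
    message_id: str
    result: str


def correlate(logs: Dict[str, List[LogEntry]], message_id: str) -> List[LogEntry]:
    """Gather entries for one message from every process log, ordered by time."""
    matching = [
        entry
        for entries in logs.values()
        for entry in entries
        if entry.message_id == message_id
    ]
    # Ordering by timestamp assumes the processes share a synchronized clock.
    return sorted(matching, key=lambda entry: entry.timestamp)


# Usage: two separate process logs, each mixing entries for several messages.
logs = {
    "filter-B": [LogEntry(2.0, "filter-B", "m1", "dropped"),
                 LogEntry(2.5, "filter-B", "m2", "passed")],
    "filter-A": [LogEntry(1.0, "filter-A", "m1", "passed"),
                 LogEntry(1.5, "filter-A", "m2", "passed")],
}
path = correlate(logs, "m1")
```

Even in this simplified form, every log in the system must be scanned for every message of interest, which is the overhead that a per-message tracing indication avoids.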
There is a need for an arrangement that enables efficient tracing and monitoring of the progress of an event message as the message passes through a distributed multiple process event distribution system.
There is also a need for an arrangement that enables centralized monitoring of selected messages as they pass through a distributed multi-process system, regardless of the message path or the results of the operations performed on the message by any of the associated processes.
These and other needs are attained by the present invention, where a source process sets a tracing bit in a message that is to be traced as it passes throughout a distributed multi-process system. Hence, each process receiving the message determines whether the received message has the tracing bit set, and in response outputs a trace message, enabling the message to be traced throughout the system. A centralized logging process may be used for collecting the trace messages for the monitoring of the progress of the message throughout the different processes.
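One possible way to represent the tracing bit is as a single flag in a message header field, as sketched below. The bit position, the `flags` field, and the function names are assumptions made for illustration; the description above does not specify an encoding.

```python
from typing import List

# Hypothetical bit position for the tracing bit within a header flags field.
TRACE_BIT = 0x01

# Stand-in for a centralized logging process that collects trace messages.
central_log: List[str] = []


def set_trace(flags: int) -> int:
    """Source process: mark the message as one to be traced."""
    return flags | TRACE_BIT


def is_traced(flags: int) -> bool:
    """Receiving process: check whether the tracing bit is set."""
    return bool(flags & TRACE_BIT)


def handle(process_name: str, flags: int, payload: str) -> None:
    """Each process emits a trace message only for messages with the bit set."""
    if is_traced(flags):
        central_log.append(f"{process_name}: handled {payload}")


# Usage: only the traced message produces entries in the central log.
handle("process-1", set_trace(0), "event-42")
handle("process-1", 0, "event-43")
```

Because the indication travels with the message itself, untraced messages incur only a single bit test at each process.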
According to one aspect of the present invention, a system is provided for monitoring a progress of an event. The system includes a source process configured for generating a message corresponding to an occurrence of an event, the source process selectively setting a tracing bit in the message and outputting the message for reception by a destination consumer process. The source process also outputs a trace message specifying the outputting of the message by the source process. The system also includes an event distribution system having a distributed plurality of filter processes configured for selectively passing the message for reception by the destination consumer process based on respective prescribed filter conditions. Each of the filter processes having received the message outputs a corresponding trace message in response to detection of the tracing bit, indicating whether the message was passed by the corresponding filter process for monitoring the progress of the message. Setting of the tracing bit by the source process enables the filter process and any subsequent processes to identify the need to generate trace messages specifying the operation performed by the corresponding process. Hence, the path of the message throughout the multiple processes in the distributed multi-process system can be readily determined by a consumer process.
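The aspect described above can be sketched as a cascade of filter processes that each report whether a traced message was passed. This is a simplified single-program model, not the claimed distributed implementation; the names `Message`, `FilterProcess`, `TraceLog`, `source_output`, and `distribute` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Message:
    event: str
    severity: int
    trace: bool = False   # stands in for the tracing bit


@dataclass
class FilterProcess:
    name: str
    condition: Callable[[Message], bool]   # prescribed filter condition


class TraceLog:
    """Stand-in for a process that collects trace messages."""

    def __init__(self) -> None:
        self.entries: List[Tuple[str, str]] = []

    def record(self, process: str, action: str) -> None:
        self.entries.append((process, action))


def source_output(event: str, severity: int, trace: bool, log: TraceLog) -> Message:
    """Source process: build the message and, if traced, report its output."""
    message = Message(event, severity, trace)
    if message.trace:
        log.record("source", "output")
    return message


def distribute(message: Message, filters: List[FilterProcess], log: TraceLog) -> bool:
    """Pass the message through each filter in turn; return True if delivered."""
    for filter_process in filters:
        passed = filter_process.condition(message)
        if message.trace:
            log.record(filter_process.name, "passed" if passed else "dropped")
        if not passed:
            return False
    return True


# Usage: a traced message is passed by the first filter and dropped by the second.
log = TraceLog()
filters = [
    FilterProcess("filter-A", lambda m: m.event == "cpu_busy"),
    FilterProcess("filter-B", lambda m: m.severity >= 5),
]
delivered = distribute(source_output("cpu_busy", 3, True, log), filters, log)
```

In this sketch the trace entries identify the exact filter that dropped the message, without enabling logging for untraced traffic.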
Another aspect of the present invention provides an event distribution system. The event distribution system includes a source process interface for receiving a message corresponding to an occurrence of an event from a source process, and a distributed plurality of filter processes configured for selectively passing the message for reception by a destination consumer process based on respective prescribed filter conditions. Each filter process having received the message outputs a corresponding trace message specifying whether the message is passed, in response to detecting a tracing bit set in the message, for tracing a progress of the message throughout the event distribution system.
Still another aspect of the present invention provides a method of tracing a progress of a message between a source process and a destination consumer process. The method includes receiving a message from the source process corresponding to an occurrence of an event in the source process, and selectively passing the message by a distributed plurality of filter processes based on respective prescribed filter conditions, for reception by the destination consumer process. The method also includes, in each of the filter processes having received the message, outputting a trace message specifying whether the message is passed based on the corresponding prescribed filter condition, in response to detecting a tracing bit set in the message.
Additional advantages and novel features of the invention will be set forth in part in the description which follows, and in part will become apparent to those skilled in the art upon examination of the following or may be learned by practice of the invention. The advantages of the invention may be realized and attained by means of the instrumentalities and combinations particularly pointed out in the appended claims.