Unless otherwise indicated, the approaches described in this section are not necessarily prior art to the claims in this application, and are not admitted to be prior art merely by inclusion in this section.
A cluster is a type of distributed computing system that consists of a collection of interconnected whole computers (nodes) used as a single, unified computing resource. A node is a computer system running a single Operating System (OS) instance. Significant events that change the state of a cluster can directly or indirectly affect the applications and services provided by the cluster.
An event, within the context of the present invention, denotes a change in system state, configuration, or any other parameter of interest. Users and/or programs that access the computer system often have an interest in these events or in the data, resident in a database system, affected by these events. System users are typically interested in knowing about any system changes that could affect them so that they can take action, if necessary. For example, users and/or programs accessing a database may need notification upon the occurrence of specific system events, such as database startups or shutdowns, the system running out of disk space or rollback segments, or logons and logoffs. Likewise, users may need notification upon the occurrence of specific data events, such as when inventory for an item falls below a critical threshold, so that items can be ordered in a timely manner. In each case, action can be taken based on such system-related or data-related information extracted or otherwise obtained from the database or computer system.
To improve scalability, databases and file systems may permit more than one database or file server (each of which may be running separately) to concurrently access shared storage such as disk media. Each database or file server has a cache for caching shared data. In many software applications, such as databases, processes frequently need to notify other processes that certain events have occurred. Events serve as wake-up mechanisms: typically, one or more processes are blocked waiting for an event to occur. Applications that handle a significant number of events therefore depend on the delivery and the efficiency of event notification.
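The wake-up role of events described above can be illustrated with a minimal sketch. It assumes threads within a single Python process stand in for the blocked processes, and the function name run_waiters is hypothetical; it is not drawn from any particular database system.

```python
import threading

def run_waiters(num_waiters=3):
    """Block several threads on one event, post it once, return who woke up."""
    event = threading.Event()   # stands in for a named system event (assumption)
    woken = []
    lock = threading.Lock()

    def waiter(ident):
        event.wait()            # blocked until the event is posted
        with lock:
            woken.append(ident)

    threads = [threading.Thread(target=waiter, args=(i,))
               for i in range(num_waiters)]
    for t in threads:
        t.start()
    event.set()                 # a single post wakes every blocked waiter
    for t in threads:
        t.join()
    return sorted(woken)
```

In this sketch, one call to event.set() releases all waiters, which is inexpensive inside one machine; the difficulty discussed in this section arises when the waiters reside on different nodes.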
Conventionally, most event notification systems are used as part of specific applications or in localized settings, such as a single machine. However, in clustered computing environments, many different messages are exchanged between nodes for various reasons. For example, in a distributed database system, processes in different nodes send query parameters and return query results. Part of a query may be executing in one node, and another part of the query may be executing in a different node. In busy clustered computing environments, potentially thousands of messages are sent between nodes per second. Event notifications sent across nodes add to this traffic.
Current event notification techniques also present problems in clustered computing environments. For example, whenever an event occurs, a broadcast message is sent to all of the nodes. When a node receives the broadcast message, it can notify any process that may be blocked waiting for the event by signaling semaphores using the state saved in shared memory within the node. This approach works only if message traffic is low, which is unusual in clustered computing environments. A drawback of sending separate broadcast messages to all nodes in a distributed system for each event notification is that the cost of communication between nodes is high compared to the cost of event notification within one node.
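The broadcast approach described above can be sketched as follows. This is a simplified, single-process simulation under stated assumptions: the Node class, its methods, and the broadcast function are hypothetical names introduced for illustration, and a counter stands in for real inter-node messages.

```python
import threading
from collections import defaultdict

class Node:
    """Hypothetical node holding per-event waiter state in local memory."""
    def __init__(self, name):
        self.name = name
        # event name -> semaphores of local processes blocked on that event
        self.waiters = defaultdict(list)

    def register_waiter(self, event_name):
        sem = threading.Semaphore(0)        # a waiter would block on acquire()
        self.waiters[event_name].append(sem)
        return sem

    def on_broadcast(self, event_name):
        """Signal every local waiter for the event; return how many were woken."""
        woken = 0
        for sem in self.waiters.pop(event_name, []):
            sem.release()
            woken += 1
        return woken

def broadcast(nodes, origin, event_name):
    """Deliver the event to all nodes; count inter-node messages and wake-ups."""
    messages = 0
    woken = 0
    for node in nodes:
        if node is not origin:
            messages += 1                   # one message per remote node
        woken += node.on_broadcast(event_name)
    return messages, woken
```

In this sketch, a cluster of four nodes with a single waiter on one remote node still incurs three inter-node messages for one wake-up, which illustrates why the cost of broadcasting grows with the number of nodes rather than with the number of interested processes.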
Based on the foregoing, it is desirable to provide improved techniques for efficient event notification in a clustered computer system. These techniques should exploit the characteristics of a clustered system and of event notification in order to provide efficient event notification in clustered computing environments. It is also desirable to provide techniques that enable applications that rely on event notifications to be ported transparently to clustered computing systems.