A cluster is a multi-node computer system in which each node comprises a server running on a computer, which may be a server blade. Clusters function as collectively operational groups of servers. The nodes of a cluster, also called members, function together to achieve high server system performance, availability and reliability. For a cluster to function properly, time should be synchronized among its nodes. In a clustered database, for instance, time synchronicity between members can be significant in maintaining transactional consistency and data coherence.
To achieve time synchronism between cluster members, the clock of one or more cluster members may be adjusted with respect to a time reference. In the absence of an external time source such as a radio clock or a global timeserver, computer based master election processes select a reference “master” clock for cluster time synchronization. A typical master election process selects a cluster member as a master and sets a clock associated locally with the selected member as a master clock. The clocks of the other cluster members are synchronized as “slaves” to the master reference clock.
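The master election and master/slave relationship described above can be illustrated with a minimal sketch. The names `Member`, `elect_master` and `offsets_from_master` are hypothetical, and the election rule shown (lowest node identifier wins) is only an assumption made for illustration; the text above does not specify how a typical election process chooses the master.

```python
from dataclasses import dataclass


@dataclass
class Member:
    node_id: int
    clock_ms: int  # local clock reading in milliseconds (illustrative value)


def elect_master(members):
    # Assumed rule for illustration only: the member with the lowest id wins.
    return min(members, key=lambda m: m.node_id)


def offsets_from_master(members, master):
    # Positive offset: the member is ahead of the master clock.
    # Negative offset: the member is behind the master clock.
    return {
        m.node_id: m.clock_ms - master.clock_ms
        for m in members
        if m.node_id != master.node_id
    }


members = [Member(3, 1_000_250), Member(1, 1_000_000), Member(2, 999_900)]
master = elect_master(members)
offsets = offsets_from_master(members, master)
```

In this sketch, member 3 is 250 ms ahead of the master and member 2 is 100 ms behind it, so the two slaves would need corrections in opposite directions.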
During cluster operation, the clocks of one or more members may run faster or slower than that of the master. Where the clock of a cluster member runs slower than the master clock, synchronizing that member's clock to that of the master involves advancing the member's clock. Advancing a member's clock is performed sequentially and gradually. Such gradual synchronizing adjustment is rarely problematic. However, when the clock of a cluster member is significantly ahead of the master clock, that clock may be stepped backwards.
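The difference between the two corrections can be sketched as follows. This is a simplified illustration, not a definitive implementation: the function name `plan_adjustment` and the `max_step_ms` bound are assumptions, and the sketch ignores the time that elapses while the correction is applied.

```python
def plan_adjustment(member_ms, master_ms, max_step_ms=200):
    """Return the kind of correction and the list of clock changes, in ms."""
    behind = master_ms - member_ms
    if behind >= 0:
        # Member clock is slow (or equal): advance it in small forward steps.
        steps = []
        while behind > 0:
            step = min(max_step_ms, behind)
            steps.append(step)
            behind -= step
        return ("advance", steps)
    # Member clock is ahead: the conventional approach is one backward jump.
    return ("step_back", [behind])


slow = plan_adjustment(999_500, 1_000_000)    # member is 500 ms behind
fast = plan_adjustment(1_000_750, 1_000_000)  # member is 750 ms ahead
```

Here the slow clock is brought forward in bounded increments, while the fast clock receives a single negative adjustment, which is the case the following paragraphs identify as problematic.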
Stepping back a clock can lead to confusion in running applications and the operating system (OS). In a real application cluster (RAC) or another cluster of database servers and applications, for instance, confusion may arise due to data incoherence, transactional inconsistency and other issues relating to the timing mismatch within the cluster. Such confusion can cause applications to crash, fail or otherwise cease functioning more or less suddenly, and/or can cause the OS to hang or otherwise suspend its normal functioning.
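One way such confusion can arise is sketched below with a simulated clock. The class `FakeWallClock` and the event names are hypothetical and are used only to show how a backward step can make a later event carry an earlier timestamp and make a measured interval negative.

```python
class FakeWallClock:
    """Simulated wall clock, in milliseconds, that can be stepped."""

    def __init__(self, now_ms):
        self.now_ms = now_ms

    def read(self):
        return self.now_ms

    def step(self, delta_ms):
        # A negative delta models stepping the clock backwards.
        self.now_ms += delta_ms


clock = FakeWallClock(1_000_000)

first_commit = clock.read()   # timestamp of an earlier transaction
clock.step(+500)              # 500 ms of real work passes
clock.step(-2_000)            # clock is stepped back by 2 s to match a master
second_commit = clock.read()  # timestamp of a later transaction

elapsed_ms = second_commit - first_commit
out_of_order = second_commit < first_commit
```

Although the second transaction happened after the first, its timestamp is smaller, so any logic that orders events or computes durations from these readings would be misled.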
Moreover, on computer clusters where a significant number of member clocks must be stepped back, these problems may be exacerbated, which can adversely affect the availability, reliability and/or performance of the cluster. Running a mission-critical and/or distributed application can become problematic in a cluster so affected.
Based on the foregoing, it could be useful to preclude stepping member clocks backwards to synchronize clock time in a cluster.
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.