Clusters are groups of computers that use redundant computing resources to provide continued service when individual system components fail. More specifically, clusters eliminate single points of failure by providing multiple servers, multiple network connections, redundant data storage, etc. Absent clustering, if a server running a particular application fails, the application is unavailable until the server is restored. In a clustering system, the failure of a server (or of a specific computing resource used thereby, such as a network adapter, storage device, etc.) is detected, and the application that was running on the failed server is automatically restarted on another computing system (i.e., another node of the cluster). This process is called “failover.”
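The failover behavior described above can be illustrated with a minimal sketch. It is a toy model only, not any particular clustering product: the `Node` class, the `fail_over` function, and the least-loaded placement policy are all hypothetical names and choices made for illustration, and failure detection is reduced to a boolean flag.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical cluster node; 'healthy' stands in for real failure detection."""
    name: str
    healthy: bool = True
    apps: set = field(default_factory=set)


def fail_over(nodes):
    """Restart the applications of failed nodes on surviving nodes.

    Returns a list of (application, failed_node, target_node) tuples.
    The least-loaded target policy is an assumption made for this sketch.
    """
    survivors = [n for n in nodes if n.healthy]
    if not survivors:
        raise RuntimeError("no healthy node left in the cluster")
    moves = []
    for node in nodes:
        if node.healthy:
            continue
        for app in sorted(node.apps):
            # Pick the surviving node currently running the fewest applications.
            target = min(survivors, key=lambda n: len(n.apps))
            target.apps.add(app)
            moves.append((app, node.name, target.name))
        node.apps.clear()
    return moves


nodes = [Node("node-a", apps={"db"}), Node("node-b"), Node("node-c", apps={"web"})]
nodes[0].healthy = False          # simulate the failure of node-a
moves = fail_over(nodes)
print(moves)                      # [('db', 'node-a', 'node-b')]
```

In this sketch the application "db" is restarted on node-b because that node is running the fewest applications at the time of the failure; a real clustering system would apply its own configured placement rules.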
Clustering systems are often combined with storage management products that provide additional useful features, such as journaling file systems, logical volume management, multi-path input/output (I/O) functionality, etc. Where a cluster is implemented in conjunction with a storage management environment, the computer systems (nodes) of the cluster can access shared storage. The shared storage is typically implemented with multiple underlying physical storage devices, which are managed by the clustering and storage system so as to appear as a single storage device to computer systems accessing the shared storage.
An individual node of a cluster can use a non-shared, local read cache. For example, the local cache can be in the form of a solid state drive (SSD) using fast integrated-circuit-based memory. The node can use its local cache for read caching of shared storage, which can significantly decrease latency. However, each such cache is local to the individual node and not shared between nodes in the cluster, whereas the shared storage is global to the cluster and shared between multiple nodes. Therefore, if cached blocks of shared storage are modified by another node of the cluster, a node can erroneously read stale data from its local cache after a cluster-based event affecting shared storage, such as a failover.
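The stale read described above can be reproduced with a small simulation. This is a sketch under simplifying assumptions: `SharedStorage` and `ClusterNode` are hypothetical classes, blocks are held in in-memory dictionaries rather than on real devices, and no cache invalidation of any kind is performed, which is precisely the situation the paragraph warns about.

```python
class SharedStorage:
    """Hypothetical shared storage, visible to every node of the cluster."""

    def __init__(self):
        self.blocks = {}

    def read(self, block_no):
        return self.blocks.get(block_no, b"")

    def write(self, block_no, data):
        self.blocks[block_no] = data


class ClusterNode:
    """Hypothetical node with a private read cache (e.g., backed by a local SSD)."""

    def __init__(self, name, shared):
        self.name = name
        self.shared = shared
        self.cache = {}  # local to this node; other nodes cannot see or update it

    def read(self, block_no):
        # Serve from the local cache when possible; otherwise populate it.
        if block_no not in self.cache:
            self.cache[block_no] = self.shared.read(block_no)
        return self.cache[block_no]

    def write(self, block_no, data):
        # Write through to shared storage and refresh only this node's cache.
        self.shared.write(block_no, data)
        self.cache[block_no] = data


shared = SharedStorage()
shared.write(7, b"v1")
node_a = ClusterNode("node-a", shared)
node_b = ClusterNode("node-b", shared)

first = node_a.read(7)    # node-a caches block 7 with contents b"v1"
node_b.write(7, b"v2")    # after a failover, node-b modifies the same block
stale = node_a.read(7)    # node-a is served from its local cache
fresh = shared.read(7)    # the current contents of shared storage

print(stale, fresh)       # b'v1' b'v2'
```

Here node-a returns the old contents of block 7 even though shared storage already holds the new contents, because nothing informs node-a that its cached copy is no longer valid.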
It would be desirable to address this issue.