Present-day computer clusters are typically geographically collocated. Such clusters consist of a large number of nodes that are operated so as to render the cluster highly available. In particular, it is important that certain files served by nodes of the cluster remain highly available.
This high level of availability should be maintained even in the event of node failures and other adverse conditions.
In a typical high availability cluster, each one of the nodes has associated resources. These resources commonly include storage media on which files are stored. The media may be of different types; in other words, they may be heterogeneous. The files residing on these diverse storage media are served to network clients under the direction of a master node. The service usually involves various types of requests, including typical read/write requests to those files.
The prior art has addressed the challenge of keeping certain files highly available through various caching methods. For example, U.S. Pat. No. 6,442,601 to Gampper et al. describes a caching system and method for migrating files retrieved over a network from a server to a secondary storage. Gampper's system and method optimize migration of files from the primary storage to the secondary storage. They also provide a cache to store files retrieved from the server based on a distribution of file requests per unit of time and according to file size for files maintained in the primary storage and the secondary storage. Although this approach is helpful on the client side, it does not address the issues of suitable storage management on the side of a modern highly available cluster with heterogeneous storage media designed to keep certain files highly available.
U.S. Pat. No. 7,558,859 to Kasiolas et al. describes a peer-to-peer auction strategy for load balancing across the data storage nodes with the aid of a cluster manager for each cluster of a data center. Although this teaching does address in general the automatic re-balancing of data in clusters, it does not address appropriate load balancing when the data storage nodes have heterogeneous storage media available to them. In particular, the teaching does not address situations in which the different types of storage media differ drastically in cost and performance, such as access-rate performance.
In another approach, U.S. Pat. No. 7,996,608 to Chatterjee et al. teaches a RAID-style system for providing redundancy in a storage system. This approach is applicable to a storage cluster that stores data on storage nodes. The cluster manages storage between these storage nodes by defining zones and mirroring each storage node to another storage node for failure resistance. Maps are used to determine which blocks of a file are allocated to which zones, and blocks may be migrated between zones during remapping. This system ensures that minimum replication levels are maintained and that load balancing occurs periodically during remapping. Still, this teaching also does not address situations in which heterogeneous storage media that differ drastically in cost and performance are deployed in the storage nodes of the cluster.
The prior art also contains still other references of interest. For example, U.S. Pat. No. 8,006,037 to Kirshenbaum et al. addresses caching issues and data migration issues in clusters. U.S. Pat. Application 2011/0208933 addresses a storage system composed of volatile storage (DRAM) and non-volatile (disk) storage. U.S. Pat. Application 2011/0252192 addresses storing objects onto flash devices or hard drives backed with non-volatile RAM (NVRAM). In the latter system, sequential Input/Output (I/O) such as a transaction log is moved to the hard drives with NVRAM to increase the lifetime of flash devices. Flash devices are used for essentially random I/O. It should be noted that the most recent generation of flash devices has vastly increased lifetime even when handling a large amount of I/O throughput.
What is important herein is not implementing an efficient object store, but rather handling bulk data files effectively in clusters with heterogeneous storage media that may include flash devices and hard disk drives.
Melamant et al. is another work that describes two-tier storage hierarchies, supporting a system with a high-reliability/high-performance tier and a high-reliability/low-performance tier that is substantially cheaper and offline. In a distributed systems approach, however, one must address issues of disk, device, and machine failure, including implications for re-replication and load balancing. In addition, Melamant does not address load balancing and migration between storage tiers on a whole-cluster rather than whole-machine basis, nor does it deal with storage that is supposed to stay online and available.
In fact, despite the many useful methods and protocols that have been made available, the prior art is still lacking. Specifically, it does not provide an effective method for optimally deploying heterogeneous storage media associated with nodes of a high availability cluster to maintain a high availability of certain files that are very popular.