A network storage server is a processing system that stores and retrieves data on behalf of one or more clients on a network, managing that data in a set of mass storage devices such as magnetic or optical storage-based disks or tapes. Some storage servers are designed to service file-level requests from clients, as is commonly the case with file servers used in a network attached storage (NAS) environment. Other storage servers are designed to service block-level requests from clients, as with storage servers used in a storage area network (SAN) environment. Still other storage servers are capable of servicing both file-level requests and block-level requests, as is the case with certain storage servers made by NetApp, Inc. of Sunnyvale, Calif.
To service a large-scale system with high throughput requirements, multiple storage servers can be connected together to form a storage cluster. A cluster architecture provides improved scalability. In a cluster architecture, each storage server is called a storage server node, or simply a “node”. A storage cluster typically has multiple network addresses, such as Internet Protocol (IP) addresses, any of which can be accessed by a client to service a request. A node in a storage cluster has one or more ports, and IP addresses can reside on one or more of those ports.
In one known storage cluster architecture, a client sends a request to a designated storage server node via a particular port and IP address. That storage server node may service the request itself, or it may forward the request to another node in the cluster. An example of a cluster-oriented network storage system that has these features is the Data ONTAP GX system from NetApp.
Two protocols commonly used by clients to access data over a storage network are Network File System (NFS) and Common Internet File System (CIFS). With NFS or CIFS, to initially gain access to stored data, a client “mounts” a network share by accessing an IP address. To mount a share, a client either specifies an IP address directly or provides a host name, which the Domain Name System (DNS) translates into an IP address. The client then establishes a connection to that IP address and sends a mount request/command in a well-known format to mount the share. Once the share has been mounted through a particular IP address, the client continues to use that IP address until the share is unmounted.
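The address selection step of a mount can be sketched as follows. This is only an illustration under stated assumptions: the function names `resolve_first` and `mount_address`, the host name `filer.example.com`, and the addresses are hypothetical, and a dictionary stands in for a real name server so that the example runs without network access.

```python
import ipaddress
import socket


def resolve_first(host):
    """Ask the system resolver for host and return the first address it lists."""
    infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    return infos[0][4][0]


def mount_address(target, resolve=resolve_first):
    """Return the IP address a client would connect to in order to mount a share.

    target is either a literal IP address or a host name (hypothetical helper).
    """
    try:
        # A literal IP address is used directly, with no name lookup.
        return str(ipaddress.ip_address(target))
    except ValueError:
        # Anything else is treated as a host name and translated by DNS.
        return resolve(target)


# A dictionary stands in for a name server in this sketch.
fake_dns = {"filer.example.com": "10.0.0.2"}

literal = mount_address("10.0.0.1")
resolved = mount_address("filer.example.com", resolve=fake_dns.__getitem__)
print(literal, resolved)
```

Whichever address is returned here is the one the client keeps using until it unmounts the share, which is why the choice made at mount time matters for load distribution.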
In a storage cluster, network traffic can become unbalanced across the cluster's IP addresses as the cluster operates. Ideally, when a client is about to mount a share, it would choose the IP address available for that share which is on the least loaded port/node in the cluster. However, many storage cluster architectures do not give a client any help in choosing the IP address that is on the port/node with the smallest load. Others do so in a way that is not optimal.
The most commonly used DNS server software on the Internet today is the Berkeley Internet Name Domain (BIND). By default, BIND provides a round-robin mechanism to select from a list of IP addresses when given a particular DNS zone name. When a request comes in to resolve a name, BIND returns a list of all IP addresses to which that name resolves, but it rotates the order of the entries in the list from one response to the next. If each client receives the list in that order and always selects the first entry, then client mounts should be balanced.
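The rotation behavior described above can be sketched in a few lines. This is a simplified model and not BIND's actual implementation; the generator `rotating_answers` and the three addresses are hypothetical.

```python
from collections import Counter, deque


def rotating_answers(addresses):
    """Yield the full address list, rotated by one position per query."""
    ring = deque(addresses)
    while True:
        yield list(ring)
        ring.rotate(-1)  # the next query sees the list shifted left by one


addresses = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # hypothetical cluster IPs
answers = rotating_answers(addresses)

# Nine clients each take the first entry of the answer they receive.
mounts = Counter(next(answers)[0] for _ in range(9))
print(mounts)
```

In this idealized run each of the three addresses receives three of the nine mounts, provided every client sees the list in the order the server produced it.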
However, when using BIND, the user's DNS infrastructure may reorder the list between the time it leaves the server and arrives on the client. Such reordering can erroneously cause all clients to choose the same IP address to mount. Furthermore, even if BIND hands out each of the IP addresses evenly and they are perfectly balanced, that may still result in unbalanced loading, since some ports (on which the IP addresses reside) may have more capacity available than others.
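Both failure modes can be illustrated with a small simulation. Everything in it is assumed for the sake of the example: the intermediate resolver that sorts the list, the port capacities, and the per-client traffic figure are hypothetical values, not measurements.

```python
from collections import Counter

addresses = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]  # hypothetical cluster IPs


def server_answer(query_index):
    """Round-robin answer: the list rotated by one position per query."""
    k = query_index % len(addresses)
    return addresses[k:] + addresses[:k]


def sorting_resolver(answer):
    """Hypothetical intermediate resolver that sorts the list before delivery."""
    return sorted(answer)


# Failure mode 1: reordering in transit undoes the rotation.
mounts = Counter(sorting_resolver(server_answer(i))[0] for i in range(9))
print(mounts)  # every client lands on the same address

# Failure mode 2: an even split of mounts over ports of unequal capacity.
capacity_mbps = {"10.0.0.1": 1000, "10.0.0.2": 1000, "10.0.0.3": 100}  # assumed
per_client_mbps = 30                                                   # assumed
even_mounts = {ip: 3 for ip in addresses}
utilization = {
    ip: even_mounts[ip] * per_client_mbps / capacity_mbps[ip] for ip in addresses
}
print(utilization)
```

With these assumed numbers, the sorted list sends all nine clients to one address, and the even split leaves the slower port at 90% utilization while the faster ports sit at 9%.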
BIND also provides a mechanism for using service (“SRV”) records to do load balancing, including the ability to specify statically the probability of returning a given node. Specifically, an SRV record can associate a hostname with a priority value that represents the priority of the target host, and a weight value that represents a relative weight for records with the same priority. However, SRV records cannot be dynamically updated in response to a given load. Furthermore, at present they are an experimental feature that is not used to resolve the DNS requests that most users issue.
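A simplified sketch of priority-and-weight selection follows. It assumes the usual SRV convention that a lower priority value is preferred, and it approximates the weighted choice with `random.choices` rather than reproducing any particular resolver's procedure. The record values and host names are hypothetical.

```python
import random
from typing import NamedTuple


class SrvRecord(NamedTuple):
    priority: int  # lower value is preferred
    weight: int    # relative share among records of equal priority
    target: str


def pick_target(records, rng):
    """Pick a target: best priority first, then weighted random choice."""
    best = min(r.priority for r in records)
    candidates = [r for r in records if r.priority == best]
    weights = [r.weight for r in candidates]
    if sum(weights) == 0:
        # All weights zero: fall back to a uniform choice.
        return rng.choice(candidates).target
    return rng.choices(candidates, weights=weights, k=1)[0].target


records = [
    SrvRecord(10, 75, "node1.example.com"),
    SrvRecord(10, 25, "node2.example.com"),
    SrvRecord(20, 100, "node3.example.com"),  # ignored while priority 10 exists
]

rng = random.Random(0)  # seeded so the run is repeatable
picks = [pick_target(records, rng) for _ in range(10_000)]
share_node1 = picks.count("node1.example.com") / len(picks)
print(round(share_node1, 2))
```

The weights here are fixed in the record set, which is the limitation noted above: nothing in this scheme changes them when the load on a node changes.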
Isilon Systems provides a DNS with a feature called SmartConnect, which is described as having the ability to perform client connection load balancing. The SmartConnect approach is understood to maintain a single DNS server within a cluster. As such, if the node operating the DNS fails, then the name service fails. Also, the ability to handle increased DNS load does not scale with the addition of more nodes to the cluster. Further, this approach is partially based on returning round-robin results which, depending on the implementation, can necessitate a large amount of communication between different nodes in a cluster.
Another known load-balancing name server is “lbnamed”. With lbnamed, weights are applied to IP addresses and are dynamically updated based on loading. The lowest weighted IP address is always returned. However, the lbnamed solution also uses a single DNS server to adjust the weights after each request, which undesirably provides a single point of failure. In addition, the load calculation used by lbnamed makes weighting decisions based on parameters such as load averages, total users, unique users, boot time, current time, etc. These parameters may be suitable for balancing workstation utilization, but they are not suitable for balancing individual port utilization.
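The lowest-weight selection described above can be sketched as follows. The weights and the fixed increment applied after each request are hypothetical; the actual weight calculation uses the load parameters listed in the text and is not reproduced here.

```python
def lowest_weighted(weights):
    """Return the address whose current weight is smallest."""
    return min(weights, key=weights.get)


# Hypothetical weights held by the single central name server.
weights = {"10.0.0.1": 40, "10.0.0.2": 15, "10.0.0.3": 60}

first = lowest_weighted(weights)

# The central server adjusts the weight after each request.
# The +30 increment is an assumed value for illustration only.
weights[first] += 30

second = lowest_weighted(weights)
print(first, second)
```

Because all weight state lives in one process, every request must pass through that process, which is the single point of failure noted above.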
Another known DNS server is “TinyDNS”. The TinyDNS server returns a random list of eight servers that can fulfill a request. Although the selection is random, TinyDNS does not allow individual IP addresses to be weighted. The decision to return an IP address is binary, i.e., it is returned or it is not.
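An unweighted random selection of this kind can be sketched as below. This is a model of the behavior described, not TinyDNS code; the function `random_answer`, the twelve addresses, and the handling of pools smaller than eight are assumptions.

```python
import random


def random_answer(addresses, rng, limit=8):
    """Return a random, unweighted selection of at most `limit` addresses."""
    # Assumption: a pool smaller than the limit is returned in full.
    return rng.sample(addresses, k=min(limit, len(addresses)))


addresses = [f"10.0.0.{i}" for i in range(1, 13)]  # twelve hypothetical IPs
rng = random.Random(0)  # seeded so the run is repeatable

answer = random_answer(addresses, rng)
print(answer)
```

Each address is either in the answer or not, and every address has the same chance of being included, so a heavily loaded port is as likely to be returned as an idle one.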