Modern businesses rely almost exclusively on electronic data storage for preserving and utilizing their necessary business operating information. Electronic data storage is useful only if the data can be reliably preserved and quickly accessed. If the data cannot be preserved against inadvertent loss, the available information may not be accurate and cannot be used to operate the business in a trustworthy manner. Quickly accessing accurate data is also essential because the pace of modern business requires immediate answers and responses in order to be effective. Thus, both reliable preservation of the data and immediate access to the data are essential.
Modern mass data storage systems and methods recognize the essential nature of reliable preservation and immediate access to the data. The typical approach to ensuring both characteristics is to employ functional features generally referred to as redundancy. In the context of mass data storage, redundancy involves the capability of accessing or creating a duplicative copy of the data, should some unforeseen event occur which makes a primary copy of the data unreliable or inaccessible, and doing so relatively quickly.
Redundancy is incorporated in essentially every level and aspect of modern mass data storage systems. For example, an entirely separate mass data storage system which maintains a secondary duplicate copy of the data may be established at a remote site, with frequent updates to the duplicate copy to reflect changes in the primary copy of the data at the primary site. Under these circumstances, if a catastrophic event should occur at the primary site, the remote site supplies the services that would normally be supplied by the primary site. As another example, the data storage computers (servers or filers) which provide access to the data storage units, typically hard disk drives (disks) or solid-state memories, and which perform data read and data write operations, are organized in such a way that if one server fails, another server may rapidly assume responsibility for performing the read and write operations to the data storage units connected to the failed server. This arrangement of paired-together servers is sometimes referred to as a server failover pair. As a further example, the server failover pairs may be organized together in clusters containing multiple servers and server failover pairs. The servers in the cluster are interconnected through networks both internal to the cluster and external to the cluster which allow the servers in the cluster to communicate with other servers in the cluster, and thereby obtain rapid access to data which would otherwise be unavailable in the event of a failure.
Redundancy is also important in communicating with a mass data storage system. A user operates a user computer, typically referred to as a client, and the client communicates requests for data services to a server of the mass data storage system. The server responds to the client requests by performing data read and write operations and then responds to the client. In this manner the server serves the client by responding to the client requests.
In order to perform data service transactions in a client-server relationship, there must be a reliable communication path between the client and server. The client and the server are almost always located remotely from one another. The communication path may be relatively short, for example within an office building, but it typically spans a much greater distance, for example across a city, state or country, or even between different continents. The typical client-server communication path is through a computer communication network, such as a local area network if the client and server are located relatively close to one another, or over a public or private access computer communication network, such as the Internet, if the client and server are located at substantially greater geographic distances.
Data communications between computers through the computer communication network rely on the use of internet protocol (IP) addresses. Each computer has an IP address associated with it. Use of the IP addresses assures that the communications are addressed to and received by the two computers which are the participants in the communication. Each computer includes a network interface adapter, commonly referred to as a port, through which the messages or packets are communicated over the computer communication network. This network interface adapter transmits the IP address of the computer sending the communication, accepts only those communications addressed to the IP address of the receiving computer, and ignores all communications that are not addressed to the receiving computer. The network interface adapter is a hardware component to which a cable or other physical medium is attached. The signals from the computer communication network are received and transmitted on the cable.
One redundancy concern in mass data storage systems is a possibility that the cable through which the signals are transmitted and received by the network interface adapter will become severed, disconnected, or otherwise nonconductive or unavailable due to some unforeseen event. Under such circumstances, network communications between the client and the server become impossible since there is no signal path to or from the network interface adapter. To address this concern in mass data storage systems, techniques have been developed to move the IP address from the network interface adapter connected to the failed communication path to another network interface adapter which is connected to a fully functional communication path. Moving the IP address from one interface adapter to another allows continued client-server communication due to the redundancy capabilities among the servers in the cluster, even when the original communication path has become inoperative.
The ability to move the IP address between different network adapters is typically referred to as transferring a virtual interface (VIF). The virtual aspect arises because the IP address is not exclusively associated with a single network interface adapter. In the context of moving IP addresses to be hosted by different and functional network interface adapters, it is also typical to refer to the connections to the network interface adapters as ports. Thus, a VIF is the combination of an IP address and a port which hosts that IP address. In the case of a VIF, the IP address does not change, but the port hosting that IP address may change, thereby changing the VIF.
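The relationship between an IP address and its hosting port can be illustrated with a minimal Python sketch. The names `Port`, `Vif`, and `move_vif`, and the example address and port labels, are hypothetical and chosen only for illustration; they are not taken from any actual storage system.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Port:
    """A network interface adapter, identified by its server and a local name."""
    server: str
    name: str


@dataclass(frozen=True)
class Vif:
    """A virtual interface: an IP address together with the port hosting it."""
    ip_address: str
    port: Port


def move_vif(vif: Vif, new_port: Port) -> Vif:
    """Return a VIF with the same IP address hosted on a different port."""
    return replace(vif, port=new_port)


# Hypothetical example: the IP address stays fixed while the hosting port changes.
original = Vif("192.0.2.10", Port("server-a", "e0a"))
moved = move_vif(original, Port("server-b", "e0c"))
```

In this sketch, `moved.ip_address` equals `original.ip_address`, while `moved.port` differs, which is the sense in which the VIF changes although the IP address does not.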
The aspects of moving the VIF from one port or server to another port or server in a mass data storage system are described in the above-identified US patent applications. The technique described in these patent applications involves manually writing a set of failover configuration rules. Under circumstances where a port is recognized as inoperative, the VIF associated with the inoperative port is transferred to another operative port of a server in the cluster. The failover configuration rules define the manner of attempting a transfer to other ports. If none of the ports named in the failover configuration rules is available, the VIF cannot be transferred.
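The rule-driven transfer described above can be sketched as an ordered lookup. This is only an illustrative sketch under assumed names: `select_failover_port`, the `rules` mapping, and the port labels are hypothetical, and the referenced patent applications may structure their rules differently.

```python
from typing import Callable, Dict, List, Optional


def select_failover_port(
    failed_port: str,
    rules: Dict[str, List[str]],
    is_operative: Callable[[str], bool],
) -> Optional[str]:
    """Try each configured failover target in order; return the first operative one.

    Returns None when no configured target is operative, in which case the
    VIF cannot be transferred.
    """
    for candidate in rules.get(failed_port, []):
        if is_operative(candidate):
            return candidate
    return None


# Hypothetical manually written rules: each port lists its targets in order.
rules = {"a:e0a": ["a:e0b", "b:e0a", "b:e0b"]}
down = {"a:e0a", "a:e0b"}  # ports currently inoperative (assumed for the example)

target = select_failover_port("a:e0a", rules, lambda p: p not in down)
```

Here `target` is `"b:e0a"`, the first listed target that is still operative. If every listed target were down, or the failed port had no rule at all, the function would return `None`.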
The failover configuration rules previously required manual programming for each port of each server of the mass data storage system. In modern mass data storage systems, there are multiple servers in a cluster, e.g. 24 servers, and each of the servers may have multiple ports, e.g. 16, associated with it. Thus in this example, up to 384 ports (24×16) may be available to host IP addresses. However, to assure that the failover will occur, the failover configuration rules must be specifically programmed, for each of these 384 ports, to specify the ports to which that port may fail over. The failover configuration rules are specific programmed instructions which define each failover port and the sequence to be followed in attempting to fail over to each of those ports.
To assure redundancy, it is not unusual for each port to have as many as 13 separate ports to which it may fail over, and under the circumstances of this example, almost 5000 (24×16×13 = 4992) separate failover rule instructions must be programmed for the mass data storage system. An individual person known as the system administrator must manually program these failover instructions. This amount of programming is labor-intensive and tedious, and consequently prone to mistakes. However, without this manual programming, VIF failover redundancy is not possible or fully effective.
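The arithmetic behind this example can be checked with a few lines of Python. The constant names are hypothetical; the values are the ones used in the example above.

```python
SERVERS = 24            # servers in the example cluster
PORTS_PER_SERVER = 16   # ports on each server
TARGETS_PER_PORT = 13   # failover targets configured for each port

# Every port on every server can host an IP address.
total_ports = SERVERS * PORTS_PER_SERVER

# Each port needs one rule instruction per failover target.
total_rules = total_ports * TARGETS_PER_PORT

print(total_ports, total_rules)
```

The count grows with the product of all three quantities, so adding servers, ports, or failover targets each increases the number of instructions the system administrator must write by hand.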
Another difficulty associated with individually programming all of the failover configuration rules is that changes in the size of the mass data storage system, such as adding servers or server failover pairs to the cluster, require additional programming of failover configuration rules to accommodate failovers to and from the added servers. The pre-existing failover configuration rules must be modified to interact with the newly-added servers, thereby requiring some degree of change to all of the pre-existing failover configuration rules. Otherwise, delays, restrictions or bottlenecks in reading and writing data would occur.