1. Field of the Invention
The present invention relates to file server systems, and in particular, to a method and apparatus for providing a multi-cast based scheme to provide a single system image in a cluster-based network-attached file server.
2. Description of the Related Art
The ability to manage massive amounts of information in large scale databases has become increasingly important in recent years. Data analysts are faced with ever larger data sets, some of which measure in gigabytes or even terabytes. Further, the increase of web traffic to many popular web sites has led to the design of cluster-based web servers to manage the data and traffic.
In a cluster-based server environment, two or more servers that work together are clustered. Clustering provides a way to improve throughput performance through proper load balancing techniques. Clustering generally refers to multiple computer systems or nodes (each comprising a central processing unit (CPU), memory, and an adapter) that are linked together in order to handle variable workloads or to provide continued operation in the event one computer system or node fails. Each node in a cluster may be a multiprocessor system itself. For example, a cluster of four nodes, each with four CPUs, would provide a total of 16 CPUs processing simultaneously.
In a clustered environment, the data and management tasks may be distributed across multiple nodes or servers that may communicate with each other. Each node maintains a data storage device, processor, etc. to manage and access a portion of the data that may or may not be shared.
Further, to address high demands on data sharing and management, some systems utilize network-attached file servers (NAFS). NAFSes allow easy file sharing through standard protocols, such as NFS (network file system) and CIFS (common internet file system). The consolidation of a large amount of data in a file server simplifies management tasks. Furthermore, NAFSes are often designed with a single purpose, i.e., file serving. Accordingly, NAFSes can potentially provide a high level of performance, reliability, and availability. However, most of today's commercial NAFS systems are built from individual workstations. Limited hardware resources present an impediment to system scalability.
NAFSes may be combined in a cluster to provide a cluster-based NAFS system. A cluster-based NAFS requires a cluster-based file system (or parallel file system). Such a file system provides a single naming space to allow file accesses using a global naming structure, such as an AFS (a distributed file system)-like directory structure. Additionally, to attain high performance, a parallel file system makes efficient use of parallelism within a cluster. The prior art may attempt to provide such an NAFS system. For example, in an NAFS environment, two-node cluster file servers for fail-over purposes may be provided. However, such a two-node cluster system does not provide a single system image.
Several existing single system image (SSI) solutions are used in a cluster-based web server environment. For example, the use of network address translation (NAT) provides one potential solution. The growing number of internet hosts may eventually cause a shortage of unique IP addresses. NAT is an Internet technology designed to solve this problem. NAT allows multiple hosts connected on a private network to share the same IP address.
A range of possible SSI solutions are implemented and performed on the client-side of a transaction in a web environment. A direct way to achieve scalability and SSI is to instrument client side software to perform load balancing among cluster nodes. For instance, a client browser may be permitted to choose appropriate server nodes in the cluster without the client's knowledge. However, few companies provide such an option. Alternatively, an applet-based client side approach may be used to provide scalable accesses to servers. Under an applet-based approach, an applet runs on a client computer, collects load information, and carries out web accesses based on the gathered statistics. Such client-side approaches may provide high performance, but are not client-transparent.
Another range of possible SSI solutions are implemented and performed on the server-side of a web-server transaction. For example, one server-side SSI solution may use a round-robin DNS (domain name system) approach. Under a DNS approach, a DNS on the server side dynamically maps a cluster host name to different internet protocol (IP) addresses of the cluster nodes. The drawback of a DNS-based approach is that the name-to-IP translation may be cached in several name servers, so changes to the workload distribution can only be made rather statically.
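The round-robin DNS approach and its caching drawback can be illustrated with a minimal sketch. The class names, the host name cluster.example.com, and the 10.0.0.x addresses are hypothetical and chosen only for illustration; a real name server also involves record lifetimes and wire protocols that are omitted here.

```python
from itertools import cycle


class RoundRobinDNS:
    """Toy server-side DNS that rotates through the cluster node addresses."""

    def __init__(self, records):
        # records: cluster host name -> list of node IP addresses (illustrative)
        self._cycles = {name: cycle(addrs) for name, addrs in records.items()}

    def resolve(self, name):
        # Each lookup returns the next node address in rotation.
        return next(self._cycles[name])


class CachingResolver:
    """Toy intermediate name server that caches the first answer it receives."""

    def __init__(self, upstream):
        self._upstream = upstream
        self._cache = {}

    def resolve(self, name):
        # Only the first lookup reaches the round-robin server.
        if name not in self._cache:
            self._cache[name] = self._upstream.resolve(name)
        return self._cache[name]


dns = RoundRobinDNS({"cluster.example.com": ["10.0.0.1", "10.0.0.2", "10.0.0.3"]})

# Clients that query the round-robin server directly are spread over the nodes.
direct = [dns.resolve("cluster.example.com") for _ in range(4)]

# Clients behind a caching name server all receive the same cached node.
cached = CachingResolver(dns)
via_cache = [cached.resolve("cluster.example.com") for _ in range(4)]
```

In this sketch the direct lookups rotate across the three nodes, while every lookup through the caching resolver is pinned to a single node, which is why the resulting workload distribution is difficult to change dynamically.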
In another server-side web-server approach, a TCP (transmission control protocol) router may be used. Using a TCP router, one of the cluster nodes serves as the router and dispatcher of the network packets. Clients only see the router IP address. The client requests always arrive at the router first. The router dispatches each request to one of the other cluster nodes based on the observed workloads on each cluster node. When a cluster node replies to a client request, the node rewrites the network packet header with the router's address and sends the reply directly to the client without going through the router again. However, because every incoming request must pass through the router, the router may become a bottleneck in the processing of client requests.
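The TCP router flow can be sketched as follows, under simplifying assumptions. The address 192.0.2.1, the Packet, Node, and TcpRouter names, and the use of a simple request counter as the "observed workload" are all hypothetical; an actual TCP router operates on real packet headers and connection state.

```python
from dataclasses import dataclass

ROUTER_IP = "192.0.2.1"  # hypothetical address; the only one clients see


@dataclass
class Packet:
    src: str
    dst: str
    payload: str


class Node:
    def __init__(self, ip):
        self.ip = ip
        self.load = 0  # requests handled so far; stand-in for observed workload

    def handle(self, request):
        self.load += 1
        # Rewrite the header so the reply appears to come from the router,
        # and address it straight to the client.
        return Packet(src=ROUTER_IP, dst=request.src,
                      payload=f"reply from {self.ip}")


class TcpRouter:
    def __init__(self, nodes):
        self.nodes = nodes
        self.seen = 0  # every client request passes through the router

    def dispatch(self, request):
        self.seen += 1
        # Pick the node with the lowest observed workload.
        node = min(self.nodes, key=lambda n: n.load)
        # The reply is returned here only for convenience; in the scheme
        # described above the node sends it to the client directly.
        return node.handle(request)


nodes = [Node("10.0.0.1"), Node("10.0.0.2")]
router = TcpRouter(nodes)
replies = [router.dispatch(Packet("198.51.100.7", ROUTER_IP, "GET /"))
           for _ in range(2)]
```

The counter on the router grows with every request regardless of how many nodes are added, which illustrates why the router can limit scalability.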
In another possible server-side SSI solution (referred to as a routing-based dispatching scheme), a centralized dispatcher distributes workload to the cluster nodes. However, similar to the TCP router approach, the dispatcher can become a potential performance bottleneck. In another server-side SSI solution (referred to as a broadcast-based dispatching scheme), the routers between clients and the cluster nodes are configured to route client packets to the cluster nodes as Ethernet broadcast packets. A special device driver is installed on each cluster node to filter the client packets. The IP-level routing may increase router workload. Additionally, manual configuration of routers is required.
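The per-node filtering in a broadcast-based dispatching scheme can be sketched as below. The text above does not state how the device driver decides which packets to keep; this sketch assumes, purely for illustration, that each node keeps a packet when a checksum of the client address modulo the node count equals the node's own index. The function names are hypothetical.

```python
import zlib


def accepts(node_index, node_count, client_ip):
    """Filter run on every node: keep the packet only if it maps to this node.

    Assumption: ownership is decided by CRC-32 of the client address,
    which is deterministic across nodes and runs.
    """
    return zlib.crc32(client_ip.encode()) % node_count == node_index


def broadcast(client_ip, node_count):
    """Deliver one client packet to all nodes; return the nodes that keep it."""
    return [i for i in range(node_count)
            if accepts(i, node_count, client_ip)]


clients = ["198.51.100.7", "198.51.100.8", "203.0.113.5"]
owners = {ip: broadcast(ip, 4) for ip in clients}
```

Because every node applies the same deterministic rule, each broadcast packet is kept by exactly one node and dropped by the rest, so no central dispatcher is needed; the cost is that every node must inspect every packet.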
The above approaches provide SSI in a web server environment. However, such solutions have not been proposed or used in an NAFS environment. There are significant differences between a web server environment and an NAFS environment. First, the cluster nodes in a web server can be very loosely coupled, or even independent of each other, while the cluster nodes in a file server must be tightly coupled. This is because the web servers' workloads are mostly read-only. Accordingly, individual web server nodes can cache web pages aggressively and satisfy HTTP requests without worrying about cache-coherency problems.
A parallel file system, however, may need to exchange information fairly frequently among the nodes to ensure file cache-coherency. Further, parallel file systems require high-performance interconnects to connect cluster nodes. Such an interconnection requirement does not exist for cluster-based web servers.
Secondly, a cluster-based file server is expected to deliver much higher bandwidth than a cluster-based web server. For instance, Cisco's LocalDirector claims to deliver 24 MB/s throughput and handle 1,000,000 TCP connections at one time, while a 2-node Network Appliance filer can deliver around 60 MB/s throughput when tested with the industry-standard NetBench tests.
SSI solutions for a file server must consider such requirements. Accordingly, what is needed is a cluster-based network-attached file server system that provides a parallel file system capable of exchanging information frequently among its nodes, delivers information at a higher bandwidth than a cluster-based web server, and presents an SSI to a client.