1. Field of the Invention
The present invention is related to messaging between nodes in a storage area network and, more particularly, to nodes controlled by different operating systems.
2. Description of the Related Art
A storage area network (SAN) provides direct, high-speed physical connections, e.g., Fibre Channel connections, between multiple hosts and disk storage. The emergence of SAN technology offers the potential for multiple computer systems to have high-speed access to shared data. However, the software technologies that enable true data sharing are mostly in their infancy. While SANs offer the benefits of consolidated storage and a high-speed data network, existing systems cannot share that data as easily and quickly as they can access directly connected storage. Data sharing is typically accomplished using a network filesystem such as Network File System (NFS™ by Sun Microsystems, Inc. of Santa Clara, Calif.) or by manually copying files using File Transfer Protocol (FTP), a cumbersome and unacceptably slow process.
The challenges faced by a distributed SAN filesystem are different from those faced by a traditional network filesystem. For a network filesystem, all transactions are mediated and controlled by a file server. While the same approach could be transferred to a SAN using much the same protocols, that would fail to eliminate the fundamental limitations of the file server or take advantage of the true benefits of a SAN. The file server is often a bottleneck hindering performance and is always a single point of failure. The design challenges faced by a shared SAN filesystem are more akin to the challenges of traditional filesystem design combined with those of high-availability systems.
Traditional filesystems have evolved over many years to optimize the performance of the underlying disk pool. Data concerning the state of the filesystem (metadata) is typically cached in the host system's memory to speed access to the filesystem. This caching—essential to filesystem performance—is the reason why systems cannot simply share data stored in traditional filesystems. If multiple systems assume they have control of the filesystem and cache filesystem metadata, they will quickly corrupt the filesystem by, for instance, allocating the same disk space to multiple files. On the other hand, implementing a filesystem that does not allow data caching would provide unacceptably slow access to all nodes in a cluster.
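The corruption scenario described above can be illustrated with a minimal sketch. It assumes a toy disk with a free-block map and two hosts that each cache that map privately and write back without coordination. The names `SharedDisk`, `Host`, and `allocate` are hypothetical and are used only for illustration; they do not come from any product discussed here.

```python
class SharedDisk:
    """Toy shared disk: an on-disk free-block map plus a record of block owners."""

    def __init__(self, n_blocks):
        self.free = [True] * n_blocks
        self.owner = [None] * n_blocks


class Host:
    """Hypothetical host that caches filesystem metadata in its own memory."""

    def __init__(self, name, disk):
        self.name = name
        self.disk = disk
        # Private snapshot of the free-block map, taken once at mount time.
        self.cached_free = list(disk.free)

    def allocate(self, filename):
        # Pick the first block that the *cached* map believes is free.
        block = self.cached_free.index(True)
        self.cached_free[block] = False
        # Write back to the shared disk without consulting any other host.
        previous_owner = self.disk.owner[block]
        self.disk.owner[block] = (self.name, filename)
        self.disk.free[block] = False
        return block, previous_owner


disk = SharedDisk(n_blocks=8)
host_a = Host("A", disk)
host_b = Host("B", disk)  # both hosts mount before either one allocates

block_a, clobbered_a = host_a.allocate("a.dat")
block_b, clobbered_b = host_b.allocate("b.dat")

# Both hosts chose the same block; host B silently overwrote host A's allocation.
print(block_a, block_b, clobbered_b)
```

In this sketch both hosts select block 0, because neither cached map reflects the other host's allocation, so the same disk space ends up assigned to two files.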
Systems or software for connecting multiple computer systems or nodes in a cluster to access data storage devices connected by a SAN have become available from several companies. EMC Corporation of Hopkinton, Mass. offers HighRoad file system software for its Celerra™ Data Access in Real Time (DART) file server. Veritas Software of Mountain View, Calif. offers SANPoint, which provides simultaneous access to storage for multiple servers with failover and clustering logic for load balancing and recovery. Sistina Software of Minneapolis, Minn. has a similar clustered file system called Global File System™ (GFS). Advanced Digital Information Corporation of Redmond, Wash. has several SAN products, including CentraVision for sharing files across a SAN. As a result of mergers over the last few years, Hewlett-Packard Company of Palo Alto, Calif. has more than one cluster operating system offered by its Compaq Computer Corporation subsidiary, which use the Cluster File System developed by Digital Equipment Corporation in the TruCluster and OpenVMS Cluster products. However, none of these products is known to provide direct read and write over a Fibre Channel by any node in a cluster. What is desired is a method of accessing data within a SAN which provides true data sharing by allowing all SAN-attached systems direct access to the same filesystem. Furthermore, conventional hierarchical storage management uses an industry-standard interface called the Data Migration Application Programming Interface (DMAPI). However, if there are five machines, each accessing the same file, there will be five separate events, and there is nothing tying those DMAPI events together.
Some filesystems are used in networks that permit heterogeneous operating systems to be connected together. For example, Tivoli® SANergy™ enables multi-OS nodes to access a SAN. As another example, NFS™ uses a common wire format that is big endian to communicate between systems that may run Solaris™ from Sun Microsystems, Inc. on SPARC® processors, which are also big endian, or UNIX, LINUX or Windows® NT® on Intel® processors, which are little endian. As a result, each system has to convert from its internal data format to the common wire format. Typically, not much change is required by Solaris™ systems, since SPARC® processors are big endian. However, since Intel® processors are little endian, systems running on Intel® processors have to do a lot more work to convert to the common wire format. If a Solaris™ system is communicating with a system running on Intel® processors, conversion will be necessary. But if two systems running on Intel® processors are communicating with each other over an NFS™ network, both systems will convert to or from the common wire format, even though the receiving system might be able to easily understand the data from the transmitting system with little or no conversion if they were communicating directly.
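The cost of a common wire format can be sketched as follows, assuming a single 32-bit unsigned integer as the message payload. The helper names `send_via_wire` and `conversions_needed` are hypothetical, and the sketch uses Python's standard `struct` module only to model byte order; it is not a description of the actual NFS™ encoding.

```python
import struct

BIG, LITTLE = ">", "<"   # struct byte-order prefixes
WIRE_ORDER = BIG         # assumption: the common wire format is big endian


def send_via_wire(value, sender_order, receiver_order):
    """Model one 32-bit value passing sender memory -> wire -> receiver memory."""
    in_memory = struct.pack(sender_order + "I", value)
    # Sender converts from its internal layout to the common wire format.
    wire = struct.pack(WIRE_ORDER + "I", struct.unpack(sender_order + "I", in_memory)[0])
    # Receiver converts from the wire format to its own internal layout.
    received = struct.pack(receiver_order + "I", struct.unpack(WIRE_ORDER + "I", wire)[0])
    return in_memory, wire, received


def conversions_needed(sender_order, receiver_order, use_common_wire):
    """Count byte-order conversions per value for a sender/receiver pair."""
    if use_common_wire:
        return int(sender_order != WIRE_ORDER) + int(receiver_order != WIRE_ORDER)
    # Direct exchange: convert only if the two hosts actually differ.
    return int(sender_order != receiver_order)


in_memory, wire, received = send_via_wire(0x01020304, LITTLE, LITTLE)
print(in_memory.hex(), wire.hex(), received.hex())
print(conversions_needed(LITTLE, LITTLE, use_common_wire=True))
print(conversions_needed(LITTLE, LITTLE, use_common_wire=False))
```

For two little-endian hosts, the value is byte-swapped once by the sender and once by the receiver even though the in-memory layouts at both ends are identical, which is the redundant work the paragraph above describes.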
In addition, typical SAN filesystems perform data conversion at a fairly high layer of the communication path between processes. This has an advantage in that there may be more information available regarding the context of messages that need to be converted. However, it also disperses the program code responsible for data conversion, which makes the code difficult to maintain.