Many of today's network attached storage (NAS) systems are being managed by storage virtualization systems. Storage virtualization systems monitor the data objects stored on a first NAS system (“source,” “source server” or “src”), and can synchronize, migrate, and/or copy the data objects (“storage virtualization operations”) to a second NAS system (“destination,” “destination server” or “dst”). The benefit of using a storage virtualization system for performing such operations is that it does not require bringing either NAS system offline. In other words, a client computer may still access data objects on the source or destination NAS system while the storage virtualization system is performing its operations. The storage virtualization system may be placed as an intermediary between the client computer that generates or accesses the data objects, and the source NAS server that stores the data objects. In some instances, when the client computer requests access to the source NAS server, the storage virtualization system may cause the client computer to be redirected to the destination NAS server without the client computer's intervention or knowledge. As such, the storage virtualization system is considered transparent.
There are many reasons for performing storage virtualization operations. For example, an organization using one NAS system may decide to incorporate a second NAS system in order to reduce load on the first NAS system. A storage virtualization system can facilitate migration of data objects from the source NAS system to a destination NAS system. In another example, an organization may want to keep a second NAS system as a real-time backup of a first NAS system. A traditional backup system would require locking out user access in order to ensure that the data objects are not accessed or altered during backup. Storage virtualization does not require such locking out, thereby reducing downtime and maintaining productivity. One will appreciate that there are a number of other ways to benefit from storage virtualization.
Client computers and NAS systems typically communicate with one another over a network, though they may also be directly connected. Effective communication requires that the instructions and commands passed from the client computer to its associated NAS system are recognized and understood. One skilled in the art will appreciate that client computers and NAS systems may communicate using one of many network file system protocols. Each protocol has its own specifications, instructions and conventions.
Two of the most commonly used protocols are the Network File System (“NFS”) protocol and the Common Internet File System (“CIFS”) protocol. In a typical NAS system, an NFS-compatible client computer (such as a UNIX, Linux, Mac OS or other UNIX-variant computer) will communicate with an NFS NAS system using NFS commands, and a CIFS-compatible client computer (such as a Windows® computer) will communicate with a CIFS NAS system using CIFS commands. Generally, NFS computers do not interact with CIFS computers, and CIFS computers do not interact with NFS computers. As a result, it is not common for an NFS-compatible client computer to be associated with a CIFS NAS system, or for a CIFS-compatible client computer to be associated with an NFS NAS system.
There are, however, some NAS systems known as “mixed-protocol” or “NFS+CIFS” NAS systems that are capable of recognizing both the NFS and CIFS protocols. As a result, both NFS-compatible client computers and CIFS-compatible client computers may associate with a mixed-protocol NAS system. Despite these capabilities, NFS+CIFS NAS systems will typically adhere to one protocol or the other, but not both. If data objects are stored on a mixed-protocol NAS system using both protocols, NFS data objects and CIFS data objects may be separated to prevent confusion.
In order to effectively perform its operations, a storage virtualization system must be able to use the proper protocol when interacting with its associated NAS systems. Presently, a storage virtualization system handling data objects between an NFS client computer and an NFS NAS system will use the NFS protocol. Similarly, a storage virtualization system handling data objects between a CIFS client computer and a CIFS NAS system will use the CIFS protocol. Because the two protocols are not cross-compatible, most storage virtualization systems used today will use only the protocol of their associated NAS systems. A storage virtualization system may have an NFS-compatible aspect and a CIFS-compatible aspect. When a storage virtualization system is associated with or connected to a NAS system, it must be configured to use the protocol of the NAS system. However, as organizations incorporate NAS systems of differing protocols, it becomes difficult to configure the storage virtualization system for each associated NAS system, especially when the storage virtualization system is associated with mixed-protocol NFS+CIFS NAS systems. As a result, a storage virtualization system placed as an intermediary between an NFS+CIFS NAS system and the NFS-compatible and CIFS-compatible client computers associated with it will encounter problems arising from the differences between the protocols.
There are many known differences between the NFS and CIFS protocols. One known difference is the treatment of symbolic links, also known as “soft links” or “symlinks.” A symlink is a certain file type that points or forwards to the location of another associated file or directory (called the “symlink target” or “target”). A symlink target may reside in the same directory, file system or computer as the symlink, or a symlink may point to a target stored in another mounted file system. One skilled in the art will appreciate that a symlink is not directly “linked” to its target, but rather, the symlink contains or is associated with the path address for its target (called “target path information” or “target path location”). The target path information may be an absolute path address, a relative path address, or any other type of data that may represent a path address. The target path information may be stored as metadata or other data associated with the symlink. When a user removes a symlink, the target to which it pointed remains unaffected. However, moving or deleting the symlink's target will “break” the symlink, since the location to which it pointed will no longer exist. If a new symlink target replaces another symlink target but uses the same name, the symlink will point to the new target. The symlink makes no distinction between old and new targets, so long as the target's name and path address are the same. A person having ordinary skill in the art will appreciate that the process of forwarding access from the symlink to the symlink's target is known as “symlink expansion.”
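The symlink behavior described above can be observed on a local UNIX-like file system. The following is a minimal sketch, not part of any NAS or storage virtualization product. The function name `demo_symlink_lifecycle` and the file names `report.txt` and `latest` are hypothetical, chosen only for illustration, and the sketch assumes a platform on which `os.symlink` is permitted.

```python
import os
import tempfile


def demo_symlink_lifecycle():
    """Record how a symlink behaves as its target is deleted and replaced."""
    results = {}
    with tempfile.TemporaryDirectory() as root:
        target = os.path.join(root, "report.txt")  # hypothetical target name
        link = os.path.join(root, "latest")        # hypothetical symlink name

        with open(target, "w") as f:
            f.write("old contents")

        # The symlink stores only a path string (here a relative path),
        # i.e. the "target path information".
        os.symlink("report.txt", link)
        results["target_path_info"] = os.readlink(link)

        # Opening the symlink expands it and reads the target's contents.
        with open(link) as f:
            results["read_via_link"] = f.read()

        # Deleting the target breaks the symlink: the symlink object still
        # exists, but the path it names no longer resolves.
        os.remove(target)
        results["link_still_present"] = os.path.islink(link)
        results["resolves_after_delete"] = os.path.exists(link)

        # A new file with the same name makes the unchanged symlink work again.
        with open(target, "w") as f:
            f.write("new contents")
        with open(link) as f:
            results["read_after_replace"] = f.read()

        # Removing the symlink leaves the target unaffected.
        os.remove(link)
        results["target_survives"] = os.path.exists(target)
    return results
```

Calling `demo_symlink_lifecycle()` returns a dictionary of observations. The symlink never tracks the identity of its target, only the stored path, which is why a replacement file with the same name is indistinguishable from the original as far as the symlink is concerned.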
Symlinks are unique to NFS-compatible UNIX-based client computers and NFS NAS systems. If the symlink is located on a UNIX client computer, the UNIX operating system is responsible for expanding a symlink and forwarding access to the target. On the other hand, if the symlink is located on a NFS NAS system or NFS+CIFS NAS system, then the NAS system or NAS system software may perform symlink expansion. One will also appreciate that a symlink may appear as another file or data object when displayed in a file directory, but may have a special icon, symbol, file attribute or property associated with it to denote that it is a symlink, rather than a data file with data content. On the other hand, CIFS-compatible client computers and CIFS NAS systems do not inherently recognize or use symlinks. Symlinks are not part of the CIFS-compatible client computer operating system and so symlinks do not natively exist on these types of computers. As a result, symlink expansion will not be performed on a CIFS-compatible client computer or a CIFS NAS system.
Because symlinks are recognized in NFS and UNIX, storage virtualization operations between a NFS or UNIX-based client computer, source server and destination server will not break symlinks. Thus, the storage virtualization system will not interfere with the client computer's or NAS system's ability to perform symlink expansion. Additionally, if the storage virtualization system is performing storage virtualization operations between an NFS source server and an NFS destination server, no special steps are necessary to ensure that symlinks and symlink targets are transferred from the source to the destination.
Conversely, issues may arise when a mixed protocol NAS system stores symlinks and their associated targets. Because a mixed protocol NAS system may store data objects using the NFS protocol as well as data objects stored using the CIFS protocol, a mixed protocol NAS system may contain symlinks created by an associated NFS-compatible client computer. This does not present a problem for the NFS-compatible client computer, because symlinks are inherently recognized and may be expanded by the NFS-compatible client computer's operating system. However, a CIFS-compatible client computer associated with the mixed protocol NAS system may have issues with these symlinks. To a CIFS-compatible client computer accessing the mixed protocol NAS system, symlinks may appear in the NAS system's file directory, but they may appear to be regular data objects. A CIFS-compatible client computer has no inherent way to distinguish a symlink from any other data file with data content. Thus, when the CIFS-compatible client computer tries to access a symlink stored on the mixed protocol NAS system, the symlink will not be expanded by the client computer's operating system because symlink expansion is not normally performed by this type of operating system.
To compensate for the CIFS-compatible client computer operating system's inability to perform symlink expansion, the mixed protocol NAS system may perform symlink expansion, because the NFS aspect of the mixed protocol NAS system can recognize and expand symlinks. In other words, should the CIFS-compatible client computer request access to a data object that is actually a symlink, the NFS aspect of the NFS+CIFS NAS system may be able to determine whether the requested data object is a symlink. It further may be able to discover the symlink's target path information and forward access to the target. This operation may be automatic and transparent such that the CIFS-compatible client computer is not made aware that the NAS system has expanded the symlink.
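Server-side symlink expansion of this kind can be sketched with a local file system standing in for the NAS system. This is only an analogy under stated assumptions: `is_symlink`, `serve_read` and `demo_serve_read` are hypothetical names, no NFS or CIFS wire traffic is involved, and only one level of expansion is performed explicitly (any further symlinks in the chain are left to the operating system when the file is opened).

```python
import os
import stat
import tempfile


def is_symlink(path):
    """Inspect the object itself; lstat does not follow symlinks."""
    return stat.S_ISLNK(os.lstat(path).st_mode)


def serve_read(path):
    """Hypothetical server-side read on behalf of a symlink-unaware client.

    Returns the data and a flag saying whether expansion took place. The
    flag is for illustration only; a transparent server would not expose it.
    """
    expanded = False
    if is_symlink(path):
        target_info = os.readlink(path)
        # Relative target path information is interpreted relative to the
        # directory containing the symlink; an absolute path is used as-is.
        path = os.path.normpath(
            os.path.join(os.path.dirname(path), target_info))
        expanded = True
    with open(path, "rb") as f:
        return f.read(), expanded


def demo_serve_read():
    """Read the same data once through a symlink and once directly."""
    with tempfile.TemporaryDirectory() as root:
        data_path = os.path.join(root, "data.bin")  # hypothetical names
        alias_path = os.path.join(root, "alias")
        with open(data_path, "wb") as f:
            f.write(b"payload")
        os.symlink("data.bin", alias_path)
        return serve_read(alias_path), serve_read(data_path)
```

In this sketch the requesting side receives the same bytes in both cases and needs no knowledge of symlinks, which mirrors the transparency described above.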
As noted previously, a storage virtualization system tasked with performing storage virtualization operations on a NAS system will be configured to use the protocol of the associated NAS system and client computer. If the client computer is CIFS-compatible, and the NAS system uses the CIFS protocol, then the storage virtualization system will be configured to use the CIFS protocol. If the client computer is CIFS-compatible, and the NAS system uses the NFS+CIFS protocol, then the storage virtualization system will still be configured to use the CIFS protocol, because it is likely that data objects stored on the NFS+CIFS NAS system will have been stored using the CIFS protocol. However, data objects may also have been stored on the NFS+CIFS NAS system using the NFS protocol, since the NFS+CIFS NAS system may have previously been associated or may still be associated with a NFS-compatible client computer. As a result, the NFS+CIFS NAS system may contain symlinks.
In such a case, the NFS+CIFS NAS system may not have any issue with expanding symlinks, since as described previously, the NFS+CIFS NAS system can use the NFS protocol to expand symlinks for a CIFS-compatible client computer. However, if the storage virtualization system tasked with managing the NFS+CIFS NAS system is configured to use the CIFS protocol, it will not recognize which data objects are symlinks, and which data objects are data files with data content. As a result, when the storage virtualization system uses the CIFS protocol to migrate or copy data objects from a NFS+CIFS source server to a NFS+CIFS destination server, the symlinks may not properly migrate. Similarly, if the storage virtualization system uses the CIFS protocol to synchronize data objects between a NFS+CIFS source server and a NFS+CIFS destination server, symlinks may not properly synchronize. The symlink target path information stored as metadata or file attributes associated with the symlinks will not migrate from the source to the destination or properly synchronize between the source and destination, because the CIFS protocol used by the storage virtualization system will not recognize this information. One will appreciate that a direct download of data objects from one NFS+CIFS server to another NFS+CIFS server may preserve symlinks, since both of the servers will be able to recognize and expand symlinks. However, since such a download would require taking both servers offline, and since organizations prefer to use a storage virtualization system to perform such operations, it becomes important that the storage virtualization system recognize and handle symlinks on the NFS+CIFS NAS system.
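The loss of target path information during a symlink-unaware transfer can be illustrated with a local copy operation. The sketch below is an analogy only, not the method of any particular storage virtualization system: `copy_unaware`, `copy_aware` and `demo_migration` are hypothetical names, and local directories stand in for the source and destination servers.

```python
import os
import shutil
import tempfile


def copy_unaware(src, dst):
    """Copy by reading through the object, as a tool with no notion of
    symlinks would: the destination receives the target's contents."""
    shutil.copyfile(src, dst, follow_symlinks=True)


def copy_aware(src, dst):
    """Recreate a symlink at the destination with the same target path
    information; copy ordinary files by content."""
    if os.path.islink(src):
        os.symlink(os.readlink(src), dst)
    else:
        shutil.copyfile(src, dst)


def demo_migration():
    """Copy one symlink both ways and report what arrives."""
    with tempfile.TemporaryDirectory() as root:
        src_dir = os.path.join(root, "src")
        dst_dir = os.path.join(root, "dst")
        os.mkdir(src_dir)
        os.mkdir(dst_dir)

        # Source holds one data file and one symlink pointing to it.
        with open(os.path.join(src_dir, "data.txt"), "w") as f:
            f.write("payload")
        os.symlink("data.txt", os.path.join(src_dir, "alias"))

        copy_aware(os.path.join(src_dir, "data.txt"),
                   os.path.join(dst_dir, "data.txt"))
        copy_unaware(os.path.join(src_dir, "alias"),
                     os.path.join(dst_dir, "alias_unaware"))
        copy_aware(os.path.join(src_dir, "alias"),
                   os.path.join(dst_dir, "alias_aware"))

        return {
            "unaware_is_link": os.path.islink(
                os.path.join(dst_dir, "alias_unaware")),
            "aware_is_link": os.path.islink(
                os.path.join(dst_dir, "alias_aware")),
            "aware_target": os.readlink(
                os.path.join(dst_dir, "alias_aware")),
        }
```

After the unaware copy, the destination holds an ordinary file with duplicated contents and no record that the source object was a symlink. The aware copy preserves both the object type and its target path information.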
What is therefore needed is a way to ensure that symlinks are properly identified on a mixed protocol NFS+CIFS NAS system. What is further needed is a way to ensure that any symlinks present on the source NAS system are properly migrated, copied or synchronized to a destination NAS system.