Both supply and demand for computer disk storage have increased sharply in the last decade. Computer hard-disk technology and the resulting storage densities have grown even faster than in the semiconductor field, at times exceeding Moore's Law projections. Despite application-program bloat and wider use of large multimedia files, disk drive storage densities have been able to keep up.
Widespread use of networking has allowed for greater file sharing. Rather than store data files on individual users' personal computers (PCs), many companies and organizations prefer to have important files stored in a centralized data store. When stored centrally, important data can be regularly maintained and archived while remaining available for sharing among many users.
Storage Area Networks (SANs) are widely used by large corporations as such a centralized data store. FIG. 1 shows a SAN. SAN bus 22 is a high-speed bus such as FibreChannel that connects to many disk arrays 20, 20′, 20″. Often a Redundant Array of Independent Disks (RAID) is used to mirror data stored on the disks for greater reliability. RAID controllers 18, 18′, 18″ can be inserted between disk arrays 20, 20′, 20″ and SAN bus 22.
Clients such as application layer 10 generate disk accesses through client operating system 12, which in turn uses file system 14 to access stored data. FibreChannel controller 16 receives requests and performs physical transfers of block data over SAN bus 22.
SAN bus 22 transfers data as blocks rather than as files. File system 14 converts requests for data files into the block addresses needed for accessing the data on disk arrays 20, 20′, 20″. An individual file can be stored on several blocks of data that are stored on any of disk arrays 20, 20′, 20″.
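The file-to-block conversion described above can be illustrated with a minimal sketch. The block size, the file name, the array labels, and the `block_map` layout are all assumptions chosen for illustration; they are not taken from any particular file system.

```python
# Illustrative sketch only: a toy version of the file-to-block translation.
# BLOCK_SIZE, block_map, and the array labels are hypothetical.
BLOCK_SIZE = 4096  # assumed block size in bytes

# Assumed per-file block list: each entry is (disk_array_label, block_number).
# Blocks of one file may reside on different disk arrays.
block_map = {
    "report.doc": [("20", 17), ("20'", 4), ("20''", 92)],
}


def blocks_for_range(name, offset, length):
    """Return the (array, block) addresses covering bytes [offset, offset + length)."""
    if length <= 0:
        return []
    blocks = block_map[name]
    first = offset // BLOCK_SIZE                # index of first block touched
    last = (offset + length - 1) // BLOCK_SIZE  # index of last block touched
    return blocks[first:last + 1]
```

In this sketch a read that straddles a block boundary, such as `blocks_for_range("report.doc", 4000, 200)`, yields two block addresses that happen to sit on two different arrays.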
Since SAN bus 22 performs block transfers, there is no high-level file-system controller on disk arrays 20, 20′, 20″. Instead, disk arrays 20, 20′, 20″ act as network-connected disk drives. SAN bus 22 can operate with special protocols that are designed to maximize data transfer rates of lower-level blocks of data. Thus SAN bus 22 is a specialized bus. SANs tend to be very expensive and are usually not cost-effective for smaller organizations.
Another widespread storage technology is Network Attached Storage (NAS). NAS is less expensive than SAN and uses standard Transmission Control Protocol/Internet Protocol (TCP/IP) network buses rather than a specialized SAN bus. NAS appliances are usually single-box devices such as LINUX or Windows systems that can easily be connected to a TCP/IP network.
FIG. 2 shows use of NAS and the NAS creep problem. Network bus 32 is a standard TCP/IP network bus. Client application layer 10 requests file accesses through operating system 12, which uses network-file-system (NFS) 28 to generate file request messages that are encapsulated by TCP/IP and any lower-level network protocols and sent over network bus 32 to NAS appliance 21.
NAS appliance 21 processes the message received over network bus 32 using server NFS 26, which sends the request to file system 14. File system 14 looks up the file name and finds the file handle, which is returned to application layer 10. Future requests for the data can be made by using this file handle.
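The lookup-then-handle pattern described above can be sketched as follows. This is a toy in-memory model, not the NFS wire protocol: the class `ToyFileServer`, its method names, and the use of small integers as handles are assumptions made for illustration (real NFS file handles are opaque to the client).

```python
import itertools


class ToyFileServer:
    """Hypothetical stand-in for the server side of a name-to-handle lookup."""

    def __init__(self):
        self._next_handle = itertools.count(1)
        self._handle_by_name = {}
        self._data_by_handle = {}

    def add_file(self, name, data):
        # Register a file and assign it a handle.
        handle = next(self._next_handle)
        self._handle_by_name[name] = handle
        self._data_by_handle[handle] = data
        return handle

    def lookup(self, name):
        # First request: the client supplies a name and receives a handle.
        return self._handle_by_name[name]

    def read(self, handle, offset, count):
        # Later requests: the client supplies the handle, not the name.
        return self._data_by_handle[handle][offset:offset + count]
```

A client in this sketch would call `lookup("a.txt")` once and then pass the returned handle to each subsequent `read`.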
File system 14 can access requested data through small-computer system interface (SCSI) controller 24 or another kind of controller to access disk arrays 20. RAID controllers 18 may be used for redundant data storage, or may be omitted.
NAS appliance 21 is easy to install and maintain, since it is connected to network bus 32 much like any other networked PC. Network bus 32, which can be the existing network to which an organization's client PCs are already attached, carries NFS messages encapsulated in standard TCP/IP packets. File names and file handles, rather than block addresses, are transferred over network bus 32 in the NFS messages.
One problem with NAS appliance 21 is upgrading to larger storage or faster processing capacities. As its workload increases, the processor on NAS appliance 21 may no longer be able to handle all requests. Additional disks may initially be added to NAS appliance 21, but once all disk bays are populated, no more disks can be added. Instead, a second NAS appliance 21′ may need to be installed on network bus 32. New data could be stored on NAS appliance 21′. However, it is difficult to move data between NAS appliance 21 and NAS appliance 21′. This is because clients have to mount NAS appliance 21′ as well as NAS appliance 21. Additional mount requests to NAS appliance 21′ have to be added to startup scripts for all clients. Data moved to NAS appliance 21′ is found on a different mount point, so clients have to use the new location to find the data. NAS appliance 21 and NAS appliance 21′ appear as two different file systems, with different mount names. Each NAS appliance 21, 21′ has its own name space.
It is very undesirable to have to change client software to reflect the new path to NAS appliance 21′ when a file is moved from NAS appliance 21 to NAS appliance 21′. Users may have to know that the file is now accessible under a different mount-point. Applications, scripts, and shortcuts must be edited to refer to the new mount-point. It can be an administrative nightmare to notify users and change existing software and scripts when a file is moved from one name space to another. Thus moving the file is not transparent to the client users.
This is known as the NAS creep problem. It occurs when the NAS appliance 21 fills up and files have to be moved to a different NAS appliance 21′, or when an additional server is installed for performance or other reasons.
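The effect of separate name spaces on client paths can be made concrete with a small sketch. The mount names `/mnt/nas21` and `/mnt/nas21b` and the file `projects/plan.txt` are hypothetical, and the dictionary merely stands in for the two appliances.

```python
# Illustrative sketch only: two appliances, each with its own name space.
# Mount names and the file path are hypothetical.
appliances = {
    "/mnt/nas21": {"projects/plan.txt"},  # stands in for NAS appliance 21
    "/mnt/nas21b": set(),                 # stands in for NAS appliance 21'
}


def client_path(mount, rel):
    """Path a client would hard-code in scripts and shortcuts."""
    return f"{mount}/{rel}"


def move_file(rel, src, dst):
    """Move a file from one appliance's name space to the other's."""
    appliances[src].remove(rel)
    appliances[dst].add(rel)


def exists(path):
    """Check whether a client path still resolves to a stored file."""
    for mount, files in appliances.items():
        prefix = mount + "/"
        if path.startswith(prefix):
            return path[len(prefix):] in files
    return False
```

After `move_file("projects/plan.txt", "/mnt/nas21", "/mnt/nas21b")`, the old client path no longer resolves, so every script or shortcut that embeds it has to be edited.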
FIG. 3 shows SAN combined with NAS. Client application layer 10 still sends NFS messages over network bus 32 to the NAS server, which has NFS 26 decode the messages and use file system 14 to look up file handles. Data requests for file data are converted to block addresses and sent over SAN bus 22 by FibreChannel controller 16. The data is accessed by disk arrays 20, 20′, 20″ and returned as block data that must be re-arranged to generate the data file.
While using a SAN with a NAS could allow for expandability, SAN bus 22 is often a FibreChannel or other very expensive bus, negating the cost advantage of NAS. Protocols such as FibreChannel or iSCSI may be used. FibreChannel hardware tends to be expensive, while iSCSI tends to be expensive to maintain because of the drivers and other software it requires. A SAN virtualization switch may also be needed.
Some client file systems may allow directories to reside on different servers in one virtual space. The client file system transparently redirects client accesses to a different physical server when the client moves to a directory on that server. However, the granularity of redirection is the directory rather than the file. File granularity rather than directory granularity is preferable for most file systems.
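The difference between the two granularities can be sketched with two lookup tables. The server names, paths, and function names below are hypothetical, and the sketch is not a description of any particular file system's redirection mechanism.

```python
# Illustrative sketch only: directory-level versus file-level redirection.
# All server names and paths are hypothetical.
dir_table = {"/home": "server_a", "/projects": "server_b"}


def resolve_by_directory(path):
    """Pick a server from the longest matching directory prefix."""
    best = None
    for directory in dir_table:
        if path == directory or path.startswith(directory + "/"):
            if best is None or len(directory) > len(best):
                best = directory
    if best is None:
        raise KeyError(path)
    return dir_table[best]


# With one entry per file, two files in the same directory may map to
# different servers.
file_table = {
    "/projects/a.dat": "server_b",
    "/projects/b.dat": "server_c",
}


def resolve_by_file(path):
    """Pick a server from a per-file entry."""
    return file_table[path]
```

Under the directory table every file beneath `/projects` resolves to the same server, whereas the per-file table lets `/projects/b.dat` live elsewhere without changing its path.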
What is desired is a Network Attached Storage (NAS) system that is expandable and upgradeable and does not use a SAN bus. File-level granularity and virtualization are desirable without altering file systems on existing NAS servers. A virtual NAS is desired that has a compact table storing meta-data on a per-file basis.