Data is a critical asset for companies. Without access to data, companies may not be able to provide their customers with the desired level of service. When it comes to data storage, businesses require that their storage solution be cost effective, easy to operate, and capable of growing alongside their storage needs. Network Attached Storage (NAS) has rapidly become popular with enterprises and small businesses in many industries as an effective, scalable, low cost storage solution, and as a convenient method of sharing files among multiple computers.
The Network Attached Storage device is a storage server attached to a computer network that allows storage and retrieval of data from a centralized location (archives, databases, etc.) for authorized network users and heterogeneous clients.
Network Attached Storage removes the responsibility of file serving from other servers on the network. NAS devices typically provide access to files using a number of network file sharing protocols.
Although the internet protocol (IP) is a common data transport protocol for NAS architectures, some mid-market NAS products may support the network file system (NFS), internetwork packet exchange (IPX), NetBIOS extended user interface (NetBEUI), or common internet file system (CIFS) protocols. High-end NAS products may support gigabit Ethernet (GigE) for even faster data transfer across the network.
In the NAS architecture, corporate information resides in a storage system that is attached to a dedicated server, which, in turn, is directly connected to a network, and uses a common communication protocol, such as TCP/IP. In a corporate team structure, the NAS operates as a server in a typical client-server environment. The NAS may be connected to a network by standard connectivity options such as Ethernet, FDDI, and ATM. In some cases, a single specialized NAS server can have up to 30 Ethernet connections.
Clustered NAS, i.e., a NAS that uses a distributed file system running simultaneously on multiple servers, is gaining popularity. The clustered NAS, similar to a traditional NAS, provides unified access to the files from any of the cluster nodes, regardless of the actual location of the data.
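The location-independent access described for the clustered NAS can be illustrated with a minimal sketch. The class name ToyCluster, the node names, and the hash-based placement rule are all hypothetical and chosen only for illustration; an actual distributed file system may place and locate data quite differently.

```python
import hashlib


class ToyCluster:
    """Hypothetical model of a clustered NAS: several nodes, one namespace."""

    def __init__(self, node_names):
        self.node_names = list(node_names)
        # Per-node local storage: node name -> {path: file contents}
        self.local = {name: {} for name in self.node_names}

    def _owner(self, path):
        # Assumed placement rule: hash the path onto exactly one node.
        digest = hashlib.sha256(path.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.node_names)
        return self.node_names[index]

    def _check(self, via_node):
        if via_node not in self.local:
            raise KeyError(f"unknown cluster node: {via_node}")

    def write(self, via_node, path, data):
        # The client may contact any node; the data lands on the owner node.
        self._check(via_node)
        self.local[self._owner(path)][path] = data

    def read(self, via_node, path):
        # Any node can serve the request, wherever the data actually lives.
        self._check(via_node)
        return self.local[self._owner(path)][path]


cluster = ToyCluster(["node-a", "node-b", "node-c"])
cluster.write("node-a", "/projects/report.txt", b"quarterly figures")

# Reading through every node returns the same file contents.
results = [cluster.read(n, "/projects/report.txt") for n in cluster.node_names]
```

In this sketch the file is stored once, yet a read issued through any of the three nodes returns the same bytes, which is the unified-access property attributed to the clustered NAS.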
NAS devices, which typically do not have a keyboard or display, are configured and managed with a browser-based utility program. Each NAS resides on the computer network (for example, LAN) as an independent network node and has its own IP address.
An important benefit of NAS is its ability to provide multiple clients on the network with access to the same files. When more storage capacity is required, the NAS appliance can simply be outfitted with larger disks or clustered together to provide both vertical scalability and horizontal scalability. Many NAS vendors partner with cloud storage providers to provide customers with an extra layer of redundancy for backing up files.
Some higher-end NAS products can hold enough disks to support RAID, which is a storage technology that combines multiple hard disks into one logical unit in order to provide better performance, high availability, and redundancy.
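As one illustration of how RAID redundancy can work, the sketch below shows XOR parity, the principle used by parity-based RAID levels such as RAID 5. It is a simplified example that assumes three equal-length data blocks and a single parity block; the helper name xor_blocks and the sample data are hypothetical, and real arrays stripe and rotate parity across disks.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Three data blocks, each imagined to live on a separate disk.
data_blocks = [b"AAAA", b"BBBB", b"CCCC"]

# The parity block is stored on a fourth disk.
parity = xor_blocks(data_blocks)

# Simulate the loss of the second disk: XOR the survivors with the parity.
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)
```

Because XOR is its own inverse, combining the surviving blocks with the parity block reproduces the missing block, so the logical unit remains readable after a single disk failure.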
Recently, the baseline functionality of NAS devices has broadened to support virtualization. High-end NAS products may also support data deduplication, flash storage, multi-protocol access, and replication.
FIG. 1 is representative of a Network-Attached Storage topology where clients, servers, and the Network-Attached Storage are interconnected into a local area network (LAN). The devices (clients, NAS, servers) are considered nodes in the LAN which may be interrelated through the Ethernet switch.
NAS is not always suitable for applications such as data warehouses and on-line transaction processing, since these applications need to sustain high I/O data rates with little (or zero) degradation in response times to the clients.
Another disadvantage of existing NAS data migration strategies is that, during data movement associated with migrating data to specific node-local storage devices, all the data (bulk data) must be migrated, which is a significant source of application jitter and requires a high volume of network resources.
Specifically, with a pool of network attached storage, a conventional protocol of data caching assumes moving all data into a single storage pool that is globally accessed by any client. This approach typically provides no means for compartmentalizing (aka isolating) the data from other users or for resource reservations (such as storage capacity, latency/bandwidth) without additional external tools (such as quota managers, as well as quality of service tools built into the NAS).
With the physical node-local storage, the conventional protocol makes a best effort in data placement when moving data to the client hosts. Because it is difficult to accurately predict the specific client on which the application will execute, the data may be placed on the wrong client host. In this case, the protocol is burdened with the necessity to move data from the incorrect host to the correct host, thus creating unnecessary extra data migration load which is detrimental to the NAS system performance.
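The extra load caused by a wrong placement can be made concrete with a simple accounting sketch. The function name bytes_moved, the host names, and the dataset size are hypothetical, and the model assumes that a corrective move transfers the whole dataset a second time.

```python
def bytes_moved(dataset_bytes, predicted_host, actual_host):
    """Total bytes transferred for one dataset under best-effort placement."""
    # Initial placement: the dataset is moved to the predicted host.
    moved = dataset_bytes
    if predicted_host != actual_host:
        # Corrective move: the same data is moved again to the right host.
        moved += dataset_bytes
    return moved


dataset = 500 * 10**9  # an assumed 500 GB dataset

correct_guess = bytes_moved(dataset, "host-1", "host-1")
wrong_guess = bytes_moved(dataset, "host-1", "host-2")
```

Under these assumptions, a wrong prediction doubles the migration traffic for the affected dataset, which is the unnecessary load described above.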
Removing unnecessary bulk data movement across the storage tiers in NAS devices, and decoupling the data placement tasks provided by the caching layer (NAS servers) from the data access pattern invoked by the storage system clients and applications, would be highly beneficial for the otherwise tremendously useful NAS technology.