Various forms of network storage systems exist today, including network attached storage (NAS), storage area networks (SANs), and others. Network storage systems are commonly used for a variety of purposes, such as backing up critical data, mirroring data, and providing multiple users with access to shared data.
A network storage system typically includes at least one storage server, which is a processing system configured to store and retrieve data on behalf of one or more client processing systems (“clients”) that are used by users of the network storage system. In the context of NAS, a storage server is commonly a file server, which is sometimes called a “filer”. A filer operates on behalf of one or more clients to store and manage shared files. The files are stored in a non-volatile mass storage subsystem, which is typically (but not necessarily) external to the storage server and which may include one or more arrays of non-volatile mass storage devices, such as magnetic or optical disks or tapes. The storage devices are commonly managed using RAID (Redundant Array of Inexpensive Disks); hence, the mass storage devices in each array may be organized into one or more separate RAID groups.
In a SAN context, a storage server provides clients with access to stored data at a sub-file level of granularity, such as block-level access, rather than file-level access. Some storage servers are capable of providing clients with both file-level access and block-level access, such as certain Filers made by Network Appliance, Inc. (NetApp®) of Sunnyvale, Calif.
One area which is gaining significant attention in relation to network storage technology is information lifecycle management (ILM). ILM can be defined as the practice of applying certain policies to the effective management of information throughout its useful life. ILM includes every phase of a “record” from its beginning to its end. A “record” in this context can be any kind of data or metadata stored in non-tangible form (e.g., electronically or optically stored data). ILM is based on the premise that the uses and usefulness of any given item of information are likely to change over time. ILM therefore involves dividing the lifecycle of information into various stages and creating an appropriate policy or policies to handle the information in each stage. ILM further involves providing the technology infrastructure to implement those policies, such as data storage technology. Operational aspects of ILM include data backup and protection; disaster recovery, restore, and restart; archiving and long-term retention; data replication; and day-to-day processes and procedures necessary to manage a storage architecture.
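The idea of dividing a record's lifecycle into stages and attaching a policy to each stage can be illustrated with a minimal sketch. The stage names, age thresholds, storage tiers, and the `policy_for` helper below are all hypothetical and chosen only for illustration; they are not drawn from any particular ILM product or regulation.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class StagePolicy:
    stage: str          # hypothetical lifecycle stage name
    min_age_days: int   # record enters this stage at this age
    storage_tier: str   # hypothetical storage tier for the stage
    read_only: bool     # whether the record may still be modified


# Hypothetical policies, ordered from the oldest stage to the newest
# so that the first matching threshold wins.
POLICIES = [
    StagePolicy("archive", 365, "tape", True),
    StagePolicy("retain", 90, "nearline-disk", True),
    StagePolicy("active", 0, "primary-disk", False),
]


def policy_for(created: date, today: date) -> StagePolicy:
    """Return the policy for a record, based only on its age in days."""
    age_days = (today - created).days
    for policy in POLICIES:
        if age_days >= policy.min_age_days:
            return policy
    # Reached only when age_days is negative.
    raise ValueError("creation date is in the future")


if __name__ == "__main__":
    p = policy_for(date(2024, 1, 1), date(2024, 6, 1))
    print(p.stage, p.storage_tier, p.read_only)
```

A real ILM system would likely key its policies on more than age (for example, record type or applicable regulation); age alone is used here to keep the sketch short.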
Three issues that ILM needs to address for the modern business enterprise are legal compliance, data security (privacy) and economics (cost). U.S. federal regulations and other forms of law mandate that certain types of information generated in the course of operating a business be retained, unmodified, for certain periods of time and be discoverable and available. Such records-retention regulations include, for example, Securities and Exchange Commission (SEC) Rule 17a-4 (17 C.F.R. §240.17a-4(f)), which regulates broker-dealers; the Health Insurance Portability and Accountability Act (HIPAA), which regulates companies in the healthcare industry; Sarbanes-Oxley (SOX), which regulates publicly traded companies; 21 C.F.R. Part 11, which regulates certain companies in the life sciences industry; and DOD 5015.2-STD, which regulates certain government organizations. Affected businesses therefore must adopt ILM policies, practices and technology infrastructure to comply with these laws.
As to data privacy, the prevalence of identity theft and electronic fraud in recent years, as well as corporate piracy and trade secret theft, makes it critical for businesses to protect their customers' confidential data (e.g., social security numbers, birthdates, bank account numbers) and their own confidential data (e.g., intellectual property and other sensitive information).
These issues must be addressed in a cost-effective manner, with respect to both the cost of the technology infrastructure used to implement ILM and the need for information technology (IT) support staff to maintain that infrastructure and to train and assist employees in using it. Various technologies exist today to implement different aspects of ILM. For example, hierarchical storage management (HSM) has been used to provide multi-tiered storage architectures, which allow older data and infrequently-used data to be offloaded to relatively inexpensive (but slower) storage. Businesses often use some form of write-once/read-many (WORM) data storage facility to store data for legal compliance. Further, businesses sometimes encrypt sensitive data to address privacy concerns. However, no technology is known to date that provides a complete, beginning-to-end, cost-effective ILM solution for network data storage.