A storage system typically comprises one or more storage devices into which information may be entered, and from which information may be obtained, as desired. The storage system includes a storage operating system that functionally organizes the system by, inter alia, invoking storage operations in support of a storage service implemented by the system. The storage system may be implemented in accordance with a variety of storage architectures including, but not limited to, a network-attached storage (NAS) environment, a storage area network (SAN) and a disk assembly directly attached to a client or host computer. The storage devices are typically disk drives organized as a disk array, wherein the term “disk” commonly describes a self-contained rotating magnetic media storage device. The term disk in this context is synonymous with hard disk drive (HDD) or direct access storage device (DASD). It should be noted that in alternative embodiments, the storage devices may comprise solid state devices, e.g., flash memory, battery backed up non-volatile random access memory, etc. As such, while this description is written in terms of disks, those embodiments should be viewed as exemplary only.
The storage operating system of the storage system may implement a high-level module, such as a file system, to logically organize the information stored on volumes as a hierarchical structure of data containers, such as files and logical units. For example, each “on-disk” file may be implemented as a set of data structures, i.e., disk blocks, configured to store information, such as the actual data for the file. These data blocks are organized within a volume block number (vbn) space that is maintained by the file system. The file system may also assign each data block in the file a corresponding “file offset” or file block number (fbn). The file system typically assigns sequences of fbns on a per-file basis, whereas vbns are assigned over a larger volume address space. The file system organizes the data blocks within the vbn space as a “logical volume”; each logical volume may be, although is not necessarily, associated with its own file system.
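By way of illustration only, the relationship between the per-file fbn space and the volume-wide vbn space may be sketched as follows. The class names `Volume` and `File` and the sequential allocator are hypothetical and are assumed solely for this example; they do not describe the on-disk format of any particular file system.

```python
class Volume:
    """Toy volume: hands out vbns from one volume-wide address space."""

    def __init__(self):
        self.blocks = {}    # vbn -> block contents
        self.next_vbn = 0   # assumption: vbns are allocated sequentially

    def allocate(self, data):
        vbn = self.next_vbn
        self.next_vbn += 1
        self.blocks[vbn] = data
        return vbn


class File:
    """Toy file: fbns are assigned per file, starting at 0."""

    def __init__(self, volume):
        self.volume = volume
        self.fbn_to_vbn = []  # list index is the fbn, value is the vbn

    def append(self, data):
        vbn = self.volume.allocate(data)
        self.fbn_to_vbn.append(vbn)
        return len(self.fbn_to_vbn) - 1  # the fbn just assigned

    def read(self, fbn):
        return self.volume.blocks[self.fbn_to_vbn[fbn]]


vol = Volume()
a, b = File(vol), File(vol)
a.append(b"a0")
b.append(b"b0")
a.append(b"a1")
print(a.fbn_to_vbn)  # [0, 2]
print(b.fbn_to_vbn)  # [1]
```

In this sketch both files begin at fbn 0, yet their blocks occupy distinct, interleaved vbns within the shared volume address space.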
A known type of file system is a write-anywhere file system that does not overwrite data on disks. If a data block is retrieved (read) from disk into a memory of the storage system and “dirtied” (i.e., updated or modified) with new data, the data block is thereafter stored (written) to a new location on disk to optimize write performance. A write-anywhere file system may initially assume an optimal layout such that the data is substantially contiguously arranged on disks. The optimal disk layout results in efficient access operations, particularly for sequential read operations, directed to the disks. An example of a write-anywhere file system that is configured to operate on a storage system is the Write Anywhere File Layout (WAFL®) file system available from NetApp, Inc., Sunnyvale, Calif.
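The write-anywhere behavior described above may be illustrated, under simplifying assumptions, by the following sketch. The class `WriteAnywhereStore` and its next-free-block allocator are hypothetical constructs assumed for this example only; they are not the WAFL implementation, which involves considerably more machinery.

```python
class WriteAnywhereStore:
    """Toy write-anywhere store: a modified block is never overwritten in place."""

    def __init__(self):
        self.disk = {}       # vbn -> block contents
        self.block_map = {}  # fbn -> vbn holding the current version
        self.next_free = 0   # assumption: free blocks are handed out in order

    def write(self, fbn, data):
        # Always store the data at a new location, then repoint the block map.
        new_vbn = self.next_free
        self.next_free += 1
        self.disk[new_vbn] = data
        old_vbn = self.block_map.get(fbn)
        self.block_map[fbn] = new_vbn
        return old_vbn, new_vbn

    def read(self, fbn):
        return self.disk[self.block_map[fbn]]


store = WriteAnywhereStore()
first = store.write(0, b"v1")
second = store.write(0, b"v2")   # the "dirtied" block goes to a new vbn
print(first, second)             # (None, 0) (0, 1)
print(store.read(0))             # b'v2'
print(store.disk[0])             # b'v1' (the old block was not overwritten)
```

Because successive writes land in consecutive free locations in this sketch, newly written data remains contiguous even though the previous version of the block is left untouched.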
The storage system may be further configured to operate according to a client/server model of information delivery to thereby allow many clients to access data containers stored on the system. In this model, the client may comprise an application, such as a database application, executing on a computer that “connects” to the storage system over a computer network, such as a point-to-point link, shared local area network (LAN), wide area network (WAN), or virtual private network (VPN) implemented over a public network such as the Internet. Each client may request the services of the storage system by issuing file-based and block-based protocol messages (in the form of packets) to the system over the network.
A plurality of storage systems may be interconnected to provide a storage system environment configured to service many clients. Each storage system may be configured to service one or more volumes, wherein each volume stores one or more data containers. Often, however, a large number of the data access requests issued by the clients are directed to a small number of data containers serviced by a particular storage system of the environment. A solution to such a problem is to distribute the volumes serviced by the particular storage system among all of the storage systems of the environment. This, in turn, distributes the data access requests, along with the processing resources needed to service such requests, among all of the storage systems, thereby reducing the individual processing load on each storage system. However, a noted disadvantage arises when only a single data container, such as a file, is heavily accessed by clients of the storage system environment. In that case, the storage system attempting to service the requests directed to that data container may exceed its processing resources and become overburdened, with a concomitant degradation of speed and performance.
One technique for overcoming the disadvantages of having a single data container that is heavily utilized is to stripe the data container across a plurality of volumes configured as a striped volume set, where each volume is serviced by a different storage system, thereby distributing the load for the single data container among a plurality of storage systems. One such technique for data container striping is described in the above-incorporated U.S. Publication No. US2005-0192932, entitled STORAGE SYSTEM ARCHITECTURE FOR STRIPING DATA CONTAINER CONTENT ACROSS VOLUMES OF A CLUSTER. Typically, when the striped volume set is first generated, each of the constituent nodes servicing the constituent volumes of the striped volume set utilizes the same or similar generation technology. That is, each node typically comprises the same or substantially the same hardware and/or software configurations. Thus, the nodes may be viewed as homogeneous, as each is substantially identical to the others. A noted disadvantage of such systems arises when a striped volume set is expanded at a later point in time and a customer uses later (i.e., newer) generation hardware and/or software for the newly added nodes. As the newly added nodes utilize the most up-to-date hardware and/or software, they typically have additional computational power as compared to the original nodes of a cluster. More generally, this problem may be noted when any heterogeneous cluster is formed, that is, when the nodes of a cluster utilize systems having substantially different functionality and/or processor capabilities. In such heterogeneous systems, each node is typically utilized equally for striping operations. The noted disadvantage arises because later generation nodes may have additional processor capabilities that remain underutilized or entirely unutilized.
The advantage of using new and/or faster nodes is thus wasted, as all nodes are utilized, in effect, as if they were equivalent to the least powerful node of the cluster, i.e., the original nodes. To avoid such a waste of processing power, a user must ensure that all nodes are of a common homogeneous type. This may be accomplished by, for example, purchasing older nodes or by replacing the older nodes with newer models. Neither of these solutions is optimal, and both raise the total cost of ownership of a clustered storage system.
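The underutilization described above may be illustrated numerically by the following sketch. It assumes a simple round-robin placement of fixed-width stripes, which is an assumption made for this example and is not necessarily the algorithm of the above-incorporated publication. The function `stripe_owner`, the stripe width and the per-node capacity figures are likewise hypothetical.

```python
from collections import Counter


def stripe_owner(offset, stripe_width, num_volumes):
    """Round-robin placement: each stripe goes to the next volume in turn."""
    return (offset // stripe_width) % num_volumes


STRIPE_WIDTH = 4096            # hypothetical stripe width in bytes
capacity = [100, 100, 200]     # hypothetical ops/sec: two original nodes, one newer node

# One request per stripe over the first 300 stripes of a data container.
offsets = range(0, 300 * STRIPE_WIDTH, STRIPE_WIDTH)
requests = Counter(stripe_owner(off, STRIPE_WIDTH, len(capacity)) for off in offsets)

# Fraction of each node's capacity consumed by its equal share of the load.
utilization = [requests[node] / capacity[node] for node in range(len(capacity))]
print(dict(requests))   # {0: 100, 1: 100, 2: 100}
print(utilization)      # [1.0, 1.0, 0.5]
```

Under these assumed figures, each node receives the same number of requests, so the two original nodes are saturated while half of the newer node's capacity goes unused.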