A storage server is a computer system that is used to store and retrieve data on behalf of one or more clients on a network. A storage server typically stores and manages data in a set of mass storage devices, such as magnetic or optical storage-based disks or tapes. In conventional network storage systems, the mass storage devices may be organized into one or more groups of drives (e.g., a redundant array of inexpensive disks (RAID)).
A storage server may be configured to service file-level requests from clients, as in the case of file servers used in a Network Attached Storage (NAS) environment. Alternatively, a storage server may be configured to service block-level requests from clients, as done by storage servers used in a Storage Area Network (SAN) environment. Further, some storage servers are capable of servicing both file-level and block-level requests, as done by certain storage servers made by NetApp®, Inc. of Sunnyvale, Calif.
Conventionally, a storage server divides its available system resources (e.g., CPU and memory) equally among processes serving inbound and outbound data requests. For example, assuming a storage server has 1000 kbps (kilobits per second) of throughput available for network communications, if there are 10 simultaneous processes transferring data in or out of the storage server via the network, the storage server may allocate 100 kbps of network bandwidth to each process. When there are 100 simultaneous data transfer processes, each process can be allocated only 10 kbps of the network bandwidth, which would likely be insufficient for network communications. Further, the more simultaneous processes there are, the more task switching the storage server has to perform, which further lowers the performance of the storage server. Thus, when there is no limit on the number of simultaneous processes that are allowed to run on a storage server, resource bottlenecks tend to occur.
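The equal-division arithmetic in the example above can be illustrated with a minimal sketch. The function name `equal_share_kbps` and the constant `TOTAL_KBPS` are hypothetical and chosen only for illustration; the sketch models nothing more than the even split described above and is not the allocation logic of any particular storage server.

```python
def equal_share_kbps(total_kbps: float, process_count: int) -> float:
    """Return the bandwidth each process receives under an even split.

    Hypothetical helper: it only reproduces the arithmetic of the
    example in the text (total throughput divided by process count).
    """
    if process_count <= 0:
        raise ValueError("process_count must be positive")
    return total_kbps / process_count


# Assumed total network throughput, taken from the example in the text.
TOTAL_KBPS = 1000

for count in (10, 100):
    share = equal_share_kbps(TOTAL_KBPS, count)
    print(f"{count} processes -> {share:g} kbps each")
# prints:
# 10 processes -> 100 kbps each
# 100 processes -> 10 kbps each
```

Because the per-process share falls in inverse proportion to the process count, every additional simultaneous process reduces the bandwidth available to all of the others, which is why an unbounded process count tends toward a bottleneck.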
For data protection purposes, data stored in a primary storage server is often backed up or mirrored to one or more remote secondary storage servers. The replicated data in the secondary storage servers can later be used to restore data on the primary storage server, if needed. Since these backup and mirroring processes transfer a large amount of data across a network, they typically require a large amount of system resources and run for a long period of time. However, in a typical system resource arrangement, these data protection processes are not treated any differently from other processes. Consequently, a small number of simultaneously running data protection processes can occupy most, if not all, of the system resources, leaving little for the storage server to serve other data requests. Thus, unless the allocation of system resources to these resource-intensive processes is limited, resource bottlenecks can become major performance concerns.
Once a system resource becomes a bottleneck, each of the simultaneously running processes that require the resource, especially the resource-intensive and long-running ones, takes longer to finish using it. To further complicate matters, if a backup or mirroring process fails prematurely because it cannot finish quickly enough, any data that has already been transferred to the secondary storage servers may no longer be trustworthy and may be unusable for recovery purposes. Thus, when there is no restriction on the number of simultaneous processes or on the amount of system resources that can be allocated to each process, a disaster could have a bigger impact on the data reliability provided by the storage servers than it otherwise would. If the replicated data is corrupted due to a lack of resources, the system resources previously allocated for transferring the data are also wasted.