Appliances, such as storage servers, often handle heavy load and overload in a “non-linear” and uncontrolled fashion. When a storage server is operated under a heavy load, performance can rapidly deteriorate, with unexpected interactions between different storage system components.
A traditional concept associated with performance control is Quality of Service (QoS). QoS typically indicates the ability to ensure a certain performance level under certain conditions. Providing this type of service-level guarantee may require upgrading a storage system with additional hardware, which is sometimes not an acceptable approach.
Some existing systems attempt to optimize the performance of a storage server automatically to achieve maximum throughput. However, in some complex multi-function environments reduced latency may be valued over throughput, while in other environments the opposite may be true. Where multiple workloads are competing for resources such as CPU utilization or disk access, some workloads (e.g., requests associated with a purchase orders database) may be considered mission critical by a user, while other workloads (e.g., requests associated with employees' home directories) may be considered capable of tolerating higher latency. Existing systems do not allow prioritization of a workload with respect to storage system resources based on the class of the workload.
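The class-based prioritization described above might be sketched, purely as an illustration, as a priority queue keyed on a user-assigned workload class. The class names, numeric priorities, and queue structure below are assumptions for the sketch, not details from this disclosure.

```python
import heapq

# Illustrative class-to-priority mapping (lower value = serviced first).
# The names and values are assumptions, not from the disclosure.
CLASS_PRIORITY = {
    "mission_critical": 0,  # e.g., purchase-orders database requests
    "best_effort": 1,       # e.g., employees' home-directory requests
}

class WorkloadQueue:
    """Dispatches pending requests by workload class, FIFO within a class."""

    def __init__(self):
        self._heap = []
        self._seq = 0  # monotonically increasing tiebreaker

    def enqueue(self, workload_class, request):
        priority = CLASS_PRIORITY[workload_class]
        heapq.heappush(self._heap, (priority, self._seq, request))
        self._seq += 1

    def dispatch(self):
        """Return the highest-priority pending request."""
        _, _, request = heapq.heappop(self._heap)
        return request
```

Under this sketch, a mission-critical request enqueued after a best-effort request is still dispatched first, which is the behavior the background paragraph says existing systems lack.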
In one embodiment, workload classification may include client work (where a request is initiated by a user) and application work (where a workload is initiated by a uniquely addressable requesting entity). Storage system users may encounter a situation where user-initiated requests (e.g., user operations over protocols such as NFS, CIFS, FCP, or iSCSI on a single storage entity) compete with system-initiated requests (application work) for resources. System-initiated requests include operations performed while generating a snapshot (i.e., an image of the active file system at a point in time, a consistency point (CP)), or while performing mirror operations, where data are periodically mirrored to other systems (e.g., while performing snapmirror operations). It should be noted that “snapshot” is a trademark of Network Appliance, Inc. and is used for purposes of this patent to designate a persistent consistency point (CP) image. “Snapmirror” is a trademark of Network Appliance, Inc. and is used for purposes of this patent to designate operations where data are periodically mirrored to other systems. On a busy system, a snapmirror or similar system operation may cause an undesirable impact on user operations, either increasing latency or decreasing I/Os per second (IOPS) beyond levels acceptable to the administrator. In some environments, it may be desirable that user-generated requests are disrupted as little as possible during CP or snapmirror operations, while in other environments, users may desire that system-initiated requests are unhindered by user-generated requests. Existing systems do not allow prioritization of user-initiated requests with respect to system-generated requests.
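One way to picture prioritizing user-initiated over system-initiated requests is a weighted round-robin between two queues. The 2:1 ratio, the queue names, and the scheduler structure below are illustrative assumptions for the sketch; the disclosure does not specify this mechanism.

```python
from collections import deque

class WeightedScheduler:
    """Weighted round-robin between user-initiated and system-initiated work.

    Services up to `user_weight` user requests for every `system_weight`
    system requests (e.g., snapshot/snapmirror work) when both queues have
    pending work. The weights are hypothetical administrator settings.
    """

    def __init__(self, user_weight=2, system_weight=1):
        self.user_q = deque()
        self.system_q = deque()
        self._weights = (user_weight, system_weight)
        self._cycle = []  # dispatch pattern for one scheduling round

    def _refill(self):
        uw, sw = self._weights
        self._cycle = ["user"] * uw + ["system"] * sw

    def next_request(self):
        """Return the next request to service under the weighted policy."""
        if not self._cycle:
            self._refill()
        while self._cycle:
            which = self._cycle.pop(0)
            q = self.user_q if which == "user" else self.system_q
            if q:
                return q.popleft()
        # One queue class had no work this round; serve whichever has work.
        return (self.user_q or self.system_q).popleft()
```

With a 2:1 weighting, two user requests are dispatched for every system request while both queues are busy, so snapmirror-style work makes progress without starving user operations; swapping the weights would instead favor system-initiated work, matching the two environments contrasted above.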