Technical Field
This application relates to managing data statistics collection for data migration in data storage systems.
Description of Related Art
A traditional storage array (herein also referred to as a “disk storage array”, “disk array”, or simply “array”) is a collection of hard disk drives operating together logically as a unified storage device. Storage arrays are designed to store large quantities of data. Storage arrays typically include one or more storage array processors (SPs), for handling both requests for allocation and input/output (I/O) requests. An SP is the controller for and primary interface to the storage array.
Storage arrays are typically used to provide storage space for one or more computer file systems, databases, applications, and the like. For this and other reasons, it is common for storage arrays to logically partition a set of disk drives into chunks of storage space, called logical units, or LUs. This enables a unified storage array to provide the storage space as a collection of separate file systems, network drives, and/or logical units.
Performance of a storage array may be characterized by the array's total capacity, response time, and throughput. The capacity of a storage array is the maximum total amount of data that can be stored on the array. The response time of an array is the amount of time that it takes to read data from or write data to the array. The throughput of an array is a measure of the amount of data that can be transferred into or out of (i.e., written to or read from) the array over a given period of time.
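By way of illustration only, the response time and throughput measures described above may be computed from a set of recorded I/O requests as in the following minimal sketch. The names IoSample, mean_response_time, and throughput are hypothetical and are introduced solely for this example; the sketch assumes each request records the number of bytes transferred and the time taken to complete.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass
class IoSample:
    """One completed I/O request (hypothetical record for illustration)."""
    bytes_transferred: int  # bytes read from or written to the array
    latency_s: float        # time taken to complete the request, in seconds


def mean_response_time(samples: Sequence[IoSample]) -> float:
    """Average time, in seconds, taken to complete a request."""
    if not samples:
        raise ValueError("at least one sample is required")
    return sum(s.latency_s for s in samples) / len(samples)


def throughput(samples: Sequence[IoSample], window_s: float) -> float:
    """Bytes transferred per second over an observation window."""
    if window_s <= 0:
        raise ValueError("window_s must be positive")
    return sum(s.bytes_transferred for s in samples) / window_s
```

In this sketch, throughput depends on the length of the observation window supplied by the caller, whereas response time is averaged per request, so the two measures may vary independently of each other.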
The administrator of a storage array may desire to operate the array in a manner that maximizes throughput and minimizes response time. In general, performance of a storage array may be constrained by both physical and temporal constraints. Examples of physical constraints include bus occupancy and availability, excessive disk arm movement, and uneven distribution of load across disks. Examples of temporal constraints include bus bandwidth, bus speed, spindle rotational speed, serial versus parallel access to multiple read/write heads, and the size of data transfer buffers.
One factor that may limit the performance of a storage array is the performance of each individual storage component. A storage system may include a variety of storage devices that balance performance and cost objectives. Different types of disks may be arranged whereby the like kinds of disks are grouped into tiers based on the performance characteristics of the disks.
For example, a fast tier (also referred to as “higher tier” or “high tier”) may include a group of very fast solid state drives (SSDs) used to store a relatively small amount of data that is frequently accessed. A medium tier (also referred to as “mid tier” or “middle tier”) may include a group of fast hard disk drives (HDDs) used to store a larger amount of less frequently accessed data but at a lower performance level than SSDs. A slow tier (also referred to as “lower tier” or “low tier”) may include a group of slower HDDs used to store very large amounts of data with a still lower level of performance as compared to SSDs and fast HDDs. It may be possible to have different tiers with different properties or constructed from a mix of different types of physical disks to achieve a performance or price goal. Storing more frequently accessed or “hot” data on a fast tier and less frequently accessed or “cold” data on a slow tier may create a more favorable customer cost/performance profile than storing all data on a single kind of disk. To provide data protection, tiers may be arranged in a variety of RAID (Redundant Array of Independent or Inexpensive Disks) configurations known in the art.
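As one possible illustration of placing hot data on a fast tier and cold data on a slow tier, the following minimal sketch ranks units of data by access count and assigns the most frequently accessed units to the fastest tier with room available. The function name assign_tiers, the per-tier slot counts, and the use of a simple access count as the measure of "hotness" are assumptions made for this example only; an actual system may use a different placement policy.

```python
from typing import Dict


def assign_tiers(access_counts: Dict[str, int],
                 fast_slots: int,
                 medium_slots: int) -> Dict[str, str]:
    """Map each data unit to "fast", "medium", or "slow" by access count.

    access_counts: hypothetical mapping of data unit name -> number of accesses.
    fast_slots / medium_slots: assumed number of units each tier can hold;
    the slow tier is assumed to hold everything that remains.
    """
    # Rank data units from most to least frequently accessed.
    ranked = sorted(access_counts, key=lambda name: access_counts[name],
                    reverse=True)

    placement: Dict[str, str] = {}
    for rank, name in enumerate(ranked):
        if rank < fast_slots:
            placement[name] = "fast"      # hottest data, e.g. SSDs
        elif rank < fast_slots + medium_slots:
            placement[name] = "medium"    # warm data, e.g. fast HDDs
        else:
            placement[name] = "slow"      # cold data, e.g. slower HDDs
    return placement
```

For instance, given four data units and room for one unit on the fast tier and two on the medium tier, the most frequently accessed unit would be placed on the fast tier, the next two on the medium tier, and the remaining unit on the slow tier.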