As the value and use of information continue to increase, individuals and businesses seek additional ways to process and store information. One option available to users is information handling systems. An information handling system generally processes, compiles, stores, and/or communicates information or data for business, personal, or other purposes, thereby allowing users to take advantage of the value of the information. Because technology and information handling needs and requirements vary between different users or applications, information handling systems may also vary regarding what information is handled, how the information is handled, how much information is processed, stored, or communicated, and how quickly and efficiently the information may be processed, stored, or communicated. The variations in information handling systems allow for information handling systems to be general or configured for a specific user or specific use such as financial transaction processing, airline reservations, enterprise data storage, or global communications. In addition, information handling systems may include a variety of hardware and software components that may be configured to process, store, and communicate information and may include one or more computer systems, data storage systems, and networking systems.
In this regard, RAID, an acronym for Redundant Array of Independent Disks, is a technology that provides increased storage functions and increased reliability through redundancy, and as such may be beneficially employed in information handling systems. Redundancy in a RAID device may be achieved by combining multiple disk drive components, which may include one or more disks of different type, size, or classification, into a logical unit, where data is distributed across the drives in one of several ways called “RAID levels.” The data distribution determines the RAID type, e.g., RAID 0, RAID 5, RAID 10, etc.
RAID includes data storage schemes that can divide and replicate data among multiple physical disk drives. The physical disks are said to be in a RAID array, which is addressed by the operating system as one single disk. Many different schemes or architectures of RAID devices are known to those having ordinary skill in the art. Each different architecture or scheme may provide a different balance among various goals to be achieved in storing data, which include, but are not limited to, increased data reliability and increased input/output (hereinafter “I/O”) performance.
Storage systems may use disk drives comprising shingled magnetic recording (SMR) disks. As known in the art, SMR disks may record data using overlapping write tracks. Physical characteristics of the disk heads may result in the write head being larger than the read head, so that writing one track may partially overwrite the adjacent track. This overlapping nature of writes often requires that a zone be written from beginning to end to avoid destroying the data on surrounding tracks.
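By way of illustration only, the sequential-write constraint described above may be sketched as a toy model in Python. The `Zone` class, its method names, and the byte-level accounting are hypothetical and are not taken from any particular disk or interface; actual SMR disks may hide or expose this behavior differently depending on the variety of SMR disk.

```python
class Zone:
    """Toy model of an SMR zone that only accepts writes at its write pointer.

    Hypothetical illustration: names and accounting are assumptions.
    """

    def __init__(self, size):
        self.size = size            # zone capacity in bytes
        self.data = bytearray()     # len(self.data) acts as the write pointer
        self.bytes_written = 0      # total bytes physically written so far

    def append(self, payload):
        # Sequential write: data may only be added at the write pointer.
        if len(self.data) + len(payload) > self.size:
            raise ValueError("zone full")
        self.data += payload
        self.bytes_written += len(payload)

    def update(self, offset, payload):
        # In-place update: the zone is read, modified in memory, reset,
        # and then re-written from beginning to end.
        if offset < 0 or offset + len(payload) > len(self.data):
            raise ValueError("update must fall within written data")
        contents = bytearray(self.data)
        contents[offset:offset + len(payload)] = payload
        self.data = bytearray()     # reset the write pointer
        self.append(bytes(contents))


zone = Zone(size=16)
zone.append(b"abcdefgh")            # 8 bytes written sequentially
zone.update(2, b"XY")               # 2 bytes changed, 8 bytes re-written
print(zone.data, zone.bytes_written)
```

In this sketch, changing two bytes causes all eight previously written bytes to be written again, which is the effect that motivates controlling the write pattern.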
From a disk management point of view, random access writes on SMR disks perform poorly or are not supported at all, depending on the variety of SMR disk. I/O initiators must therefore control the write pattern to maximize performance for both writes and reads.
SMR disks may incur write amplification when only a portion of the data within a zone is written. Unmodified data within the zone must be moved and re-written in order for the zone to be written. Such write amplification consumes resources of a disk, the storage connection, and/or the memory bandwidth of a disk controller. Ideally, write amplification is confined within a disk. Write amplification at the connection (e.g., Serial Attached Small Computer System Interface connection) and controller level may limit overall system performance when new data replaces old data in zones. For example, for RAID 6, write amplification sizes may approach a zone size (e.g., 256 MB) times the number of disks in a RAID stripe (e.g., 10 for a 10-disk RAID 6 stripe). Thus, updating a small amount of data may require the movement of over 2 GB of data to update each data disk in a stripe plus any parity data. Accordingly, it may be desirable to isolate write amplification within a system to allow for scalability and provide for the best overall system performance.
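The arithmetic behind the RAID 6 example above may be worked through as follows. This is a minimal sketch only: the function name is hypothetical, the 4 KB update size is an assumed value chosen for illustration, and the calculation assumes the worst case in which one full zone is re-written on every disk counted in the stripe.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB


def stripe_write_amplification(zone_size_bytes, disks_in_stripe, updated_bytes):
    """Return (bytes moved, amplification factor) for a worst-case update.

    Assumes one full zone is re-written on each disk in the stripe.
    """
    moved = zone_size_bytes * disks_in_stripe
    return moved, moved / updated_bytes


# Values from the example in the text: 256 MB zones, 10 disks in the stripe.
# The 4 KB update size is an assumption for illustration.
moved, factor = stripe_write_amplification(256 * MIB, 10, 4096)
print(moved / GIB)   # 2.5
print(factor)        # 655360.0
```

Under these assumptions, a 4 KB update moves 2.5 GB of data, consistent with the figure of over 2 GB given above.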
Existing approaches for zone alignments in a storage system do not take into account SMR zones or the effect of write amplification. Stripes under existing approaches, and therefore page alignments under existing approaches, may spread out across a wide set of disks to maximize parallelism for concurrent requests. Sequential page numbers are likely to use completely different sets of disks. Also, for any number of disks supported in a RAID configuration, pages that are sequential on one disk may not be sequential on the other disks in the stripe. Such alignment makes it difficult to determine adjacent pages on SMR disks. RAID stripe alignments must allow for a simple determination of adjacent pages within a zone.
Existing disk management approaches may allocate pages assuming that all pages may be equally written at any time. In other words, such approaches may assume full random write access to any page at any time. However, SMR disks may require grouped or sequential writes to maximize performance capabilities. To use SMR disks effectively, therefore, a disk management system must control page allocations, and writes must occur in a coordinated manner.