Conventional applications requiring reliable data storage typically use commonly available RAID (redundant array of independent disks) levels to protect against data loss due to media or disk failures. Although conventional RAID technology has proven to be useful, it would be desirable to present additional improvements. With a marked rise in the quantity of stored data and no commensurate improvement in disk reliability, support for a variety of RAID codes is becoming essential to containing data management costs. However, rolling out new RAID codes is both challenging and cost-prohibitive, since new RAID codes require significant development, testing, and tuning efforts.
Until recently, protecting customer data from loss due to media and/or device failures meant storing the data using one of a basic set of RAID codes representing various levels of data protection and storage performance. To handle the higher performance and reliability needs of customers, storage vendors deployed additional codes (e.g., RAID 51) as variations on that basic set. These additional codes were primarily offered as a result of juggling the inherent risk-reward trade-off from a software engineering standpoint, as opposed to improving storage efficiency or performance. These additional codes can be composed by reusing (e.g., hierarchically) the basic RAID set, thus minimizing the amount of additional source code introduced. Consequently, product-marketing needs were satisfied with a low testing expense.
Only a few RAID codes were supported in traditional RAID implementations (firmware) for a variety of reasons: firmware complexity, software maintainability, and field upgrade difficulty. Firmware complexity, as measured by the number of paths that need to be tested, grows with every supported RAID code. Increased complexity increases development and test costs. The firmware becomes a collection of special cases, making it hard to perform path length optimizations. In addition, performance tuning becomes difficult, if not prohibitively expensive.
From a software maintainability standpoint, a collection of “if . . . then . . . else . . .” code blocks makes the firmware harder to read and therefore more prone to bugs. Each rollout of a RAID code requires field upgrades. Upgrading firmware and drivers is not a task that a storage administrator relishes, given the propensity of upgrades to trigger other problems.
Since deploying firmware changes is painful, there is a general mindset to avoid it at all costs. However, recent trends in storage technology and customer focus are forcing a re-evaluation of this status quo because of a need for supporting a variety of RAID codes in a system, growth in reference data, a growing popularity of modular systems, and a growing use of low cost serial disks to build high performance systems.
No single RAID code satisfies all aspects of data storage. In a world where the volume of customer data is rising for reasons ranging from business transformation to regulatory requirements, it is becoming increasingly important to store data at levels of reliability, performance, and efficiency that are proportional to the “business” value of the data. Information life-cycle management (ILM) is becoming crucial to cost containment. Supporting a variety of RAID codes thus becomes integral to effective information life-cycle management.
The additive nature of reference data in organizations implies that the same storage manages a wide variety of scales of data sets, from gigabytes to petabytes, at the same reliability level. Using a single RAID code in this application is challenging. These data sets typically span many disks, indicating a relatively high probability of simultaneous failures. Furthermore, while disk capacities are increasing, the hard error rate (HER) due to media deterioration, channel errors, etc., has remained relatively constant. Combined, these trends raise the frequency of data loss events as data sets grow unless RAID codes with a higher fault tolerance are used.
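The effect of a constant hard error rate on growing data sets can be illustrated with a short calculation. The following is a minimal sketch that assumes independent bit errors; the rate of one unreadable bit per 1e14 bits read is an illustrative assumption, not a figure taken from this description, and the function name is hypothetical.

```python
import math

def p_any_hard_error(bytes_read: float, her_per_bit: float) -> float:
    """Probability of at least one hard error while reading `bytes_read` bytes,
    assuming each bit fails independently with probability `her_per_bit`."""
    bits = bytes_read * 8
    # 1 - (1 - p)^n, evaluated stably for very small p via log1p/expm1.
    return -math.expm1(bits * math.log1p(-her_per_bit))

# Assumed, illustrative hard error rate (not taken from the text).
ASSUMED_HER = 1e-14

for label, size_bytes in [("1 GB", 1e9), ("1 TB", 1e12), ("1 PB", 1e15)]:
    print(f"{label}: {p_any_hard_error(size_bytes, ASSUMED_HER):.6f}")
```

Under these assumptions, the probability of encountering at least one hard error rises sharply with the amount of data read, which is the trend described above.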
The use of modular (or brick) systems to build storage systems that scale capacity, reliability and performance simultaneously is a trend that is gaining popularity. Many of these systems embrace deferred maintenance to reduce the cost of managing such systems. In a “fail in place” service strategy, software isolates failed components and rebuilds stored data on surviving bricks. As long as there is sufficient spare capacity, replacement of failed components can be deferred to bi-annual or multi-year cycles.
A further trend is the growing use of low cost serial ATA (SATA) disks to build high performance systems. SATA disks have hard error rates that are 10 times higher than those of comparable SCSI disks while costing about 35-55% less. Consequently, providing various data reliability levels using cheaper but less reliable disks requires a greater variety of RAID codes.
One conventional approach, RAIDframe, focuses on providing firmware environments that permit rapid prototyping and evaluation of redundant disk array architectures. Although this approach has proven to be useful, it would be desirable to present additional improvements. This approach modularizes the basic functions that differentiate RAID architectures: mapping, encoding, and caching. This modularization allows each aspect to be modified independently, creating new designs. In RAIDframe, array operations are modeled as directed acyclic graphs (DAGs) that specify the architectural dependencies (and execution) between primitives. RAIDframe provides no simplification or automation of error handling. Furthermore, RAIDframe has no ability to automatically tune performance.
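The idea of modeling an array operation as a directed acyclic graph of primitives can be sketched as follows. This is an illustration only and does not reproduce RAIDframe's actual interfaces; the primitive names and the choice of a RAID 5 small write are assumptions made for the example.

```python
from graphlib import TopologicalSorter

# Hypothetical primitives for a RAID 5 small write.
# Each key maps to the set of primitives that must complete before it.
small_write_dag = {
    "read_old_data": set(),
    "read_old_parity": set(),
    "xor_new_parity": {"read_old_data", "read_old_parity"},
    "write_new_data": {"read_old_data"},
    "write_new_parity": {"xor_new_parity"},
}

def execution_order(dag):
    """Return one ordering of primitives that respects every dependency."""
    return list(TopologicalSorter(dag).static_order())

order = execution_order(small_write_dag)
print(order)
```

Any ordering returned respects the dependencies, so the parity XOR never runs before both reads complete. Note that nothing in such a graph says what to do when a primitive fails, which is the gap in error handling noted above.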
Other conventional approaches utilize a RAID system-on-a-chip (SOC) product. Although these approaches have proven to be useful, it would be desirable to present additional improvements. For example, one such approach contains an embedded processor, DMA/XOR engines, and host and disk interface logic (FC and SATA). Since the processor is programmable, it is conceivable that this approach may support a variety of RAID codes. However, all error paths are specified as callbacks written by the developer. Further, performance tuning of this approach is only minimally automatic, if at all.
What is therefore needed is a solution to provide a variety of RAID codes without compromising the quality, performance, and maintainability of the firmware. Such a method allows for adding new RAID codes without bloating the firmware. Many RAID codes have been proposed such as, for example, EVENODD, generalized EVENODD, X-code, RDP, and WEAVER. A common trend among these proposals is to focus on XOR-based codes, since they can be efficiently implemented in hardware or software. Non-XOR codes such as LDPC and Reed-Solomon have been developed, but these have not become popular since they offer no special advantage over XOR-based codes.
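The appeal of XOR-based codes is that encoding and reconstruction use the same inexpensive operation. The following minimal sketch shows a single-parity example; the helper name `xor_blocks` and the sample block contents are hypothetical.

```python
from functools import reduce

def xor_blocks(blocks):
    """Bytewise XOR of a list of equal-length byte blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three hypothetical data blocks and their parity block.
data = [b"\x01\x02", b"\x10\x20", b"\xaa\xbb"]
parity = xor_blocks(data)

# Suppose data[1] is lost: XOR the survivors with the parity to rebuild it.
rebuilt = xor_blocks([data[0], data[2], parity])
assert rebuilt == data[1]
```

Because XOR is its own inverse, the function that computes parity also recovers a missing block.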
A solution is required to automatically handle any RAID code related error. An example of such an error path is, on a read, encountering a read error due to a failed sector or disk. Since error handling makes up a large fraction of firmware, the solution should unify fault-free and fault-ridden cases into common code paths.
The solution is further required to simplify nested error paths inherent in firmware. An example of a nested error path is when, in the process of reconstructing a lost block due to a previous failure, a new sector or disk failure is discovered. If the RAID code encounters such nested error paths, the solution is required to automatically determine how to reconstruct the lost block.
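One way to picture a common code path for fault-free reads, failed reads, and nested failures is a read routine that falls back to any parity equation covering the requested block and applies the same routine to the blocks that equation needs. The sketch below is a naive recursive search written for illustration under stated assumptions; it is not the method of any particular system. The `Stripe` class, the `Unrecoverable` exception, and the 2x2 row-and-column parity layout are all hypothetical.

```python
from functools import reduce

def xor_blocks(blocks):
    """Bytewise XOR of a list of equal-length byte blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

class Unrecoverable(Exception):
    """Raised when no parity equation can rebuild a block."""

class Stripe:
    def __init__(self, blocks, equations):
        self.blocks = dict(blocks)   # block name -> bytes
        self.equations = equations   # each: set of names whose XOR is zero
        self.failed = set()          # names of currently unreadable blocks

    def read(self, name, pending=frozenset()):
        # Fault-free case: the block is directly readable.
        if name not in self.failed:
            return self.blocks[name]
        # Fault case: try every equation that covers this block and does not
        # depend on a block we are already in the middle of rebuilding.
        pending = pending | {name}
        for eq in self.equations:
            if name not in eq or (eq & pending) != {name}:
                continue
            try:
                # Nested failures are handled by the same read() call.
                others = [self.read(n, pending) for n in sorted(eq - {name})]
            except Unrecoverable:
                continue
            return xor_blocks(others)
        raise Unrecoverable(name)

# Hypothetical 2x2 data grid with row parities (p0, p1) and column parities (q0, q1).
d = {"d00": b"\x01", "d01": b"\x02", "d10": b"\x04", "d11": b"\x08"}
d["p0"] = xor_blocks([d["d00"], d["d01"]])
d["p1"] = xor_blocks([d["d10"], d["d11"]])
d["q0"] = xor_blocks([d["d00"], d["d10"]])
d["q1"] = xor_blocks([d["d01"], d["d11"]])
equations = [{"d00", "d01", "p0"}, {"d10", "d11", "p1"},
             {"d00", "d10", "q0"}, {"d01", "d11", "q1"}]

stripe = Stripe(d, equations)
stripe.failed = {"d00", "d01"}       # second failure found while rebuilding d00
assert stripe.read("d00") == d["d00"]
```

In this example, rebuilding `d00` through its row equation runs into the failed `d01`, which is itself rebuilt through its column equation before the row equation completes. The caller issues an ordinary read in every case.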
Furthermore, the solution is required to automatically tune performance for every I/O operation performed by leveraging dynamic state such as currently cached pages.
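As a simple illustration of tuning based on cached pages, a reconstruction could prefer the parity equation that requires the fewest disk reads given what is already in cache. The sketch below assumes a deliberately crude cost model (one unit per uncached block); the function name, the block names, and the cost model are hypothetical.

```python
def cheapest_plan(target, equations, cached):
    """Among the equations covering `target`, return the set of other blocks
    to XOR together that needs the fewest reads from disk, or None."""
    candidates = [eq - {target} for eq in equations if target in eq]
    if not candidates:
        return None
    # Assumed cost model: each block not already cached costs one disk read.
    return min(candidates, key=lambda blocks: len(blocks - cached))

# Hypothetical layout: d0 is covered by a row equation and a diagonal equation.
equations = [{"d0", "d1", "d2", "p"}, {"d0", "d3", "d4", "q"}]
cached = {"d3", "d4"}

plan = cheapest_plan("d0", equations, cached)
assert plan == {"d3", "d4", "q"}   # one disk read (q) instead of three
```

With an empty cache both equations cost the same in this example, so the choice depends entirely on the dynamic cache state at the time of the I/O.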
What is therefore needed is a system, a service, a computer program product, and an associated method for providing a generic RAID engine and optimizer. Such a system should work for any XOR-based erasure (RAID) code and under any combination of sector or disk failures. The need for such a solution has heretofore remained unsatisfied.