RAID groups are logical representations of disk arrays, created by binding individual physical disks together. A RAID group represents a logically contiguous address space distributed across a set of physical disks. Each physical disk is subdivided into pieces that are used to spread the address space of the RAID group across the group (along with parity information, if applicable to the RAID level). The physically contiguous pieces of the physical disks that are joined together to create the logically contiguous address space of the RAID group are called stripes.
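The way a logically contiguous address space can be spread across pieces of several disks may be sketched as follows. This is a minimal illustration only, assuming a simple round-robin layout with no parity; the function name `locate_block` and the parameters `num_disks` and `blocks_per_piece` are hypothetical and do not come from any particular RAID implementation.

```python
from typing import Tuple


def locate_block(logical_block: int, num_disks: int, blocks_per_piece: int) -> Tuple[int, int]:
    """Map a RAID-group logical block to (disk index, block offset on that disk).

    Assumes a plain round-robin layout without parity (an illustrative choice).
    """
    piece_index = logical_block // blocks_per_piece      # which piece of the address space
    offset_in_piece = logical_block % blocks_per_piece   # position inside that piece
    disk = piece_index % num_disks                       # pieces rotate across the disks
    row = piece_index // num_disks                       # full rows of pieces that precede it
    return disk, row * blocks_per_piece + offset_in_piece
```

With four disks and eight blocks per piece, logical blocks 0 through 7 fall on the first disk, blocks 8 through 15 on the second, and so on, wrapping back to the first disk at block 32.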
Applications access and store data incrementally through logical storage array partitions known as logical units (LUNs). LUNs are exported from a RAID array for use at the application level. In conventional systems, LUNs always map to physically provisioned, contiguous storage space. This physical provisioning results from the fact that traditional LUN mapping technologies bind LUNs from RAID groups using static mapping. Under static mapping, a LUN is defined by a start position in a RAID group and extends contiguously for its size from that position in the RAID group's address space. Static mapping therefore yields a 1:1 mapping of logical blocks to physical blocks, beginning at some start point in the RAID group's address space on the array.
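Static mapping as described above reduces to a single addition. The following sketch assumes block-granular addressing; the class name `StaticLun` and its fields are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StaticLun:
    """A LUN bound statically to a contiguous range of a RAID group (illustrative)."""

    start: int  # first RAID-group block owned by the LUN
    size: int   # length of the LUN in blocks

    def to_group_block(self, lun_block: int) -> int:
        """Translate a LUN-relative block number to a RAID-group block number."""
        if not 0 <= lun_block < self.size:
            raise IndexError("block lies outside the LUN")
        # 1:1 mapping: the only translation is a fixed offset from the start position.
        return self.start + lun_block
```

Because the translation is a fixed offset, the mapping needs no per-block metadata, but it also leaves no room to place or share individual blocks independently.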
Because this mapping was simple, it was viewed as the most efficient way to represent a logical unit in any system from the point of view of raw input/output (I/O) performance. The persistent definition of the logical unit as a contiguously provisioned unit made it manageable for storage and retrieval, but imposed limits on the scalability of data storage systems.
However, while the persistent nature of a LUN defined in this manner has been manageable for storage and retrieval, it has become inefficient with the increased usage of array replication features. Array replication features may be characterized by an environment in which different versions of data must be maintained and moved around a system quickly and transparently. Control of the placement of data has therefore become more important. The ability to represent different versions of a LUN quickly and efficiently is also becoming more important, due to customer needs for quick data recovery after system failures.
Storage space is a growing concern. As systems increase in size and complexity, storage requirements increase in proportion to the information that must be archived for replication, storage, and retrieval operations. Snapshots and clones are often used in storage systems to identify changes in the storage contents over time. When snapshots are used for point-in-time copies, LUNs are referenced by the snapshot. When data changes are requested for data regions associated with the LUN, the snapshot tracks them by allocating new storage extents to store the original data referenced by the LUN. The original data is copied to the newly allocated storage extents, and the requested data change is written to the storage extents referenced by the LUN. When clones are used, LUNs are copied in their entirety to new storage space for archival and restoration purposes. The storage requirements associated with conventional bulk data archival typically limit the number of concurrent restore points, due to limitations in available disk space.
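The snapshot write path described above can be sketched as follows. This is a simplified in-memory model, assuming fixed-size extents held in a Python list; the class names `Lun` and `Snapshot` and their methods are hypothetical and stand in for whatever structures a real array would use.

```python
class Snapshot:
    """Point-in-time view that saves original data only for extents that change."""

    def __init__(self, lun):
        self.lun = lun
        self.saved = {}  # extent index -> original data, allocated on first change

    def preserve(self, index, original):
        # Only the first change to an extent needs the original preserved.
        if index not in self.saved:
            self.saved[index] = original

    def read(self, index):
        # Unchanged extents are still read through the LUN the snapshot references.
        return self.saved.get(index, self.lun.extents[index])


class Lun:
    """Illustrative LUN made of fixed-size extents."""

    def __init__(self, num_extents):
        self.extents = [b""] * num_extents
        self.snapshots = []

    def create_snapshot(self):
        snap = Snapshot(self)
        self.snapshots.append(snap)
        return snap

    def write(self, index, data):
        # Copy the original data aside for each snapshot, then apply the change in place.
        for snap in self.snapshots:
            snap.preserve(index, self.extents[index])
        self.extents[index] = data
```

In this model a snapshot consumes extra space only for extents that have changed since it was taken, whereas a clone would copy every extent up front.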
As an example of both temporal and storage limitations in conventional systems, conventional LUN architectures do not provide a mechanism for sharing data segments on the array. As described above, clones are complete copies of LUNs and must be copied in their entirety before they can be used. Because data sharing is not allowed, storage limitations, as described above, constrain the number of restore points that may be economically maintained in conventional systems. Further, in order to have a valid restore point, the entire LUN must be copied. Any change in data stored in a storage extent within a LUN will result in loss of a restore point for any data changed prior to completion of the cloning operation. Accordingly, both temporal and storage limitations exist in conventional systems due to a lack of data sharing capabilities.
As another example of both temporal and storage limitations for conventional LUN operations, conventional systems use bitmaps to track changed storage locations. Given the increasing size of certain systems, memory is consumed in proportion to the size of the bitmaps that are required to track changes. This increasing memory usage imposes a need to use paging schemes for change tracking bitmaps, which results in increased latency during paging operations. Accordingly, for operations related to system information processing for replication, storage, and retrieval of snapshots, both temporal and storage limitations result.
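The proportionality between tracked capacity and bitmap memory can be made concrete with a short calculation. The sketch below assumes one bit per tracked region; the function name `bitmap_bytes` and the example sizes are illustrative assumptions, not figures from any particular system.

```python
def bitmap_bytes(lun_bytes: int, region_bytes: int) -> int:
    """Memory needed for a change-tracking bitmap with one bit per region."""
    regions = -(-lun_bytes // region_bytes)  # ceiling division: partial regions need a bit
    return -(-regions // 8)                  # eight bits per byte, rounded up


KIB, MIB, TIB = 2**10, 2**20, 2**40

# Hypothetical example: a 1 TiB LUN tracked at 64 KiB granularity.
one_tib = bitmap_bytes(1 * TIB, 64 * KIB)
```

Under these assumed sizes the bitmap occupies 2 MiB for each tracked copy of a 1 TiB LUN, and the figure doubles whenever the LUN size doubles or the tracking granularity halves.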
Temporal limitations also exist in conventional systems for clone and snapshot archival. In order to preserve the original structure of the LUN as of the point in time at which a snapshot is taken, blocks are copied on the first write reference (copy on first write) to a special save area reserved to hold these “historical” blocks of data. This copy involves a read/write cycle that causes significant performance disruption just after the point-in-time copy is created against an actively changing production LUN. This disruption may continue until most of the copying is completed, which can sometimes take hours. In an array environment where snapshots are constantly being created, the performance impacts of conventional systems become significant.
Another source of temporal limitation for conventional systems relates to an operation known as silvering. Silvering is a process of copying data from one mirror image to another to get them into synchronization. Conventional clones are also required to use completely new storage for clone creation. Clones often have to be silvered before they can be used.
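Silvering as described above amounts to a full block-by-block copy from one mirror image to another. The following is a minimal sketch, assuming both images are held as equally sized in-memory buffers; the function name `silver` and the `block_size` parameter are hypothetical.

```python
def silver(source: bytearray, target: bytearray, block_size: int) -> int:
    """Copy every block of source into target; return the number of blocks copied."""
    if len(source) != len(target):
        raise ValueError("mirror images must be the same size")
    copied = 0
    for start in range(0, len(source), block_size):
        end = start + block_size
        # Every block is copied, because a newly created clone shares nothing with its source.
        target[start:end] = source[start:end]
        copied += 1
    return copied
```

The work grows with the full size of the image rather than with the amount of changed data, which is why the target cannot be used until the whole pass completes.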
Accordingly, in light of these difficulties associated with conventional RAID array LUN replication, storage, and retrieval, there exists a need for improved methods, systems, and computer program products for mapped logical unit (MLU) replication, storage, and retrieval in a RAID environment.