In a large scale computing system, failures frequently happen, and checkpointing may be used to improve reliability for a system. Computing systems may employ checkpointing to insert fault tolerance into the system. Computing systems may employ local checkpointing schemes and global checkpointing schemes. A Redundant Array of Independent Disks (RAID) may also be used to insert fault tolerance into a system. A RAID may be used to increase storage reliability through redundancy by combining disk drive components into a logical unit where drives in an array may be interdependent. A. RAID may include computer data storage schemes that can divide and/or replicate data among multiple disk drives.