Many applications, including video surveillance applications, generate significant amounts of data which may be stored on hard disk drives in computer systems. For larger scale installations, hardware fault tolerance may be built into the system so that recording is not interrupted by a hardware failure.
To protect against disk drive failures, one conventional method is to keep a copy of the data on two separate disk drives. In the event of a failure of one disk drive, the data may be recovered from the other drive. In another conventional method, it may be more cost effective to keep parity information for the data distributed among a group of disk drives. Parity information generally refers to a smaller amount of data representing a larger data set. In the event of a failure of one disk drive, the parity information, together with the data on the remaining drives, may be used to reconstruct the lost data. Similarly, to protect against the failure of a whole enclosure of drives, it may be more cost effective in another conventional method to distribute parity information among drives grouped in multiple enclosures rather than restricting parity information to a group of drives housed in a single enclosure.
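As an illustration of the parity approach described above, the following is a minimal sketch that assumes byte-wise XOR parity across equally sized blocks, which is one common parity scheme and is not necessarily the scheme used in any particular system. The block contents and the helper name `xor_blocks` are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equally sized byte blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Hypothetical data blocks, one stored on each of three drives.
data = [b"cam1", b"cam2", b"cam3"]

# The parity block is the same size as one data block,
# regardless of how many data blocks it protects.
parity = xor_blocks(data)

# Simulate the loss of the second drive: XOR of the surviving
# data blocks and the parity block yields the missing block.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

print(rebuilt == data[1])
```

In this sketch a single parity block protects three data blocks, which suggests why parity may be more cost effective than keeping a full second copy of the data.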
Traditional redundant data storage schemes or architectures built on commodity hardware, such as Redundant Array of Independent Disks or RAID, implement non-distributed RAID; that is, each node can apply RAID only to the physical disks it directly owns. Alternatively, distributed RAID may be used, where each node participates in RAID so as to distribute parity information across the disks of many nodes. Unfortunately, in a distributed RAID system all RAID operations become subject to network traffic and therefore may undesirably affect the performance of the system. For example, a disk failure is handled by rebuilding over the network, which may require on the order of 10*2 TB of data movement. Moving this much data over a network may be time consuming and may adversely affect system performance.
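To give a rough sense of the scale involved, the following sketch estimates the transfer time for 10*2 TB (20 TB) of rebuild traffic. The link speeds, the use of decimal terabytes, and the function name `rebuild_hours` are assumptions made for illustration only; real rebuild times also depend on disk throughput and competing traffic.

```python
def rebuild_hours(data_tb, link_gbps, efficiency=1.0):
    """Estimate hours needed to move data_tb terabytes over a network link.

    Assumes decimal units (1 TB = 1e12 bytes, 1 Gb/s = 1e9 bits/s).
    'efficiency' is the assumed fraction of the link available for the rebuild.
    """
    bits = data_tb * 1e12 * 8
    seconds = bits / (link_gbps * 1e9 * efficiency)
    return seconds / 3600


# 20 TB over an assumed dedicated 1 Gb/s link: roughly 44.4 hours.
hours_1g = rebuild_hours(10 * 2, 1)

# The same data over an assumed dedicated 10 Gb/s link: roughly 4.4 hours.
hours_10g = rebuild_hours(10 * 2, 10)

print(f"{hours_1g:.1f} h at 1 Gb/s, {hours_10g:.1f} h at 10 Gb/s")
```

Even under these idealized assumptions the rebuild occupies the link for hours, during which normal recording traffic must share the same network.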
To protect against failures of disk access system components, one conventional method is to split the compute and storage components of the system into separate physical machines, and to provide redundant data paths connecting the compute nodes to the storage nodes. This method utilizes dual-port disk drives and dual disk controllers in the storage enclosure. Accordingly, the disk drives have redundant lanes and interconnects, and multiple controllers are provided. Unfortunately, redundant data-path systems may not be cost effective for applications such as video surveillance.