Data storage systems are an integral part of today's enterprise, Internet and service provider infrastructure solutions. In general, there is an underlying requirement to reliably store vast quantities of information and to be able to rapidly access and deliver this stored information at high speeds under conditions of simultaneous demand from many users.
The increasing capacity of hard disk drive technology over the years has satisfied the requirement for cost-effective storage of information. However, the mechanical nature of the spinning magnetic platters and actuating arms of the hard disk has limited the bandwidth of writing and reading information. Also, the mechanical nature of hard drives makes them more failure prone, requiring additional techniques for reliable storage.
RAID (Redundant Arrays of Inexpensive Disks) is a common storage technology used in the industry to overcome the bandwidth and reliability limitations of single disk drives. As illustrated in FIG. 1, multiple disk drives are arrayed with information stored in a striped fashion, where each disk is given just a portion of a file. In this manner, a file can be written to or read from the disk array in parallel, so that the read/write bandwidth is equal to the aggregate bandwidth of the disks. The bandwidth of the RAID thus increases proportionally with the number of disks in the array: two disks in a RAID can have twice the bandwidth of a single disk, three disks three times the bandwidth, and so on. Moreover, parity information can be included in the stripe to allow information to be retrieved even under conditions of disk failure, thus accommodating the requirement of reliable information storage. RAID technology is well known in the industry, described in U.S. Pat. No. 4,092,732, “System for Recovering Data Stored in Failed Memory Unit,” expanded in Patterson et al., “A Case for Redundant Arrays of Inexpensive Disks (RAID),” SIGMOD Conference 1988, pp. 109-116, and later surveyed by Chen et al., “RAID: High-Performance, Reliable Secondary Storage,” ACM Computing Surveys, June 1994, pp. 145-185.
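By way of illustration only, the striping-with-parity idea described above can be sketched in a few lines of Python. The sketch assumes a single XOR parity block per stripe and zero-padding of the last block; the function names (`xor_blocks`, `make_stripe`, `rebuild`) are hypothetical and are not taken from the cited references.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a sequence of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_stripe(data, n_data_disks):
    """Split data into n_data_disks equal blocks and append one parity block."""
    block_len = -(-len(data) // n_data_disks)  # ceiling division
    padded = data.ljust(block_len * n_data_disks, b"\0")  # zero-pad the tail
    blocks = [padded[i * block_len:(i + 1) * block_len]
              for i in range(n_data_disks)]
    return blocks + [xor_blocks(blocks)]


def rebuild(stripe, failed):
    """Recover the block of one failed disk by XOR-ing the surviving blocks."""
    survivors = [block for i, block in enumerate(stripe) if i != failed]
    return xor_blocks(survivors)


# Four data disks plus one parity disk; disk 2 is assumed to have failed.
stripe = make_stripe(b"streaming video payload", 4)
recovered = rebuild(stripe, failed=2)
```

Because each disk holds only one block of the stripe, the blocks can be read in parallel, and any single missing block (data or parity) can be regenerated from the others.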
RAID technology is particularly well suited for and commonly used in video streaming storage systems where continuous multimedia content (e.g., video, audio, data) is streamed from storage to television set top boxes (STBs), personal computers, mobile phones and other multimedia devices at a rate compatible with the continuous and uninterrupted display of the content to the user. Streamed multimedia content, likely encoded in a compressed format such as ITU Recommendation H.262 (MPEG-2) or H.264 (MPEG-4 Advanced Video Coding), is stored in a striped fashion with parity in a RAID system. Bandwidth is increased by incorporating more disks in the array and storage capacity is increased by increasing the number of disks in the array and/or increasing the capacity of each disk.
However, there are limits to how large one can make a RAID. While increasing the number of disks increases the bandwidth of the storage system, it also decreases the reliability, as the mean time to failure of a 10-disk array is one-tenth that of a single disk. Incorporating parity in the stripe helps alleviate reliability problems; however, with large arrays one has to consider protecting against multiple disk failures. Simple parity schemes provide protection against single disk failures, and using Reed-Solomon coding techniques one can protect against multiple disk failures (see Plank, “A Tutorial on Reed-Solomon Coding for Fault-Tolerance in RAID-like Systems,” Software, Practice & Experience, September, 1997, pp. 995-1012). Practical constraints typically limit RAID storage systems to fewer than twenty-four disks.
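The reliability scaling noted above can be illustrated with a short calculation. The sketch assumes that disks fail independently at a constant rate and that the array has no redundancy, so the array fails when its first disk fails; the 500,000-hour disk figure is a hypothetical value chosen only for the example.

```python
def array_mttf_hours(disk_mttf_hours, n_disks):
    """Mean time to the first disk failure in an array of n_disks.

    Assumes independent disks with a constant failure rate, in which case
    the failure rates add and the mean time to failure divides by n_disks.
    """
    if n_disks < 1:
        raise ValueError("n_disks must be at least 1")
    return disk_mttf_hours / n_disks


DISK_MTTF_HOURS = 500_000  # hypothetical figure, not taken from the text

for n in (1, 10, 24):
    print(n, "disks:", round(array_mttf_hours(DISK_MTTF_HOURS, n)), "hours")
```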
By way of example, a High Definition (HD) MPEG-2 video stream can consume as much as 19 Mb/s. The read access bandwidth of a single disk can support approximately twenty such HD MPEG-2 streams, so that a RAID storage system with twenty-four disks can support access for four hundred and eighty unique streams, not accounting for disk redundancy with parity or other error correcting coding techniques. A telco video hub office or cable headend providing service to 200,000 homes, each home with three HD television sets, would need the equivalent of one hundred and twenty-five such 24-disk RAID storage systems to satisfy the streaming bandwidth requirements with only ten percent of the served televisions concurrently receiving video on demand programming. Newer video coding algorithms like MPEG-4 AVC can reduce the required stream bandwidth by more than one-half, but this is still not sufficient to satisfy the streaming requirements of a telco video hub office or cable headend with a single RAID storage system.
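The arithmetic behind the figures above can be reproduced as follows. All input values are taken from the example in the text; the variable names are illustrative only, and integer arithmetic is used so that the results are exact.

```python
STREAM_MBPS = 19          # HD MPEG-2 stream rate, Mb/s
STREAMS_PER_DISK = 20     # approximate streams one disk can serve
DISKS_PER_RAID = 24
HOMES = 200_000
TVS_PER_HOME = 3
CONCURRENT_PERCENT = 10   # share of televisions watching on demand

disk_read_mbps = STREAM_MBPS * STREAMS_PER_DISK        # implied per-disk rate
streams_per_raid = STREAMS_PER_DISK * DISKS_PER_RAID   # ignoring parity disks
concurrent_streams = HOMES * TVS_PER_HOME * CONCURRENT_PERCENT // 100

# Ceiling division: a partially used RAID system still has to be deployed.
raid_systems = -(-concurrent_streams // streams_per_raid)

print(disk_read_mbps, streams_per_raid, concurrent_streams, raid_systems)
```

With these inputs the script reports 380 Mb/s per disk, 480 streams per RAID system, 60,000 concurrent streams, and 125 RAID systems, matching the figures stated above.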
FIG. 2 illustrates an exemplary system that increases the RAID size, and consequently the overall available bandwidth of the storage system, without reducing its reliability, by employing a two-dimensional disk array with parity checking along both dimensions. Content blocks are striped horizontally at the disk controller level and then striped vertically from each disk controller to the disks under its control. Parity is computed vertically to protect against disk failures and computed horizontally to protect against controller failures. The content is effectively striped across all disks in the storage system to provide an effective bandwidth for the aggregate of served streams equal to the sum of the bandwidths of all disks. The storage system reads from each and every disk in the storage system in order to check the parity along both dimensions and correct any errors before streaming the content. Each disk controller reads a block from each disk under its control and checks and corrects errors in the vertical direction, and the storage system controller then checks for errors along the horizontal direction, after which the content can be streamed. This results in a large memory buffer for each stream, equal to the stripe size of the entire two-dimensional disk array. There is also increased latency before streams can be delivered, as each stream must contend for access to the storage system before retrieving a complete data stripe.
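A minimal sketch of the two-dimensional parity arrangement described above is given below. It assumes simple XOR parity in both dimensions, one parity disk per controller, and one parity controller; the grid layout and function names are hypothetical and are not drawn from FIG. 2.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a sequence of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def build_2d_stripe(grid):
    """grid[c][d] is the data block on disk d of controller c.

    Appends a parity disk to every controller (vertical parity), then
    appends a parity controller computed across controllers (horizontal
    parity).
    """
    with_disk_parity = [disks + [xor_blocks(disks)] for disks in grid]
    parity_controller = [xor_blocks(row) for row in zip(*with_disk_parity)]
    return with_disk_parity + [parity_controller]


def rebuild_disk(stripe, c, d):
    """Vertical repair: XOR the surviving disks of controller c."""
    return xor_blocks([b for i, b in enumerate(stripe[c]) if i != d])


def rebuild_controller(stripe, c):
    """Horizontal repair: XOR same-position disks of the other controllers."""
    others = [ctrl for i, ctrl in enumerate(stripe) if i != c]
    return [xor_blocks(row) for row in zip(*others)]


# Three controllers with two data disks each (4-byte blocks for brevity).
grid = [[b"AAAA", b"BBBB"], [b"CCCC", b"DDDD"], [b"EEEE", b"FFFF"]]
stripe = build_2d_stripe(grid)
buffer_bytes = sum(len(block) for ctrl in stripe for block in ctrl)
```

Note that `buffer_bytes` covers every block in the array, which reflects the per-stream buffer cost mentioned above: the full two-dimensional stripe must be held before parity can be checked in both directions.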
The streaming bandwidth of a RAID storage system can be increased without resorting to large disk arrays, and their associated drawbacks, through the use of a Dynamic Random Access Memory (DRAM) cache. Here, highly popular content is cached in DRAM from the RAID storage system and simultaneously streamed at high bandwidth to multiple users, often limited only by the capacity of the network interface ports of the storage system. DRAM, however, is expensive and consumes significant power, and as a consequence there are practical limits to the amount of content that can be stored in DRAM. A two-hour HD movie encoded at 19 Mb/s using MPEG-2 consumes 17 Gbytes of storage, whereas off-the-shelf server technology places a practical limit of 64 Gbytes on DRAM storage, not quite enough for four movies. Again, MPEG-4 AVC can more than double this, but even eight or nine movies are hardly sufficient storage given the diversity of tastes and interests of viewers. Special-purpose DRAM streaming storage systems, accommodating one or more Terabytes of DRAM content storage, provide a larger cache, but one that is still relatively small compared to desired content library sizes and that comes at the expense of much greater cost and power consumption.
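The DRAM sizing figures above follow from a simple calculation, sketched here. The inputs are those of the example in the text; decimal units (1 Gbyte = 10^9 bytes) are an assumption, as are the variable names.

```python
STREAM_MBPS = 19            # MPEG-2 HD stream rate, Mb/s
MOVIE_SECONDS = 2 * 3600    # two-hour movie
DRAM_GBYTES = 64            # off-the-shelf server DRAM limit from the text

# bits per second * seconds / 8 bits per byte / 1e9 bytes per Gbyte
movie_gbytes = STREAM_MBPS * 1e6 * MOVIE_SECONDS / 8 / 1e9
movies_in_dram = DRAM_GBYTES / movie_gbytes

print(f"{movie_gbytes:.1f} Gbytes per movie, "
      f"{movies_in_dram:.2f} movies fit in DRAM")
```

The result is about 17.1 Gbytes per movie, so 64 Gbytes holds fewer than four complete movies, consistent with the statement above.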
To increase video streaming capacity beyond that of a single RAID storage system, video servers are often deployed using clustering techniques. Here, multiple streaming servers, each with its own RAID, are used to serve a group of users whose streaming bandwidth requirement exceeds that of a single streaming server. Each streaming server can be provisioned with identical content files to ensure that any user assigned to a streaming server in the cluster can gain access to the desired content. Alternatively, different content can be allocated to different streaming servers, with users assigned dynamically to streaming servers after content selection is made. In that case, a larger content library can be offered to users owing to the larger storage afforded by multiple RAID storage systems, each with different content files. However, popular content may need to be stored on multiple streaming servers, and content files may need to be moved from server to server to facilitate load balancing of user requests for content. Since the process of replicating content files on servers and moving content files among servers for load balancing is not instantaneous, there is an inherent inefficiency in assigning content and users to servers, and in reacting to sudden changes in the popularity of and demand for certain content.
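The inefficiency of assigning users after content selection can be illustrated with a toy model. The sketch below is an assumption made for illustration, not a description of any particular clustered server: each server is a dictionary with hypothetical `titles`, `load`, and `capacity` fields, and a request goes to the least-loaded server that holds the requested title.

```python
def assign(servers, title):
    """Assign a request to the least-loaded server holding the title.

    Returns the server name, or None when every server that stores the
    title is already at capacity.
    """
    candidates = [s for s in servers
                  if title in s["titles"] and s["load"] < s["capacity"]]
    if not candidates:
        return None
    chosen = min(candidates, key=lambda s: s["load"])
    chosen["load"] += 1
    return chosen["name"]


# Two servers with partly different libraries and two stream slots each.
servers = [
    {"name": "A", "titles": {"movie1", "movie2"}, "load": 0, "capacity": 2},
    {"name": "B", "titles": {"movie1", "movie3"}, "load": 0, "capacity": 2},
]
results = [assign(servers, t) for t in ("movie1", "movie1", "movie2", "movie2")]
```

In this run the second request for `movie2` is refused even though server B still has a free slot, because B does not hold that title; serving the request would first require replicating or moving the file, which is the delay the paragraph above refers to.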