In today's information intensive environment, there are many businesses and other institutions that need to store huge amounts of digital data. These include entities such as large corporations that store internal company information to be shared by thousands of networked employees; online merchants that store information on millions of products; and libraries and educational institutions with extensive literature collections. A more recent need for the use of large-scale data storage systems is in the broadcast television programming market. Such businesses are undergoing a transition, from the older analog techniques for creating, editing and transmitting television programs, to an all-digital approach. Not only is the content (such as a commercial) itself stored in the form of a digital video file, but editing and sequencing of programs and commercials, in preparation for transmission, are also digitally processed using powerful computer systems. Other types of digital content that can be stored in a data storage system include seismic data for earthquake prediction, and satellite imaging data for mapping.
A powerful data storage system referred to as a media server is offered by Omneon Video Networks of Sunnyvale, Calif. (the assignee of this patent application). The media server is composed of a number of software components that are running on a network of server machines. The server machines have mass storage devices such as rotating magnetic disk drives that store the data. The server accepts requests to create, write or read a file, and manages the process of transferring data into one or more disk drives, or delivering requested read data from them. The server keeps track of which file is stored in which drives. Requests to access a file, i.e. create, write, or read, are typically received from what is referred to as a client application program that may be running on a client machine connected to the server network. For example, the application program may be a video editing application running on a workstation of a television studio, that needs a particular video clip (stored as a digital video file in the system).
Video data is voluminous, even with compression in the form of, for example, Motion Picture Experts Group (MPEG) formats. Accordingly, data storage systems for such environments are designed to provide a storage capacity of tens of terabytes or greater. Also, high-speed data communication links are used to connect the server machines of the network, and in some cases to connect with certain client machines as well, to provide a shared total bandwidth of one hundred Gb/second and greater, for accessing the system. The storage system is also able to service accesses by multiple clients simultaneously.
To help reduce the overall cost of the storage system, a distributed architecture is used. Hundreds of smaller, relatively low cost, high volume manufactured disk drives (currently each unit has a capacity of one hundred or more Gbytes) may be networked together, to reach the much larger total storage capacity. However, this distribution of storage capacity also increases the chances of a failure occurring in the system that will prevent a successful access. Such failures can happen in a variety of different places, including not just in the system hardware (e.g., a cable, a connector, a fan, a power supply, or a disk drive unit), but also in software such as a bug in a particular client application program. Storage systems have implemented redundancy in the form of a redundant array of inexpensive disks (RAID), so as to service a given access (e.g., make the requested data available), despite a disk failure that would have otherwise thwarted that access. The systems also allow for rebuilding the content of a failed disk drive, into a replacement drive.
A storage system should also be scalable, to easily expand to handle larger data storage requirements as well as an increasing client load, without having to make complicated hardware ands software replacements.