The invention relates to a storage system such as a video on demand system, comprising a plurality of storage units for storing blocks of data such as video or audio data, and a reader for retrieving the blocks from the plurality of storage units.
A video on demand system is known from the article "Randomized Data Allocation for Real-time Disk I/O", Compcon '96 (41st IEEE Computer Society International Conference, Santa Clara, Feb. 25-28, 1996). In the known system a data unit is striped over G disks, forming a group of G blocks, each occupying one full track on a disk. The group of G disks is randomly selected from the set of D available disks. One of the G blocks is a parity block. The random distribution of the groups results in balancing of the load on the disks. The redundant parity block enables further load balancing, which is achieved by reading only those (G-1) blocks of a group of G blocks that minimize the load. If the unread block is a data block, the parity block is used to reconstruct the unread data block.
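The reconstruction step of the known system can be sketched as follows. This is a minimal illustration that assumes the parity block is the bytewise XOR of the (G-1) data blocks, which the text above does not state explicitly; the function names `xor_blocks`, `make_group` and `reconstruct` are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def make_group(data_blocks):
    """Append one parity block (assumed XOR parity) to G-1 data blocks."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def reconstruct(group, skipped):
    """Recover the block at index `skipped` from the other G-1 blocks."""
    return xor_blocks([b for i, b in enumerate(group) if i != skipped])


# Hypothetical data unit striped into three data blocks plus one parity block (G = 4).
data = [b"abcd", b"efgh", b"ijkl"]
group = make_group(data)

# Leave the second data block unread and rebuild it from the remaining three blocks.
recovered = reconstruct(group, 1)
print(recovered == data[1])
```

With XOR parity any single block of the group, data or parity, can be left unread, which is what gives the reader the freedom to choose which (G-1) blocks to retrieve.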
It is an object of the invention to provide a storage system as specified in the preamble, which more efficiently uses disk bandwidth. To this end, the invention provides a storage system as defined in claim 1.
In a storage unit that uses movement of a read head with respect to a storage medium for retrieval of data, an example of such a storage unit being a disk drive, the expected retrieval time of a piece of data comprises two components. The first component is the switch time required to position the head at the right location, e.g. on the right track and at the right position within that track. The second component is the time necessary for the actual read process. If we assume that large amounts of contiguous data are read at a time, the expected retrieval time of a piece of data is largely determined by this second component. As in most storage units the time needed for actually reading an arbitrary piece of data depends on the location of that piece of data within the storage space, the expected retrieval time varies across the storage space. In a disk drive, for example, the storage capacity of the outer tracks is higher than that of the inner tracks. In current disk drives, the track capacity does not vary continuously but stepwise, so that a finite number of storage zones can be identified, typically on the order of ten to twenty, in each of which all tracks have equal storage capacity. In view of the constant angular velocity of a disk drive, the expected retrieval time of a block stored in a storage zone close to the spindle of the disk drive is higher than that of a block stored in a storage zone near the outer perimeter of the disk. In accordance with the invention, a selection procedure is used that takes into account the expected retrieval time, which enables more efficient use of the available disk bandwidth by preferably selecting blocks with low expected retrieval time for retrieval. This is especially advantageous for video servers in which disk bandwidth, rather than storage capacity, is the system bottleneck. More efficient use of disk bandwidth also results in less stringent buffer requirements and better response times.
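The dependence of the expected retrieval time on the storage zone can be illustrated with a simple model. The sketch below assumes that exactly one track is read per revolution at constant angular velocity and that the switch time is the same for every block; the zone capacities, rotation time, switch time and block size are hypothetical values chosen for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Zone:
    name: str
    track_capacity: int  # bytes per track (hypothetical value)


def read_time(block_size, zone, rotation_time):
    """Time for the actual read: tracks needed times time per revolution."""
    return block_size / zone.track_capacity * rotation_time


def expected_retrieval_time(block_size, zone, rotation_time, switch_time):
    """Switch time (first component) plus read time (second component)."""
    return switch_time + read_time(block_size, zone, rotation_time)


ROTATION_TIME = 0.010  # seconds per revolution (assumed)
SWITCH_TIME = 0.015    # seconds to position the head (assumed)
BLOCK_SIZE = 1_000_000  # bytes (assumed)

inner = Zone("inner", 250_000)
outer = Zone("outer", 500_000)

t_inner = expected_retrieval_time(BLOCK_SIZE, inner, ROTATION_TIME, SWITCH_TIME)
t_outer = expected_retrieval_time(BLOCK_SIZE, outer, ROTATION_TIME, SWITCH_TIME)
print(f"inner zone: {t_inner:.3f} s, outer zone: {t_outer:.3f} s")
```

In this model the read time is inversely proportional to the track capacity, so a block in a zone with twice the track capacity is read in half the time, while the switch time is unaffected.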
Preferably, the selection procedure further takes into account a load distribution in the plurality of storage units. In this way, the freedom that is offered by the redundant information is used for achieving both load balancing and improvement of bandwidth utilization at the same time. A further advantage is that the minimal overall retrieval time of a data unit fluctuates less from one data unit to another. The selection procedure could be invoked on each requested data unit individually, in which case the selection procedure tries to determine the best selection in accordance with some criteria without regard to other data units. Alternatively, the selection procedure could be invoked on a batch of data units pending to be served. In the latter way, the freedom can be utilized even better for load balancing and/or improving the bandwidth utilization. For example, a selection that is optimal for a particular data unit might cause problems for subsequent data units from a load balancing point of view. In such a case, it is better for the selection procedure to address a number of data units at a time.
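One possible selection procedure of this kind, invoked on each data unit individually, is sketched below. It is a greedy heuristic given under stated assumptions and not necessarily the procedure of the claims: one redundant block per data unit, loads expressed as accumulated expected retrieval time in integer milliseconds, and the hypothetical function name `select_blocks`.

```python
def select_blocks(unit, load):
    """Choose which block of a data unit to leave unread.

    unit: list of (disk, expected_time_ms) pairs, one per block of the unit.
    load: dict mapping disk -> expected retrieval time already assigned (ms).
    Returns the index of the skipped block and updates `load` in place.
    """
    def cost(skip):
        # Load per disk if every block except `skip` were read.
        trial = dict(load)
        for i, (disk, t) in enumerate(unit):
            if i != skip:
                trial[disk] = trial.get(disk, 0) + t
        # Primary criterion: busiest disk; secondary: total bandwidth used.
        return (max(trial.values()), sum(trial.values()))

    skip = min(range(len(unit)), key=cost)
    for i, (disk, t) in enumerate(unit):
        if i != skip:
            load[disk] = load.get(disk, 0) + t
    return skip


# Hypothetical example: four disks, two data units of three blocks each.
load = {0: 0, 1: 0, 2: 0, 3: 0}
first = select_blocks([(0, 40), (1, 20), (2, 30)], load)
second = select_blocks([(1, 20), (2, 20), (3, 50)], load)
print(first, second, load)
```

In the first unit the slow block on disk 0 is skipped. In the second unit all three choices give the same maximum load, so the tie is broken in favour of the lowest total retrieval time and the 50 ms block on disk 3 is skipped. A batch-oriented variant would evaluate the skipped blocks of several data units jointly rather than one unit at a time.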
At an arbitrary moment in time, each storage unit has an empty or a non-empty queue of read requests for respective blocks that are waiting to be retrieved from that storage unit. As the load distribution to be taken into account, the selection procedure could take the queue lengths as a starting point, i.e. the selection procedure could have as its objective to level the queue lengths. Alternatively, the selection procedure disregards the actual queue lengths and merely tries to distribute the load of the currently requested data unit or batch of data units equally over the storage units. Both methods are essentially the same when the selection procedure is invoked after the blocks of the previous data unit or batch of data units have been retrieved, since at that moment the queue lengths are zero.
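The two ways of accounting for load can be compared with a small sketch. It is a greedy heuristic that counts blocks only and ignores retrieval times; the function name `assign` and the example batch are hypothetical. Passing the current queue lengths corresponds to the first method, passing nothing to the second.

```python
def assign(batch, queue_len=None):
    """Decide, per data unit, which block to leave unread.

    batch: list of data units, each a list of the disks holding its blocks.
    queue_len: dict disk -> pending read requests, or None to disregard queues.
    Returns (list of skipped block indices, resulting block count per disk).
    """
    count = dict(queue_len) if queue_len else {}
    skipped = []
    for disks in batch:
        # Leave unread the block that resides on the currently busiest disk.
        skip = max(range(len(disks)), key=lambda i: count.get(disks[i], 0))
        skipped.append(skip)
        for i, disk in enumerate(disks):
            if i != skip:
                count[disk] = count.get(disk, 0) + 1
    return skipped, count


batch = [[0, 1, 2], [0, 1, 3]]

# First method: start from the actual queue lengths (disk 0 is busy).
print(assign(batch, {0: 3, 1: 0, 2: 0, 3: 0}))

# Second method: disregard the queues and balance only the new batch.
print(assign(batch))
```

When all queues are empty, both calls start from the same state and therefore produce the same selection, in line with the observation above.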
Further advantageous aspects of the invention are described in the dependent claims.
The invention also relates to a storage system comprising a plurality of storage units and a loader for storing data units in the storage system. The invention further relates to a method of storing and a method of retrieving data units in a system comprising a plurality of storage units.
These and other aspects of the invention will be apparent from and elucidated with reference to the embodiments described hereinafter.