Media server design is an important aspect of the ongoing effort to provide widespread availability of interactive multimedia services such as video-on-demand (VOD), teleshopping, digital video broadcasting and distance learning. A media server retrieves digital multimedia bit streams from storage devices and delivers the streams to clients at an appropriate delivery rate. The multimedia bit streams represent video, audio and other types of data, and each stream may be delivered subject to quality-of-service (QOS) constraints such as average bit rate or maximum delay jitter. An important performance criterion for a media server and its corresponding multimedia delivery system is the maximum number of multimedia streams, and thus the number of clients, that can be simultaneously supported.
The data retrieval method used in a given media server is a significant limitation on the stream delivery capability of that server. Disk-based storage devices are generally subject to mechanical delays due to disk seeking time, disk start-up and settle time, and disk rotation speed. The effect of certain of these mechanical delays on server stream delivery capability may be alleviated by providing parallel access to multiple storage devices as in the conventional Redundant Array of Inexpensive Disks (RAID) system. However, the performance of these and other parallel access retrieval techniques remains limited by the use of random access file placement as well as unavoidable overhead such as disk seeking time and Small Computer System Interface (SCSI) bus contention. For additional details on the performance of parallel access techniques, see C. S. Wu et al., "Performance Evaluation of a Disk Array for Video-on-Demand Systems," Conference Proceedings, 10th International Conference on Information Networking (ICOIN-10), Kyung-Ju, Korea, 1996, pp. 351-356, which is incorporated by reference herein. The use of sequential access file placement in place of random access placement can improve retrieved data throughput by a factor of two or more for relatively small numbers of clients. However, when a large number of clients simultaneously access the server, sequential access techniques perform substantially like random access techniques due to interleaving of multiple retrieval requests.
Reductions in disk seeking overhead may be achieved by careful design of disk scheduling, request admission control and media data placement. A number of disk scheduling algorithms are known in the art, including first-come first-served, shortest seek time first and scan or elevator algorithms. Although useful in reducing seek time, reducing rotational latency, increasing throughput and providing fair access, these scheduling algorithms generally cannot satisfy real-time video performance constraints. For example, an exemplary scan or elevator algorithm scans the disk head back and forth to limit backtracking and significantly reduce seek time. However, video files generally exhibit little locality between disk access locations for different streams, and the probability of two or more clients accessing the same media stream within a short time period is very difficult to predict. Scan algorithms therefore generally cannot satisfy real-time constraints.
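The difference in head travel among the scheduling algorithms named above can be illustrated with a minimal sketch. The function names, the track numbers and the starting head position are hypothetical and chosen only for illustration, and the scan variant shown reverses direction at the last pending request rather than at the physical end of the disk.

```python
def fcfs_order(requests):
    """First-come first-served: keep the arrival order."""
    return list(requests)


def sstf_order(requests, head):
    """Shortest seek time first: always pick the pending track nearest the head."""
    pending, order = list(requests), []
    while pending:
        nearest = min(pending, key=lambda track: abs(track - head))
        pending.remove(nearest)
        order.append(nearest)
        head = nearest
    return order


def scan_order(requests, head):
    """Scan (elevator): sweep upward from the head, then reverse and sweep down."""
    upward = sorted(t for t in requests if t >= head)
    downward = sorted((t for t in requests if t < head), reverse=True)
    return upward + downward


def total_seek(order, head):
    """Total head travel, in tracks, for a given service order."""
    distance = 0
    for track in order:
        distance += abs(track - head)
        head = track
    return distance
```

For the hypothetical queue [98, 183, 37, 122, 14, 124, 65, 67] with the head at track 53, total_seek gives 640 tracks for first-come first-served, 236 for shortest seek time first and 299 for the scan ordering. None of the three orderings takes a deadline into account, which is why seek savings alone do not address real-time constraints.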
A known disk scheduling algorithm suitable for real-time scheduling of tasks with deadlines is the earliest deadline first (EDF) algorithm. However, application of this algorithm to video files is likely to introduce excessive seek time and rotational latency and yield poor server resource utilization. A scan-EDF algorithm has been proposed for real-time applications in A. L. Narasimha Reddy and J. C. Wyllie, "I/O Issues in a Multimedia System," IEEE Computer, pp. 69-74, March 1994, which is incorporated by reference herein. The scan-EDF algorithm services requests with earliest deadlines first. When many requests have the same or similar deadlines, as is characteristic of video-on-demand and other interactive multimedia systems, the corresponding data blocks are essentially accessed using the above-described scan algorithm only.
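The relationship between EDF and scan-EDF may be sketched as follows. This is a simplified illustration rather than the algorithm of the cited reference: the Request record and its fields are assumptions, and ties in deadline are broken by a single upward sweep over track numbers.

```python
from collections import namedtuple

# Hypothetical request record: stream name, deadline and disk track.
Request = namedtuple("Request", ["stream", "deadline", "track"])


def edf_order(requests):
    """Pure EDF: order by deadline only; ties keep arrival order (sorted is stable)."""
    return sorted(requests, key=lambda r: r.deadline)


def scan_edf_order(requests):
    """Scan-EDF sketch: earliest deadline first, equal deadlines served in track order."""
    return sorted(requests, key=lambda r: (r.deadline, r.track))
```

When every request carries the same deadline, scan_edf_order sorts purely by track, which corresponds to the observation above that the data blocks are then accessed essentially in scan order.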
FIGS. 1A, 1B and 1C illustrate the relative round length and maximum time between retrievals for a number of different disk scheduling algorithms. A "round" refers to a series of retrievals during which a block sequence of arbitrary length is retrieved for each currently-requested media stream. The use of rounds is particularly important in meeting the real-time constraints of interactive multimedia systems. FIG. 1A illustrates exemplary round lengths and maximum time between same-stream retrievals for successive rounds of a round-robin algorithm. The round-robin algorithm services the requested streams in a fixed order in every round. The first stream retrieved in round i is therefore also the first stream retrieved in round i+1. The maximum latency between retrieval times of successive requests of a given stream is bounded by the period of a single round, as shown in FIG. 1A. A server utilizing a round-robin algorithm therefore need only include enough buffer space to satisfy data consumption for one round. A major drawback of the round-robin scheduling algorithm is that it does not exploit the relative positions of media blocks being retrieved during a given round, and an unusual sequence in each round can yield an unpredictably long seek latency. This underscores the importance of considering data placement algorithms in conjunction with the disk scheduling algorithms.
FIG. 1B illustrates successive rounds and maximum time between same-stream retrievals for a scan or elevator algorithm. The scan algorithm generally services a stream in an order which depends on the relative placement of the media blocks being retrieved. A given stream may therefore be serviced at the beginning of one round and at the end of the next round, resulting in a maximum retrieval latency of nearly two full round periods as illustrated. Compared to the round-robin algorithm, the rounds are shorter in the scan algorithm but the latency between successive same-stream retrievals may be longer. In addition, a server utilizing the scan algorithm generally requires enough buffer space to satisfy data consumption for nearly two rounds.
FIG. 1C illustrates successive rounds and maximum time between same-stream retrievals for a sub-grouping algorithm. The sub-grouping algorithm partitions each round into groups, and assigns each requested stream to one of the groups. The groups are then serviced in a fixed order in each round, while a scan algorithm is used to service the particular streams within each group. If all of the requested streams are assigned to the same group, the sub-grouping algorithm reduces to the scan algorithm. If each requested stream is assigned to its own unique group, the sub-grouping algorithm reduces to the round-robin algorithm. The maximum time between retrievals thus lies between a single round period and a single round period plus nearly one additional group time. For example, a given stream assigned to Group 1 may be serviced in accordance with the scan algorithm at the beginning of Group 1 in round i but at the end of Group 1 in round i+1. The sub-grouping algorithm generally produces a round length greater than that of the scan algorithm but less than that of the round-robin algorithm, and a maximum retrieval latency less than that of the scan algorithm but slightly greater than that of the round-robin algorithm.
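The way the sub-grouping algorithm spans the range between the scan and round-robin algorithms may be sketched as follows. The function names and data layout are hypothetical, and the worst-case gap calculation assumes, purely for illustration, that every retrieval occupies one equal-length slot and that the groups are of equal size.

```python
def subgroup_order(positions, groups):
    """Service order for one round.

    positions: {stream: track of the block that stream needs this round}
    groups:    list of lists of streams; groups are served in this fixed
               order, and streams inside a group are served in track order.
    """
    order = []
    for group in groups:
        order.extend(sorted(group, key=lambda stream: positions[stream]))
    return order


def worst_case_gap_slots(n_streams, n_groups):
    """Worst-case spacing, in slots, between retrievals of the same stream.

    Assumes equal-length slots and n_groups dividing n_streams evenly. A
    stream served first in its group in one round and last in its group in
    the next round waits one full round plus (group size - 1) slots.
    """
    group_size = n_streams // n_groups
    return n_streams + group_size - 1
```

With eight streams, one group gives a worst-case gap of 15 slots (nearly two rounds, as in the scan algorithm), eight groups give 8 slots (one round, as in the round-robin algorithm), and two groups give 11 slots. The sketch holds the slot time constant, so it does not capture the differences in round length among the three algorithms.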
The round-based processing techniques of FIGS. 1A through 1C are designed to allow a media server to meet real-time constraints. In order to prevent "starvation" of a client requesting a particular stream, it may also be desirable for a scheduling algorithm to have the buffer-conserving property described in D. J. Gemmell and J. Han, "Multimedia Network File Servers: Multichannel Delay Sensitive Data Retrieval," ACM Multimedia Systems, pp. 240-252, April 1994, which is incorporated by reference herein. The buffer-conserving property is also referred to as work-ahead-augmenting in D. Anderson, Y. Osawa, and R. Govindan, "A File System for Continuous Media," ACM Trans. on Computer Systems, pp. 311-337, November 1992, which is also incorporated by reference herein. The buffer-conserving property is a sufficient condition for preventing the starvation condition which results when a client requesting a particular stream is not delivered sufficient stream data to satisfy real-time constraints. A scheduling algorithm has the buffer-conserving property if the data retrieval rate is prevented from lagging the consumption rate, such that there is never a net decrease in the amount of buffered data on a round-by-round basis. It may also be necessary to prefetch sufficient stream data to meet the consumption requirements of the longest possible round. Since the round length depends on the number of blocks retrieved for each requested stream, the round length can be minimized if the number of blocks retrieved for a given stream during each round is proportional to the consumption rate of that stream. A non-buffer-conserving scheduling algorithm, which is generally more complex, allows the data retrieval rate to fall behind the consumption rate in one round but compensates for the shortfall in a later round.
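The buffer-conserving condition and the proportional block allocation described above may be sketched as follows. The function names, the block size and the bit rates are assumptions chosen only for illustration, and the round length is taken as a given input rather than derived from the retrieval schedule.

```python
import math


def blocks_per_round(rates_bps, round_seconds, block_bytes):
    """Blocks to retrieve per stream so that one round's consumption is covered.

    The block count for each stream is proportional to its consumption rate,
    rounded up to a whole number of blocks.
    """
    return [
        math.ceil(rate * round_seconds / 8 / block_bytes)
        for rate in rates_bps
    ]


def is_buffer_conserving(retrieved_bytes, consumed_bytes):
    """True if, in every round, the bytes retrieved are at least the bytes consumed."""
    return all(r >= c for r, c in zip(retrieved_bytes, consumed_bytes))
```

For hypothetical streams at 1.5 Mbit/s and 4 Mbit/s, a one-second round and 64 KiB blocks, blocks_per_round returns [3, 8]. A schedule that retrieves less than a stream consumes in any single round fails is_buffer_conserving even if a later round makes up the shortfall, which is the case handled by the non-buffer-conserving algorithms mentioned above.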
Admission control is another important factor in media server design. A media server must determine when new data delivery requests can be accommodated by the system while maintaining desired QOS levels for all requests. In accordance with the above-described disk scheduling algorithms, accepting a new request will increase the length of a service round, and may prevent the server from providing an agreed-upon QOS to a current client. A typical media server may offer three broad QOS classes: (i) deterministic, in which all data delivery deadlines are guaranteed to be met; (ii) statistical, in which deadlines are guaranteed to be met with a certain probability; and (iii) best-effort, in which no guarantees are given for meeting deadlines. For deterministic services, resources may be reserved in worst-case fashion for each requested multimedia stream. The server may also check whether buffering for existing streams is adequate to prevent starvation of any client before admitting another service request and increasing the length of a service round, as described in greater detail in the D. Anderson et al. reference cited above and in H. M. Vin and P. Venkat Rangan, "Designing a Multi-User HDTV Storage Server," IEEE J. Selected Areas in Comm., pp. 153-164, January 1993, which is incorporated by reference herein.
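A worst-case admission test for the deterministic class may be sketched as follows. This is a simplified model and not the procedure of the cited references: it assumes one block per stream per round, a fixed worst-case seek and rotation overhead per retrieval, and a constant disk transfer rate, and all names and parameter values are hypothetical.

```python
def worst_case_round_seconds(n_streams, block_bytes, overhead_s, transfer_bytes_per_s):
    """Round length if every retrieval pays the full worst-case overhead."""
    return n_streams * (overhead_s + block_bytes / transfer_bytes_per_s)


def admit(current_rates_bps, new_rate_bps, block_bytes, overhead_s, transfer_bytes_per_s):
    """Admit the new stream only if no stream would starve in the longer round.

    A stream starves if it consumes more bits during one worst-case round
    than the single block retrieved for it in that round contains.
    """
    rates = list(current_rates_bps) + [new_rate_bps]
    round_s = worst_case_round_seconds(
        len(rates), block_bytes, overhead_s, transfer_bytes_per_s
    )
    return all(rate * round_s <= block_bytes * 8 for rate in rates)
```

With 256 KiB blocks, 25 ms of worst-case overhead per retrieval and a 10 MB/s transfer rate, this sketch admits a tenth 4 Mbit/s stream but rejects an eleventh, because the longer round would outlast the data held in one block.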
For statistical services, the server generally computes the change to round length based on statistical values. The server may also drop media blocks, dynamically vary media resolution levels or use other strategies to resolve a missed deadline. Maximum usage of limited data retrieval resources may be achieved subject to QOS commitments using an algorithm that dynamically calculates real-time requirements and carefully lengthens round duration to admit new requests. One such algorithm is described in greater detail in C. S. Wu et al., "On Scalable Design of an ATM-based Video Server," IEEE International Conference on Communications, 1996 (SUPERCOMM/ICC '96), paper no. 44-1, which is incorporated by reference herein.
Media data placement techniques are another important factor in media server design and are particularly useful for reducing disk seeking overhead. A given media bit stream can be stored contiguously or split up into separate portions which are "scattered" into separate disks and disk zones of a disk-based storage device. Contiguous storage is relatively simple to implement but the stored streams are subject to fragmentation when the end of a disk or disk zone is reached before the full stream is stored. In contrast, scattered placements avoid fragmentation and corresponding copying overheads. The decision as to whether to utilize contiguous storage or scattered storage may be based on an analysis of intrastream seeks. Retrieving a contiguously-stored bit stream requires only one seek to position a disk head at the start of the stream. On the other hand, retrieving several blocks of a scattered bit stream may require a separate seek for each block read. Even when retrieving a relatively small amount of data, it is possible that part of the data might be stored in one block and the rest in the next block, such that an intrastream seek is required.
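The intrastream seek analysis above may be illustrated with a short sketch. The function names are hypothetical, the read length is assumed to be at least one byte, and the scattered case is treated pessimistically by charging one additional seek for every block boundary crossed.

```python
def blocks_spanned(offset, length, block_bytes):
    """Number of blocks touched by a read of `length` bytes starting at `offset`."""
    first = offset // block_bytes
    last = (offset + length - 1) // block_bytes
    return last - first + 1


def intrastream_seeks(offset, length, block_bytes, contiguous):
    """Seeks needed beyond the initial one that positions the head for the read.

    Contiguous placement needs none. Scattered placement may need one for
    each additional block, since successive blocks can be anywhere on disk.
    """
    if contiguous:
        return 0
    return blocks_spanned(offset, length, block_bytes) - 1
```

For example, with 64 KiB blocks, a 10,000-byte read starting at byte offset 60,000 touches two blocks and may therefore require one intrastream seek under scattered placement, even though the amount of data read is much smaller than a block.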
Intrastream seeks can be avoided to some extent in scattered storage by designing the scheduling algorithm such that the amount of data read for a given stream always fits within a single block. This could be provided by, for example, selecting a sufficiently large block size and reading one block of the given stream in each round. However, if more than one block is required to prevent starvation prior to the next round, an intrastream seek may be unavoidable. The effects of intrastream seeks may be alleviated through the use of constrained placement techniques which limit the separation between successive stream blocks. Although such techniques are attractive when the block size must be small, additional complexity is required to ensure that the separation between blocks conforms to the required constraints, and the scheduling algorithm may need to be modified to retrieve all blocks for a given stream before switching to any other stream.
In a situation in which an entire media stream is stored on a single disk of a disk-based storage device, the number of concurrent accesses to that media stream is limited. Scattered storage overcomes this limitation by using techniques such as stream striping and stream interleaving. A stream striping technique utilized in the above-noted RAID system "stripes" a given media stream by separating it into distinct portions, and stores the portions across an array of disks such that parallel access can be achieved. One block of each stream may then be retrieved from each disk in each round, in accordance with the above-described scheduling algorithms. If a multiple-disk set of the storage device is spindle synchronized and operated in a lock-step parallel mode, different physical sectors of each disk can be accessed in parallel as a single large logical sector. Because accesses are performed in parallel, logical sector blocks and physical sector blocks will generally have identical access times. The data retrieval throughput therefore increases as a function of the number of disks involved. In a single disk configuration, data retrieval throughput can be improved by increasing the size of the physical sector blocks. However, the block size cannot be increased in an unlimited manner since this would unduly increase the logical sector block size and consequently lengthen startup delays and enlarge the buffer space requirements for each stream.
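The lock-step arrangement described above, in which one physical sector from each disk forms a single logical sector, may be sketched as follows. The function names and the tiny sector size are hypothetical, and in-memory byte strings stand in for the disks.

```python
def stripe(data, n_disks, sector_bytes):
    """Distribute successive physical sectors of `data` cyclically across the disks."""
    disks = [bytearray() for _ in range(n_disks)]
    for start in range(0, len(data), sector_bytes):
        disk_index = (start // sector_bytes) % n_disks
        disks[disk_index] += data[start:start + sector_bytes]
    return [bytes(d) for d in disks]


def read_logical_sector(disks, index, sector_bytes):
    """Read physical sector `index` from every disk and join them in disk order.

    On spindle-synchronized hardware these reads would proceed in parallel,
    so the logical sector costs about one physical sector access time.
    """
    start = index * sector_bytes
    return b"".join(d[start:start + sector_bytes] for d in disks)
```

In this sketch the logical sector size equals the number of disks multiplied by the physical sector size, which shows why enlarging either factor also enlarges the per-stream buffer needed to hold one logical sector.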
Stream interleaving techniques generally involve interleaving blocks across the disk array for storage. A simple interleaving pattern stores the blocks cyclically across the disk array with successive stream blocks stored on different disks. The disks of the disk array are not spindle synchronized and can therefore operate independently. At least two different stream retrieval methods may be used with the stream interleaving storage technique. One retrieval method is similar to that used with the striping storage described above, where one block is retrieved from each disk in every round. This method ensures a balanced retrieval load but generally requires more buffer space. The other retrieval method retrieves blocks from one of the disks for a given requested stream in each round, such that the stream retrievals are interleaved rather than simultaneous. The retrieval load for each round is balanced across the disks to maximize the throughput. The load can be balanced by interleaving the streams such that all streams have the same round length but each stream considers the round to begin at a different time.
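The cyclic interleaving pattern and the second retrieval method may be sketched as follows. The staggering of streams is modeled here, as an assumption, by giving each stream a different starting disk in the cyclic pattern, and the function names are hypothetical.

```python
def disk_for_block(start_disk, block_index, n_disks):
    """Disk holding a given block when blocks are placed cyclically across the array."""
    return (start_disk + block_index) % n_disks


def round_load(start_disks, round_index, n_disks):
    """Retrievals per disk in one round, each stream reading one block per round."""
    load = [0] * n_disks
    for start in start_disks:
        load[disk_for_block(start, round_index, n_disks)] += 1
    return load
```

With four disks and four streams whose starting disks are 0, 1, 2 and 3, every round places exactly one retrieval on each disk. If all four streams start on disk 0, every round places all four retrievals on a single disk, which illustrates why the stream start points need to be staggered.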
It can be seen from the above description of conventional media placement techniques that contiguous placement techniques limit the number of simultaneously-requested streams and therefore the number of clients in a multimedia delivery system. Although scattering techniques can increase the data retrieval throughput by introducing concurrent access, the seek latency is increased and factors such as load balancing and buffer management introduce additional complexity in the throughput maximization process. Furthermore, known scattering techniques generally need to store a complicated table-based mapping to keep track of the disk zone or zones in which portions of a particular stored data stream are located. Conventional data placement techniques are thus unable to provide a high throughput multimedia delivery system capable of simultaneously servicing a large number of clients.
A number of other prior art techniques are described in U.S. Pat. No. 5,519,435 issued May 21, 1996 to M. H. Anderson, assigned to Micropolis Corp. and entitled "Multi-user, On-Demand Video Storage and Retrieval System Including Video Signature Computation for Preventing Excessive Instantaneous Server Data Rate," U.S. Pat. No. 5,510,905 issued Apr. 23, 1996 to Yitzhak Birk and entitled "Video Storage Server Using Track-Pairing," U.S. Pat. No. 5,517,652 issued May 14, 1996 to Takanori Miyamoto et al., assigned to Hitachi Ltd. and entitled "Multi-media Server for Treating Multi-media Information and Communication System Employing the Multi-media Server," and U.S. Pat. No. 5,473,362 issued Dec. 5, 1995 to R. P. Fitzgerald et al., assigned to Microsoft Corp. and entitled "Video on Demand System Comprising Stripped Data Across Plural Storable Devices With Multiplex Scheduling." These references generally utilize data striping methods to store the multimedia data streams in a manner that facilitates media-on-demand services. However, the resulting improvement in data stream throughput remains substantially limited due to unavoidable mechanical delays in the disk-based storage system. These limitations are described in greater detail in, for example, C. S. Wu et al., "Performance Evaluation of a Disk Array for Video-on-Demand Systems," Conference Proceedings, 10th International Conference on Information Networking (ICOIN-10), Kyung-Ju, Korea, 1996, pp. 351-356, which is incorporated by reference herein.
As is apparent from the above, a need exists for an improved media server which maximizes the number of simultaneously-supported multimedia streams. The media server should support sequential-like parallel retrieval capabilities while maintaining a desired data delivery bandwidth, reducing seek latency and avoiding the complicated table-based mapping and other problems associated with conventional data placement and retrieval techniques.