The proliferation of broadcast television stations and cable operators has provided television viewers with a large and ever-increasing variety of content choices. Traditional delivery systems, however, require temporally fixed programming--that is, all viewers must tune into a particular broadcast at the time it is shown. This model frees the broadcaster from the need to establish separate interactive communication circuits with individual viewers, allowing the audience to increase virtually without limit and without an increase in utilized bandwidth.
For over a decade, attempts have been made to circumvent this programming model and allow viewers to tune into broadcasts at their convenience rather than at a single scheduled time. Ideally, it should be possible for many viewers to obtain access to particular content at arbitrarily different times, notwithstanding the time of day or the number of users simultaneously requesting access. But efforts to realize true "video-on-demand" (VOD) in widespread form, with any subscriber free to request any video program at any time, have yet to bear fruit.
The reason for the slow progress lies in the difficulty of inexpensively delivering content in a manner that scales to a very large number of receivers. The most intuitively simple solution, in which a central service provides separate transmissions of the same program to individual subscribers upon their requests--in effect, re-running a video over an independent communication channel to each viewer--requires duplication of equipment and substantial bandwidth resources. For this reason, much research has been devoted to batching algorithms that aggregate requests from many receivers into a smaller number of common channels. Aggregation is achieved by delaying the stream for one or more receivers by a small amount of time so that it may be merged with subsequent streams.
Aggregation algorithms may be "user-centered" or "data-centered." User-centered aggregation techniques allocate data channels in response to user requests. For example, if two subscribers issue requests for the same video a small time interval apart, then by delaying the playback for the first request, both requests can be satisfied from the same server data stream. In one user-centered approach, called "scheduled multicast," when a server channel becomes available the server selects, based on a scheduling policy, a group of users to which the video is transmitted. For example, in accordance with Dan et al., "Scheduling Policies for an On-Demand Video Server with Batching," Proc. of ACM Multimedia, pages 15-23 (October 1994), the batch with the largest number of pending requests is served first, with the objective of maximizing server throughput.
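The largest-batch-first policy described above can be illustrated with a minimal sketch. The function name `select_batch` and the dictionary-based request queue are hypothetical constructs introduced only for illustration; they are not taken from Dan et al., and ties are broken arbitrarily by dictionary order.

```python
def select_batch(pending):
    """Pick the video with the most waiting users when a channel frees up.

    `pending` maps a video identifier to the list of users waiting for it
    (a hypothetical data structure assumed for this sketch).
    Returns (video_id, users) and removes that batch from the queue,
    or None when nothing is waiting.
    """
    if not pending:
        return None
    # Largest batch first: maximize the number of requests served per channel.
    video = max(pending, key=lambda v: len(pending[v]))
    return video, pending.pop(video)


if __name__ == "__main__":
    queue = {"video_a": ["u1", "u2"], "video_b": ["u3", "u4", "u5"]}
    print(select_batch(queue))  # the three users waiting for video_b share one stream
```

A single transmission then serves every user in the returned batch, which is the aggregation effect the policy is intended to exploit.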
Data-centered techniques allocate transmission channels to pieces of the transmitted content in a predefined manner, relying on the receivers to determine the proper channel from which to receive data. For example, in accordance with the "periodic broadcast" approach, videos are broadcast periodically over a plurality of channels so that a new multicast data stream is started every B minutes (the "batching interval"). In this way, no subscriber can experience a service latency of more than B minutes. Decreasing the batching interval naturally requires a proportionate increase in server bandwidth.
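The proportional trade-off between batching interval and server bandwidth can be made concrete with a short calculation. The sketch below assumes that each multicast stream carries the whole video at its display rate, so that the number of concurrent streams is the video length divided by the batching interval, rounded up; the function name and the sample figures are illustrative only.

```python
import math


def periodic_broadcast_streams(video_minutes, batching_interval):
    """Concurrent streams needed so that a new one starts every
    `batching_interval` minutes for a video of `video_minutes` minutes.

    Assumes each stream plays the complete video at the display rate.
    """
    return math.ceil(video_minutes / batching_interval)


if __name__ == "__main__":
    for interval in (20, 10, 5):
        streams = periodic_broadcast_streams(120, interval)
        print(f"B = {interval:2d} min -> {streams} concurrent streams")
```

Under these assumptions, halving the batching interval for a 120-minute video from 10 minutes to 5 minutes doubles the number of streams from 12 to 24.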
The "pyramid broadcasting" (PB) technique was developed to reduce the service latency without linear bandwidth increases. See Viswanathan et al., "Pyramid Broadcasting for Video on Demand Service," IEEE Multimedia Computing and Networking Conf. 2417:66-77 (1995). In accordance with the PB technique, each video data file is partitioned into K segments of geometrically increasing size with the server transmission capacity evenly divided into K logical channels. Each channel broadcasts an assigned video segment repeatedly, in an infinitely looping fashion. The subscriber's receiver sequentially downloads the various video segments, playing back previously downloaded segments even as new ones are loaded. Thus, playback can commence as soon as the first segment is fully downloaded; and since this segment is the shortest, the period of download before playback can commence is relatively short. Moreover, since the initial segments are small, they can be broadcast more frequently through their channels than the larger, later segments; as a result, the "access latency"--i.e., the delay in awaiting transmission of the beginning of the first segment (when download can begin)--is also relatively short.
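The geometric partitioning used by PB can be sketched as follows. The growth factor `alpha` is a hypothetical free parameter here; in the cited technique it would follow from the available bandwidth, which this sketch does not model.

```python
def pyramid_segments(video_minutes, k, alpha):
    """Split a video into k segments whose lengths grow geometrically.

    Segment i (0-based) has relative weight alpha**i; the weights are
    scaled so that the segment lengths sum to the full video length.
    `alpha` is an assumed growth factor, not a value from the source.
    """
    weights = [alpha ** i for i in range(k)]
    total = sum(weights)
    return [video_minutes * w / total for w in weights]


if __name__ == "__main__":
    segments = pyramid_segments(120, 4, 2.0)
    print([round(s, 2) for s in segments])
```

With four segments and a growth factor of 2, a 120-minute video divides in the ratio 1:2:4:8, so the first segment occupies only one fifteenth of the video, which is why playback can begin after a comparatively short download.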
The exponential nature of this data-fragmentation scheme, however, results in large storage requirements for the receivers. For example, the buffer size is usually greater than 70% of the length of the video. Although this is invisible to the viewer, who can begin viewing the video following the access latency period, the hardware costs can be substantial. For example, devices such as hard disks must be employed for data buffering; and because each video segment is transmitted at high data rates, the disks must be capable of extremely high storage and retrieval rates.
The "skyscraper broadcasting" (SB) approach represents an attempt to ameliorate some of the drawbacks of PB. See Hua et al., "Skyscraper Broadcasting: A New Broadcasting Scheme for Metropolitan Video-on-Demand Systems," Comp. Commun. Rev. ACM SIGCOMM 27(4) (September 1997). In accordance with the SB approach, the server bandwidth of B Mbits/sec is divided into a series of channels, each representing a separate data stream accessible by the receiver; these may be organized as logical channels in which, for example, data is multiplexed and streamed over a single network communication link, or as separate physical channels (e.g., broadcast at different frequencies over a wireless link). Each of M available videos has a display rate of b Mbits/sec, so the number of channels is B/b, each channel capable of transmitting b Mbits/sec. These channels are allocated evenly among the M videos such that there are K = B/(bM) channels for each video. To broadcast a video over its K dedicated channels, each video file is partitioned into K fragments using a data-fragmentation scheme. Each of these fragments is broadcast repeatedly on its dedicated channel (at b Mbits/sec).
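The channel allocation just described reduces to simple arithmetic, sketched below. Rounding down to whole channels is an assumption made for this illustration, as are the function name and the sample figures.

```python
def channels_per_video(server_mbps, display_mbps, num_videos):
    """Return K, the number of channels dedicated to each video.

    The server bandwidth B is divided into B/b channels of b Mbits/sec,
    which are shared evenly among the M videos. Fractions of a channel
    are discarded (an assumption of this sketch).
    """
    total_channels = int(server_mbps // display_mbps)
    return total_channels // num_videos


if __name__ == "__main__":
    # Hypothetical figures: B = 600 Mbits/sec, b = 1.5 Mbits/sec, M = 10 videos.
    print(channels_per_video(600, 1.5, 10))
```

For these sample figures the server supports 400 channels in total, so each of the 10 videos is partitioned into K = 40 fragments.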
Instead of fragmenting the video files according to a geometric series, as in PB, a series is generated by the following recursive function:

\[
f(n) =
\begin{cases}
1 & n = 1 \\
2 & n = 2, 3 \\
2f(n-1) + 1 & n > 3,\ n \bmod 4 = 0 \\
f(n-1) & n > 3,\ n \bmod 4 = 1 \\
2f(n-1) + 2 & n > 3,\ n \bmod 4 = 2 \\
f(n-1) & n > 3,\ n \bmod 4 = 3
\end{cases}
\]

This function is used to assign relative segment lengths, and produces segments whose lengths are such that the receiver need tune into no more than two channels at any time. Its first few terms are f(n)=[1, 2, 2, 5, 5, 12, 12, 25, 25, 52, 52, 105, 105, 212, 212, 425, 425, . . . ]. In order to prevent the segments from becoming too large, a maximum segment size W is defined. That is, if the SB series f(n) would require some segment to be larger than W times the size of the first segment, the size of that segment is restricted to the relative size W. The effect of this restriction is to limit the necessary buffer space, the maximum size of which is ultimately determined by the length of the last segment. (The term "skyscraper broadcasting" refers to the shape the data fragments would form if stacked, with W determining the width of the skyscraper; in PB, the data fragments would form a short, wide pyramid.)
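The series and the effect of the cap W can be reproduced with a short sketch. The recurrence coded here is an assumption: it is the form published for skyscraper broadcasting and it regenerates the terms listed above, but the function name `sb_series` and its signature are illustrative only.

```python
def sb_series(k, w=None):
    """Return the first k relative segment sizes of the SB series.

    The recurrence is assumed (it matches the terms 1, 2, 2, 5, 5, 12, ...).
    If `w` is given, each term is capped at w after the uncapped series
    has been generated, so the cap never feeds back into the recurrence.
    """
    terms = []
    for n in range(1, k + 1):
        if n == 1:
            f = 1
        elif n in (2, 3):
            f = 2
        elif n % 4 == 0:
            f = 2 * terms[-1] + 1
        elif n % 4 == 2:
            f = 2 * terms[-1] + 2
        else:  # n % 4 is 1 or 3: repeat the previous term
            f = terms[-1]
        terms.append(f)
    if w is not None:
        return [min(f, w) for f in terms]
    return terms


if __name__ == "__main__":
    print(sb_series(12))        # uncapped relative sizes
    print(sb_series(12, w=52))  # same series with the width limited to W = 52
```

With W = 52, every segment from the tenth onward has the same relative size, which gives the stacked fragments their flat-sided "skyscraper" outline.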
In operation, each fragment is assigned to one of the K channels, and is continuously streamed in a looping fashion over that channel. To receive the broadcast, a receiver subscribes to each channel in turn, downloading data from a new channel only after beginning to play the contents of the segment downloaded from the previous channel. Due to the irregular way segments increase in accordance with SB, receivers utilize two separate loading routines: an "Odd Loader" that downloads segments corresponding to the odd terms in the SB series, and an "Even Loader" that downloads segments corresponding to the even terms.
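The access latency $D_1$ for SB is defined as the longest period of time a receiver must wait until it can begin loading the first segment:

\[
D_1 = \frac{D}{\sum_{i=1}^{K} \min\bigl(f(i),\, W\bigr)}
\]

where D is the playback duration of the video in minutes.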
The maximum use of receiver bandwidth occurs when the receiver downloads from two channels while playing a previously cached segment. Accordingly, SB receivers require buffer disks capable of supporting I/O at rates of two to three times the display rate b. The storage requirements of an SB system depend on the lengths of the segments, which begin loading before being played back. The worst case results when the maximum possible amount of the final (largest) segment is cached before its playback begins, and the storage requirement is found to be $60 \cdot b \cdot D_1 \cdot (W-1)$ Mbits.
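The latency and storage figures can be estimated numerically with the sketch below. It rests on two assumptions that should be checked against the cited paper: that the access latency equals the video length divided by the sum of the capped series terms (i.e., the playback time of the first segment, in minutes), and that the series follows the recurrence coded in the hypothetical helper `sb_terms`. The sample video length, channel count and display rate are illustrative.

```python
def sb_terms(k):
    """First k uncapped terms of the (assumed) SB series 1, 2, 2, 5, 5, 12, ..."""
    terms = []
    for n in range(1, k + 1):
        if n == 1:
            f = 1
        elif n in (2, 3):
            f = 2
        elif n % 4 == 0:
            f = 2 * terms[-1] + 1
        elif n % 4 == 2:
            f = 2 * terms[-1] + 2
        else:
            f = terms[-1]
        terms.append(f)
    return terms


def sb_access_latency(video_minutes, k, w):
    """Worst-case wait in minutes: length of the first segment (assumed formula)."""
    return video_minutes / sum(min(f, w) for f in sb_terms(k))


def sb_buffer_mbits(display_mbps, d1_minutes, w):
    """Worst-case receiver storage, 60 * b * D1 * (W - 1), in Mbits."""
    return 60 * display_mbps * d1_minutes * (w - 1)


if __name__ == "__main__":
    d1 = sb_access_latency(video_minutes=120, k=10, w=52)
    print(f"access latency: {d1:.3f} min")
    print(f"buffer: {sb_buffer_mbits(1.5, d1, 52):.0f} Mbits")
```

The factor of 60 converts the latency from minutes to seconds, so that multiplying by the display rate in Mbits/sec yields a size in Mbits.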
Although SB represents an improvement over PB, disadvantages remain. In both schemes, segments must be downloaded from beginning to end; that is, before a receiver can begin downloading a segment, it must wait until the segment has looped back to its beginning. Furthermore, the SB series requires that the receiver frequently download data from only a single channel, thereby making inefficient use of the allocated receiver bandwidth. This limitation also restricts the rate at which segments can grow and, as a result, imposes a lower bound on access latency. This is because the size of the first segment--and, hence, the access latency--is determined by the size of later segments.