1. Field of Invention
The invention relates generally to the fields of video and data transmission. In one aspect, the invention relates to the use of a variable or dynamically determined delay function within a session-based video or data streaming application.
2. Description of Related Technology
The provision of session-based services, such as video on-demand (VOD), is well known in the prior art. In a typical configuration, the VOD service makes available to its users a selection of multiple video programs that they can choose from and watch over a network connection with minimum setup delay. At a high level, a typical VOD system consists of one or more VOD servers that pass and/or store the relevant content; one or more network connections that are used for program selection and program delivery; and customer premises equipment (CPE) to receive, decode and present the video on a display unit. The content is distributed to the CPE over, e.g., a Hybrid Fiber Coaxial (HFC) network.
Depending on the type of content made available and rate structure for viewing, a particular VOD service could be called “subscription video-on-demand (SVOD)” that gives customers on-demand access to the content for a flat monthly fee, “free video-on-demand (FVOD)” that gives customers free on-demand access to some content, “movies on-demand” where VOD content consists of movies only, and so forth. Many of these services, although referred to by names different than VOD, still share many of the same basic attributes including storage, network and decoder technologies.
Just as different varieties of VOD service offerings have evolved over time, several different network architectures have also evolved for deploying these services. These architectures range from fully centralized (e.g., VOD servers at a central location) to fully distributed (e.g., multiple copies of content distributed on VOD servers very close to customer premises), as well as various other network architectures there between. Since most cable television networks today consist of optical fiber towards the “core” of the network which are connected to coaxial cable networks towards the “edge”, VOD transmission network architectures also consist of a mixture of optical fiber and coaxial cable portions.
The CPE for VOD often consists of a digital cable set-top box (DSTB) that provides the functions of receiving cable signals by tuning to the appropriate RF channel, processing the received signal and outputting VOD signals for viewing on a display unit. Such a digital set-top box also typically hosts a VOD application that enables user interaction for navigation of the VOD menu and selection of content.
While the architectural details of how video is transported in the core HFC network can be different for each VOD deployment, each generally will have a transition point where the video signals are modulated, upconverted to the appropriate RF channel and sent over the coaxial segment(s) of the network. Depending on the topology of the individual cable plant, this could be performed at a node, hub or a headend. The coaxial cable portion of the network is variously referred to as the “access network” or “edge network” or “last mile network.”
In U.S. cable systems for example, downstream RF channels used for transmission of television programs are 6 MHz wide, and occupy a 6 MHz spectral slot between 54 MHz and 860 MHz. Deployments of VOD services have to share this spectrum with already established analog and digital cable television services. For this reason, the exact RF channel used for VOD service may differ from plant to plant. However, within a given cable plant, all homes that are electrically connected to the same cable feed running through a neighborhood will receive the same downstream signal. For the purpose of managing VOD services, these homes are grouped into logical groups typically called Service Groups. Homes belonging to the same Service Group receive their VOD service on the same set of RF channels.
VOD service is typically offered over a given number (e.g., 4) of RF channels from the available spectrum in cable. Thus, a VOD Service Group consists of homes receiving VOD signals over the same 4 RF channels. Reasons for this grouping include (i) that it lends itself to a desirable “symmetry of two” design of products (e.g. Scientific Atlanta's MQAM), and (ii) a simple mapping from incoming Asynchronous Serial Interface (ASI) payload rate of 213 Mbps to four QAM payload rates.
“Trick Modes”
So-called “trick modes” are well known in the digital cable television networking arts. Trick modes generally comprise VCR-like commands issued by a user via their CPE during VOD session playback. They are generally implemented as one or more data packet structures that must be sent back over at least a portion of the HFC network (i.e., “upstream”) to the content server, where they are acted upon. Such action typically comprises instigation of the requested action in the video stream (e.g., pause, rewind, fast forward, etc.), as well as the issuance of a response packet acknowledging the user's (CPE) request.
Trick mode packets are typically sent and responded to in a relatively short amount of time; however, this time can vary significantly based upon, e.g., network and server load, as well as other factors. Such load (and other factors) may vary somewhat predictably as a function of time or other parameters, yet may also be largely unpredictable under certain circumstances. For example, network load within a given service or geographic area may increase predictably during prime time viewing hours, yet may increase unpredictably due to other unforeseeable events such as the onset of inclement weather in that area, causing more subscribers to stay inside and view OD or similar content.
Prior art trick mode implementations compensate for these variations with a static or constant value that is subtracted from the desired play-back time (NPT), but this value does not always cover the actual delay in a loaded system. For example, consider the case illustrated in FIG. 1 (and Table 1 below) where the user initiates the command at time T=0, and there are A milliseconds (msec.) of user response time, B msec. of CPE processing delay, and C msec. of packet (e.g., LCSP command) generation and queuing delay through the network, commonly referred to as “NetIO” (e.g., packet and queuing delay from a serving network node back to the relevant service point within the network, such as a VOD server or the like). The actual de-queuing, decoding and generation of an appropriate (e.g., LCSP) response at the service point may take another D msec., to which is added another E msec. of processing delays at the serving node, and F msec. of processing time at the CPE upon receipt of the signal. These various delays (A-F) are also added to the delay in the user perceiving the requested change in the transport stream content (P), and propagation delays (Z) within the network, although the propagation delays are generally predictable and constant for a given signal path. Hence, the total time (T) from the user deciding that a command is to be implemented to their perception that the command has been implemented in this example is T=A+B+C+D+E+F+P+Z msec. Since at least a portion of this time (i.e., the C, D and E segments) may vary as a function of network or server loading or yet other factors, the total response time is variable.
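By way of illustration only, the delay budget described above may be sketched as follows. The class name, field names and all numeric values are hypothetical and chosen solely for the example; the values for the C, D, E, P and Z segments in particular are assumptions and are not taken from any measured system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DelayBudget:
    """Delay segments in msec., labeled after the A-F, P and Z terms in the text."""
    a_user: float         # A: user response time
    b_cpe: float          # B: CPE processing delay
    c_tx_queue: float     # C: packet generation and queuing delay ("NetIO")
    d_server: float       # D: de-queue, decode and response generation at the service point
    e_rx_queue: float     # E: processing delay at the serving node
    f_cpe: float          # F: CPE processing on receipt
    p_perception: float   # P: delay in the user perceiving the change
    z_propagation: float  # Z: network propagation delay

    def total(self) -> float:
        """T = A + B + C + D + E + F + P + Z."""
        return (self.a_user + self.b_cpe + self.c_tx_queue + self.d_server
                + self.e_rx_queue + self.f_cpe + self.p_perception
                + self.z_propagation)

    def variable_part(self) -> float:
        """Only the C, D and E segments are described as load-dependent."""
        return self.c_tx_queue + self.d_server + self.e_rx_queue


# Hypothetical lightly loaded and heavily loaded cases (assumed values).
light = DelayBudget(300.0, 0.5, 20.0, 30.0, 20.0, 0.5, 50.0, 5.0)
loaded = DelayBudget(300.0, 0.5, 150.0, 200.0, 150.0, 0.5, 50.0, 5.0)

print(light.total(), light.variable_part())
print(loaded.total(), loaded.variable_part())
```

In this sketch only the C, D and E terms differ between the two cases, yet the total roughly doubles, which is the variability that a single constant correction must absorb.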
TABLE 1
Item   Description                      Typical Value
A      User response time               250 to 333 msec.
B      STB response time                <1 msec.
C      Packet TX queue delay            varies w/network traffic
D      Server RX, process, & TX time    varies w/server load
E      Packet RX queue delay            varies w/network traffic
F      STB processing time              <1 msec.
Using the aforementioned prior art “static” approach, the constant correction value (e.g., 333 to 500 msec. typically) is subtracted from the normal play time (NPT) coordinate, in order to account for delays inherent in the system associated with servicing the trick mode command. As is well known, NPT is a value associated with a temporal content stream or the like which advances in real-time in the “normal” play mode, yet which advances faster than real-time when the stream is fast-forwarded such as via a trick mode function. Similarly, NPT will decrement when the rewind function is invoked, and is fixed when the stream is paused. The prior art approach of a static compensation value at least in theory provides a purposely oversized window within which any fixed and variable delays will fit. Stated differently, the actual (including variable) delays of the system should in theory be smaller in magnitude than the subtracted constant value.
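The NPT behavior and the static correction described above may be sketched as follows, purely as an illustration. The function names, the use of a signed "scale" factor to represent play, fast-forward, rewind and pause, the 60-second intended position, and the two assumed total delays (426 and 856 msec.) are all hypothetical; only the 500 msec. constant falls within the range given in the text.

```python
def advance_npt(npt_ms: float, elapsed_ms: float, scale: float) -> float:
    """Advance NPT over elapsed real time.

    scale is an assumed convention: 1.0 for normal play, greater than 1.0 for
    fast-forward, negative for rewind, and 0.0 for pause.
    """
    return npt_ms + scale * elapsed_ms


def apply_static_correction(server_npt_ms: float, correction_ms: float) -> float:
    """Subtract the constant correction value from the server's NPT coordinate."""
    return server_npt_ms - correction_ms


def residual_error_ms(intended_npt_ms: float, corrected_npt_ms: float) -> float:
    """Positive result: the stream lands past the point the user intended."""
    return corrected_npt_ms - intended_npt_ms


STATIC_CORRECTION_MS = 500.0  # within the 333 to 500 msec. range cited in the text
intended = 60_000.0           # hypothetical NPT at which the user decides to act

# NPT under the different modes over one second of real time.
paused = advance_npt(intended, 1_000.0, 0.0)
rewound = advance_npt(intended, 1_000.0, -2.0)
forwarded = advance_npt(intended, 1_000.0, 4.0)

# Normal play continues while the command is in flight (assumed total delays).
light_error = residual_error_ms(
    intended, apply_static_correction(advance_npt(intended, 426.0, 1.0),
                                      STATIC_CORRECTION_MS))
loaded_error = residual_error_ms(
    intended, apply_static_correction(advance_npt(intended, 856.0, 1.0),
                                      STATIC_CORRECTION_MS))

print(paused, rewound, forwarded)
print(light_error, loaded_error)
```

Under these assumptions the lightly loaded delay fits inside the window, so the corrected position falls slightly before the intended point, whereas the loaded delay exceeds the constant and the stream lands 356 msec. past the intended point.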
However, it is sometimes the case that such delays do not fall within this window (such as during high loading periods), and hence an undesirable and highly perceivable latency in trick mode operation is created. This problem can be unpredictable, and may also manifest itself in the inaccurate execution of the requested command; e.g., a “rewind” command may place the viewer at a point in the content stream different from that which they desire by over-rewinding or under-rewinding the stream. This creates somewhat of a self-perpetuating (and potentially uncontrolled) load excursion within the system, since when a given user perceives that their trick mode command has not been (properly) serviced, they will almost invariably make a second attempt to invoke the same or a compensatory command, such as by pressing the same (rewind) button on their remote or DSTB one or more additional times to get to the desired point in the stream, or alternatively by fast-forwarding where their first command resulted in an “over-rewind” condition that placed them back too far in the stream.
Hence, extra (and unnecessary) trick-mode commands are issued by the user's CPE to correct the “missed” stream segment target, which loads an already heavily-loaded system even more. When multiplied by a potentially large number of OD or other session-based users within a network at any given time, these additional commands may cause significantly heavier network traffic, which further exacerbates the problem. In extreme situations, the serving network apparatus may appear totally unresponsive to the user's repeated commands, or the commands may be executed with such inaccuracy that the user becomes extremely frustrated.
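The self-perpetuating load excursion described above may be illustrated with a deliberately simplified toy model. Everything in this sketch is an assumption made for illustration: the function name, the linear relationship between offered commands and delay, the rule that every missed command provokes one retry in the next round, and all numeric values. It is not a model of any actual network.

```python
def simulate_retries(base_cmds: float, capacity: float, window_ms: float,
                     base_delay_ms: float, rounds: int = 4,
                     retry_factor: float = 1.0):
    """Return a list of (offered_commands, delay_ms, missed) per round.

    Assumed toy rules: delay scales linearly with offered load once the load
    exceeds capacity, and whenever the delay exceeds the static window each
    offered command provokes retry_factor extra commands in the next round.
    """
    offered = float(base_cmds)
    history = []
    for _ in range(rounds):
        delay_ms = base_delay_ms * max(1.0, offered / capacity)
        missed = delay_ms > window_ms
        history.append((offered, delay_ms, missed))
        extra = retry_factor * offered if missed else 0.0
        offered = base_cmds + extra
    return history


# Within capacity: the delay stays inside the window and load stays flat.
light_history = simulate_retries(100, 100, 500.0, 400.0)

# Above capacity: each miss adds retries, so offered load keeps growing.
heavy_history = simulate_retries(150, 100, 500.0, 400.0)

for row in heavy_history:
    print(row)
```

In the second case the offered load climbs every round because each miss generates further commands, which is the uncontrolled excursion the passage describes; a correction that tracks the actual delay would be intended to break this loop at the first step.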
Various other approaches to providing the “trick mode” functionality discussed above are in evidence in the prior art. For example, U.S. Pat. No. 5,477,263 to O'Callaghan, et al. issued Dec. 19, 1995 entitled “Method and apparatus for video on demand with fast forward, reverse and channel pause” discloses a video distribution system, wherein methods and apparatus for channel selection are implemented to reduce the channel-to-channel latencies which might otherwise occur in video decoding systems, such as MPEG-2. In addition, methods and apparatus are implemented for providing fast forward, fast reverse and channel pause functionality when utilizing staggered start times for a particular program source.
U.S. Pat. No. 5,963,202 to Polish issued Oct. 5, 1999 entitled “System and method for distributing and managing digital video information in a video distribution network” discloses a video distribution network system that includes client configuration data, a client video buffer for storing video information, a client video driver coupled to the client video buffer for presenting a portion of the video information on a display device, a current status manager for determining current client status information indicative of the portion of video information presented, a computations engine coupled to the client video buffer and to the current status manager for forwarding a burst of video information to the client video buffer based on the client configuration data and on the client status information, and a video buffer controller coupled to the client video buffer for controlling storage of the burst in the client video buffer.
U.S. Pat. No. 6,020,912 to De Lang issued Feb. 1, 2000 entitled “Video-on-demand system” discloses a video-on-demand system comprising a server station and a user station. The server is adapted to transmit a selected television signal with operating data defining a selected one of various available sets of playback modes (normal, fast forward, slow forward, rewind, pause, etc.) in response to operating signals from the user station indicating the selected set of playback modes. Operating data which define the various available operating signals (the user interface) are fixed in the server and are transmitted by the server to the user station. Downloading of different sets of the user interface at different prices is possible. For example, a television program with commercials may be offered at a higher price if it includes the facility of fast forward during commercials.
U.S. Pat. No. 6,434,748 to Shen, et al. issued Aug. 13, 2002 entitled “Method and apparatus for providing VCR-like “trick mode” functions for viewing distributed video data” discloses an apparatus for providing VCR-like “trick mode” functions, such as pause, fast-forward, and rewind, in a distributed, video-on-demand program environment. Trick modes are supported by locally altering the viewing speed for each user who requests such functions, without affecting the operation of the central data source in any way. Thus, a number of viewers are ostensibly able to enjoy random access to video programming, including virtually continuous trick mode functionality.
U.S. Pat. No. 6,577,809 to Lin, et al. issued Jun. 10, 2003 entitled “User selectable variable trick mode speed” discloses user selection of a particular trick mode, wherein the number of pictures that are displayed can be adjusted to correspond with the selected trick mode speed based on a determined display time. Subsequently, the bandwidth usage can be determined to ensure that the channel capacity between a playback device and a remote decoder has not been exceeded. For forward trick modes, in a case where the bandwidth between the playback device and the remote decoder would be exceeded, B-pictures can be uniformly eliminated throughout the playback segment. Where B-pictures were present and they have been eliminated, they can be replaced by dummy B-pictures. Again, if there is still insufficient bandwidth available between the playback device and remote decoder, then P-pictures can be eliminated from the playback segment and uniformly replaced by dummy P-pictures.
U.S. Pat. No. 6,751,802 to Huizer, et al. issued Jun. 15, 2004 entitled “Method of transmitting and receiving compressed television signals” discloses the transmission of MPEG encoded television signals from a Video-On-Demand server to a receiver via a network. Non-linear playback functions such as ‘pause’ and ‘resume’ require accurate control of the bit stream, taking account of typical network aspects such as network latency and remultiplexing. In order to allow the receiver to resume signal reproduction after a pause, position labels are inserted into the bit stream at positions where the server can resume transmission of the signal after an interruption. Upon a pause request, the decoder initially continues the reproduction until such a position label is detected. The subsequent bits delivered by the network are ignored, i.e. they are thrown away. Upon a request to resume reproduction, the receiver requests the server to retransmit the signal starting at the detected position. See also U.S. Pat. No. 5,873,022 to Huizer, et al. issued Feb. 16, 1999 entitled “Method of receiving compressed video signals using a latency buffer during pause and resume”.
U.S. Patent Publication No. 20030093543 to Cheung, et al. published May 15, 2003 entitled “Method and system for delivering data over a network” discloses a method and system for delivering data over a network to a number of clients, which may be suitable for building large-scale Video-on-Demand (VOD) systems. The method utilizes two groups of data streams, one responsible for minimizing latency while the other one provides the required interactive functions. In the anti-latency data group, uniform, or non-uniform or hierarchical staggered stream intervals may be used. The system based on this invention may have a relatively small startup latency while users may enjoy some interactive functions that are typical of video recorders including fast-forward, forward-jump, and so on. See also U.S. Patent Application 20030131126 published Jul. 10, 2003.
U.S. Patent Publication No. 20030228018 to Vince published Dec. 11, 2003 entitled “Seamless switching between multiple pre-encrypted video files” discloses a video on demand (VOD) system, including methods and apparatus for switching back and forth between two pre-encrypted files having changing encryption keys. Such switching back and forth may be required when a VOD server stores both a “normal” copy of a movie and a “special” copy such as a “trick-play” version for, e.g., fast forward and rewind effects. Instead of using keys with changing parities in both streams, the special stream is encrypted with keys using the same parity (even or odd), while the normal stream is encrypted with one dynamic key (odd or even) and one fixed key (even or odd).
“Efficient Delivery Of Streaming Interactive Video (Multimedia, Video On Demand)”, Dey, J.; Vol. 59-02b, University of Massachusetts, 1998 discloses video delivery systems that deliver data from a server to clients across a high speed network. In particular, the problem of constant-bitrate (CBR) transport of variable-bit-rate (VBR) video is addressed. An algorithm is disclosed to compute the minimum client buffer required to transmit a VBR video using CBR transport, and it is noted that delaying the playback of a VBR video can reduce the CBR transmission rate and client buffer requirements. Provisioning of VCR-like functionalities of Fast Forward, Rewind and Pause, in the context of delivering CBR video, is also analyzed, and a system is disclosed whereby VCR functionalities are provided with statistical guarantees, ostensibly ensuring that clients receive a desired quality of service while resources at the server are used more efficiently. Algorithms to restart playback at the end of an interactive operation are also disclosed.
“Support For Service Scalability On Video-On-Demand End-Systems (Quality Of Service)”; Abram-Profeta, E. Vol. 59-07b, The University of Michigan, 1998 discloses use of a coarse-grained striping scheme in disk arrays in the context of a VOD system, as well as the use of clustered disk-array-based storage organizations to reduce service disruptions during content updates. For a given storage organization, scalability can be improved by batching customers' requests and using multicast communication. For instance, in “Near-VoD” (NVoD), videos are sourced at equally-spaced intervals, thus allowing limited VCR functionality and guaranteeing a specified maximum admission latency. A methodology to measure scalability and identify scalable alternatives to NVoD is also presented.
“Synchronization schemes for controlling VCR-like user interactions in interactive multimedia-on-demand (MOD) systems”; Chian Wang; Chung-Ming Huang, Comput. J. (UK), VOL. 47, NO. 2, Oxford University Press for British Comput. Soc., 2004 discloses control schemes that are based on the feedback-adaptive mechanism. Based on the control schemes, a multimedia-on-demand system in which text, image, graphics, audio and video media are transmitted from the server site to the client site through text, image, graphics, audio and video communication channels respectively, is described.
While the foregoing citations illustrate a broad variety of different prior art techniques for providing trick mode functionality, none specifically address the issue of variable trick mode command latency/inaccuracy in an effective and easily implemented fashion; i.e., without the need for significant modifications to the existing network infrastructure and/or the installed CPE base. Accordingly, improved methods and apparatus for managing variable network latencies associated with trick modes and other similar network service functions are needed. Such improved methods and apparatus would ideally be implemented with only slight modifications to the extant infrastructure and installed CPE base, and would significantly reduce network loading (and eliminate uncontrolled excursions in load). These techniques would also be transparent to the user, such that the dynamic compensation occurs seamlessly without the user perceiving its operation (other than an increased level of user satisfaction from more accurate command functionality).