In order to deliver (or stream) real-time or time-based data from a server system to an end-user system, a number of system resources must be tightly managed. Typically, a video server system comprises video server hardware and software while an end-user system refers to a set-top box and TV, Personal Computer (PC), or other user device. Resources that must be tightly managed include Input/Output (I/O) resources such as disk drive (or other storage media) space and disk drive (or other storage media) bandwidth, CPU resources, memory, and network bandwidth.
Real-time and/or time-based media streaming, such as video streaming or video-on-demand (for example, movie, music, or other multi-media on-demand on a set-top box or other device connected to a television set or other receiver), is an extremely cost-sensitive business. Because of the bandwidth required to deliver a high quality video stream (typically 3 to 8 Megabits/second/user), these applications place tremendous load on the video server's memory, disk (or other storage media) and network subsystems. When such an application scales from serving a few users (for example, tens to hundreds) to very large numbers of users (for example, hundreds of thousands or millions), the total solution cost using today's technologies becomes prohibitive. Business economics, for example, may initially benefit from a small, low-cost system that can service a limited number of users or subscribers. As the number of users or subscribers grows, the initial system is augmented to add capacity. Desirably, the initial system and its architecture are retained and scaled to serve the larger set of users.
Typical video-on-demand deployments start small and grow. A small server system capable of serving a few hundred users eventually must become part of a larger system that serves hundreds of thousands. Heretofore, two approaches have generally been taken to address this system size or system capacity scaling problem: (1) deployment and use of tightly-coupled multiprocessor systems delivering a large number of streams, and (2) loosely coupled clusters composed of small, off-the-shelf computers connected using standard computer networks.
Examples of these types of configuration are described relative to FIG. 1 and FIG. 2. With reference to FIG. 1, there is illustrated a portion of one embodiment of a tightly-coupled multiprocessor system, server 50, delivering a large number of streams. Server 50 has the capacity for a large number of processors, usually embodied as processor boards. Accordingly, server 50 comprises a plurality of slots, such as slots 60, 62, 64, 66, and 68. In one embodiment, server 50 has 256 slots, and is therefore capable of comprising 256 processor boards. Typically, server 50 begins service with a few processor boards, such as boards 70, 72, and 74, and boards are added as the system grows. Such a system tends to be very costly and does not usually meet the strict cost constraints imposed by the business. There is also the potential for failure of one board, such as processor board 72, to cause total failure of server 50. Further, because the cost of computational power decreases over time, the processor boards required to expand the system may be outdated by the time a system administrator is prepared to grow the server system.
Loosely coupled clusters composed of small, off-the-shelf computers connected using a standard network may, for example, use Gigabit Ethernet or fiber-channel networking and use software to manage the collection of systems as a single entity capable of meeting some scalability and quality of service requirements. An exemplary system according to this loosely coupled cluster concept is illustrated in FIG. 2. FIG. 2 depicts servers 80 and 82 operating together as a cluster, receiving requests from load balancer 79 (a Layer 4 switch). Servers 80 and 82 each have access to all assets, including asset 86, asset 88, and asset 90, through fiber-channel switch 84. The shared storage requires additional components (fiber-channel switches, switch adapters, fiber-channel-capable disks, etc.), all of which add cost and complicate scaling of the network.
In addition, the shared storage cluster shown in FIG. 2 does not solve the resource management problem. For example, a video stored on a disk attached to a shared fiber-channel switch is still limited by the bandwidth available from the disk or through a fiber-channel link. Thus, if a particular asset, or video, comes into high demand or is “hot” (where many subscribers request the video simultaneously, exceeding any one disk's capacity to serve it or any one server's capacity, in terms of disk or network bandwidth, to serve it), additional mechanisms are required to handle it. Many conventional systems attempt to copy high-demand or ‘hot’ assets onto switch memory or server physical memory 84 for faster access. However, these schemes fail beyond a certain file or asset size, as the system resource requirements become prohibitive for large video files.
Further, conventional load balancing handles requests from client devices and spreads them across various servers to balance network bandwidth as well as connection overheads (usually in software). However, present solutions fail to take into account the I/O problem: contention at the I/O subsystem for a video file, or for storage-system retrieval bandwidth, causes the disk subsystem to run out of resources.
This input/output problem is endemic to any time-based media (such as audio and video) and real-time content delivery, and is especially true for “high-quality” or “high-value” video content. For example, a typical movie for a movie-on-demand application generally needs to be delivered at 4 Mbps to 8 Mbps today and up to 20 Mbps for a high-definition (HD) system and over a period of 90 to 120 minutes. For such an application, continued availability of resources—such as disk or other storage subsystem bandwidth, memory, network bandwidth, and CPU resources—over a long period of time is required to deliver a video service. Customers simply will not subscribe to a paid service to see a full length movie at lower than broadcast quality and may not even be inclined to subscribe unless the movie is the quality of a DVD or equivalent movie.
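To make the sustained demand concrete, the following minimal Python sketch estimates the total data volume of a single stream at the bit rates and durations quoted above. The function name and the decimal unit conventions (1 Mbps taken as 10^6 bits/second, 1 GB as 10^9 bytes) are assumptions made for illustration and are not part of the described system.

```python
def stream_size_gigabytes(bitrate_mbps: float, minutes: float) -> float:
    """Total data delivered by one stream, in decimal gigabytes.

    Assumes 1 Mbps = 1,000,000 bits/second and 1 GB = 1,000,000,000 bytes.
    """
    total_bits = bitrate_mbps * 1_000_000 * minutes * 60
    return total_bits / 8 / 1_000_000_000


# Bit rates and durations taken from the text: 4 to 8 Mbps today,
# up to 20 Mbps for HD, over 90 to 120 minutes.
for mbps, minutes in [(4, 90), (8, 120), (20, 120)]:
    size = stream_size_gigabytes(mbps, minutes)
    print(f"{mbps} Mbps for {minutes} min: about {size:.1f} GB")
```

Under these assumptions a single movie ranges from roughly 2.7 GB to 18 GB, which illustrates why holding whole assets in switch or server memory quickly becomes impractical.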
This is in contrast to existing load balancing/cluster systems for solving computational problems or data delivery problems (such as serving web pages from a server cluster at an aggregation site). Computational clusters usually tax the disk subsystems very little whereas data clusters for non-time-based data (such as graphics images or web pages) tax the disk subsystem, but they do not have real-time delivery semantics associated with them. For example, users will generally tolerate parts of a web-page loading slowly whereas breakups in audio and video are considered less tolerable or intolerable. Subscribers simply will not subscribe to a video (movie) delivery service where the play is broken or erratic in time, or the required frame-rates (typically 24 or 30 frames/second) cannot be maintained.
A single copy of a video on a server's disk subsystem can only service a certain number of concurrent play requests. This number is typically limited by the hard disk's bandwidth. For example, if a disk provides 30 Megabytes/second of bandwidth for read/write access, it can support delivery of videos encoded at 5 Megabits/second to 48 users concurrently ((30 Megabytes/second×8 bits/byte)/5 Megabits/second=48 concurrent streams). Striping techniques, where a file system is built on top of a number of such disks, increase the number of concurrent users. However, there is an upper limit to the number of concurrent users the subsystem can serve. When a video (or other content) becomes “popular”, more copies of that video need to be provided to increase the concurrent number of plays available given the disk drive bandwidth. (Note that this disk drive bandwidth requirement is entirely different from disk drive storage capacity.) If the relative popularity of the video is known, a predetermined number of copies can be provided. However, dynamic spikes in interest or demand for a particular video movie or other real-time deliverable video content item may occur in a real-time streaming system.
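The arithmetic above can be sketched in a few lines of Python. This is an illustrative calculation only: the function names are hypothetical, seek overhead and striping are ignored, and the figure of 1,000 concurrent requests is an assumed example rather than a value from the text.

```python
import math


def max_concurrent_streams(disk_mbytes_per_s: float, stream_mbits_per_s: float) -> int:
    """Streams one copy on one disk can sustain, limited by disk bandwidth only."""
    disk_mbits_per_s = disk_mbytes_per_s * 8  # 8 bits per byte
    return int(disk_mbits_per_s // stream_mbits_per_s)


def copies_needed(concurrent_requests: int, streams_per_copy: int) -> int:
    """Copies of an asset needed to serve the given number of concurrent plays."""
    if streams_per_copy <= 0:
        raise ValueError("streams_per_copy must be positive")
    return math.ceil(concurrent_requests / streams_per_copy)


streams = max_concurrent_streams(30, 5)
print(streams)                       # 48, matching the example in the text
print(copies_needed(1000, streams))  # 21 copies for an assumed 1,000 requests
```

Because the number of copies depends on concurrent demand, a fixed, predetermined copy count cannot absorb an unexpected spike in requests for a single asset.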
Accordingly, there is a need in this art for a scalable server system, method, architecture, and topology that is able to cost-effectively, promptly, and easily increase the number of users serviceable. Such a system should be viable for time-based media delivery, including streaming of broadcast, DVD, and HD movie quality video.
There is a further need in this art for a server system, method, architecture, and topology capable of managing system resources and load balancing to effectively provide real-time asset streaming, including streaming of broadcast and DVD movie quality video assets. Management of resources would extend to disk management, CPU management, memory management, and network bandwidth management.
There is still a further need in this art for a server system, method, architecture, and topology capable of dynamically adjusting to content delivery service demand in a real-time system. That is, a server system capable of automatically and dynamically increasing its capacity for playing out a specific asset, such as a specific video movie, when demand for that asset increases.