The consumption of streaming and media content is moving from traditional broadcast media, such as cable and satellite, to Internet-based consumption. Special-purpose machines, commonly referred to as streaming servers, are responsible for either or both of the intake and delivery of the streaming content. These servers have sufficient capacity to intake and deliver multiple content streams simultaneously.
The content publisher experience in publishing content to a streaming server and the content consumer experience in streaming or downloading the uploaded content from the streaming server are dictated in large part by the streaming server's capacity. If the server becomes overutilized from too many simultaneous uploads and/or downloads, then content publishers attempting to upload content to the server may experience various failures or delays while publishing the content. Likewise, content consumers requesting content may fail to receive their requested content or may receive the requested content with significant delays, buffering, or lowered quality. The degraded server performance is also unlikely to be isolated to a single content publisher or content consumer, but rather is propagated to affect all such users that simultaneously attempt to use the server during a period of overutilization.
When the user experience becomes impaired, users, including content publishers and content consumers, will likely stop their use of the server or server platform altogether. This creates a snowball effect, whereby content publishers will not upload their content to the server, because the content cannot be delivered efficiently, and content consumers will not request or download content from the server, because content publishers have not uploaded content thereto. This problem is exacerbated when the publishing and delivery are performed by a distributed platform, such as a content delivery network (CDN).
A typical CDN has a distributed server footprint. The CDN establishes points-of-presence (PoPs) in various geographic regions, with each PoP having one or more servers that handle the publishing from and delivery to users (i.e., content publishers and content consumers) geographically proximate to the PoP. The distributed server footprint creates the potential for multiple different points of degraded performance or failure. For instance, a CDN can have PoPs in Los Angeles, New York, and Dallas, with the New York PoP experiencing the greatest loads and becoming overutilized. As a result, content publishing and/or delivery performance to users geographically proximate to the New York PoP can suffer, leading to a poor user experience and to users ultimately leaving the CDN platform, even when the Los Angeles and Dallas PoPs provide acceptable performance.
With sufficient warning, the CDN can avoid the situation of a PoP or servers of a PoP becoming saturated. Specifically, the CDN can shift the load from a potentially overloaded PoP to one of the other PoPs with excess capacity. Alternatively, the CDN can dynamically allocate additional resources to increase capacity at the overloaded server or PoP. The warning is the result of understanding the capacity of each streaming server in each PoP and the current load on each streaming server.
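The load-shifting decision described above can be sketched as follows. This is only a minimal illustration, assuming each PoP reports a known capacity and a current load in the same units. The names `PopStatus` and `pick_shift_target`, the 0.8 warning threshold, and all load figures are hypothetical and do not come from any particular CDN.

```python
from dataclasses import dataclass


@dataclass
class PopStatus:
    name: str
    capacity: float      # maximum sustainable load (hypothetical units)
    current_load: float  # currently observed load, same units

    @property
    def utilization(self) -> float:
        return self.current_load / self.capacity


def pick_shift_target(pops, source_name, warn_threshold=0.8):
    """Return the name of the PoP to shift load to.

    Returns None when the source PoP is below the warning threshold
    or when no other PoP has headroom below that threshold.
    """
    source = next(p for p in pops if p.name == source_name)
    if source.utilization < warn_threshold:
        return None  # no warning raised, nothing to shift
    candidates = [
        p for p in pops
        if p.name != source_name and p.utilization < warn_threshold
    ]
    if not candidates:
        return None  # every other PoP is also near saturation
    # Prefer the PoP with the most relative headroom.
    return min(candidates, key=lambda p: p.utilization).name


# Illustrative figures only.
pops = [
    PopStatus("New York", capacity=100.0, current_load=92.0),
    PopStatus("Los Angeles", capacity=100.0, current_load=55.0),
    PopStatus("Dallas", capacity=80.0, current_load=30.0),
]
target = pick_shift_target(pops, "New York")
```

With these assumed figures, the New York PoP exceeds the threshold and the least utilized alternative is selected. The usefulness of such a check depends entirely on how accurately the capacity value reflects what each server can actually sustain.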
Load testing identifies the capacity of a streaming server. Load testing involves a controlled test environment in which the load on the streaming server is gradually increased until a point of failure or performance degradation is encountered. Traditionally, these tests have been one-dimensional in that they involve either publishing different instances of the same content stream until the failure or performance degradation occurs, or requesting different instances of the same content stream until the failure or performance degradation occurs.
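A traditional one-dimensional test of this kind can be sketched as a simple ramp. The example below is an assumption-laden illustration: `ramp_load_test` and `simulated_probe` are hypothetical names, and the probe is a stand-in model with an arbitrary per-stream cost and budget rather than a call to a real streaming server.

```python
def ramp_load_test(probe, start=1, step=1, max_streams=1000):
    """Increase the count of identical streams until the probe reports
    failure or degradation.

    Returns the largest stream count that passed, or None if even the
    starting count failed.
    """
    last_ok = None
    n = start
    while n <= max_streams:
        if not probe(n):
            return last_ok
        last_ok = n
        n += step
    return last_ok


def simulated_probe(n_streams, per_stream_cost=2.5, budget=100.0):
    # Stand-in for a real server: healthy while the total cost of the
    # identical streams fits within a fixed resource budget.
    return n_streams * per_stream_cost <= budget


capacity = ramp_load_test(simulated_probe, start=5, step=5)
```

Because every stream in the ramp is identical, the result is a single number that is valid only for that one stream type and direction.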
In the real world, however, the streaming server will intake different content streams from different content publishers while simultaneously delivering different content streams to different content consumers. Each content stream published to the streaming server and each content stream downloaded from the streaming server imposes a different load on the streaming server. As some examples, the different content streams may be encoded and delivered at different bitrates, thereby requiring different processor and memory resources for publishing and delivery; the different content streams may be of different durations that lock up resources for longer or shorter periods of time; the different content streams may be uploaded and downloaded using a variety of different streaming protocols, each with different overhead and resource utilization; and the streaming server may experience different content stream upload-to-download ratios at different times, with the uploaded and downloaded content streams consuming different amounts of resources.
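To illustrate why a stream count alone says little about load, the following sketch assigns each stream a cost that depends on its bitrate, protocol, and direction. The cost model is purely an assumption for illustration: the multipliers in `PROTOCOL_OVERHEAD` and `DIRECTION_FACTOR` are arbitrary placeholders, not measured values, and stream duration is omitted for brevity.

```python
from dataclasses import dataclass

# Hypothetical relative multipliers; real values would have to be measured.
PROTOCOL_OVERHEAD = {"rtmp": 1.0, "hls": 1.2}
DIRECTION_FACTOR = {"publish": 1.5, "deliver": 1.0}


@dataclass(frozen=True)
class Stream:
    bitrate_kbps: int
    protocol: str   # key into PROTOCOL_OVERHEAD
    direction: str  # "publish" (upload) or "deliver" (download)


def stream_cost(s: Stream) -> float:
    """Assumed load contributed by one stream, in arbitrary units."""
    return (
        s.bitrate_kbps
        * PROTOCOL_OVERHEAD[s.protocol]
        * DIRECTION_FACTOR[s.direction]
    )


def total_load(streams) -> float:
    return sum(stream_cost(s) for s in streams)


# Two workloads with the same number of streams but different mixes.
uniform = [Stream(1000, "rtmp", "deliver")] * 40
mixed = (
    [Stream(1000, "rtmp", "deliver")] * 20
    + [Stream(4000, "hls", "publish")] * 20
)
```

Under these assumed multipliers, both workloads contain 40 streams, yet `total_load(mixed)` is several times `total_load(uniform)`, so a capacity figure derived from the uniform workload would not predict how the server behaves under the mixed one.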
Accordingly, static load testing methodologies and systems of the prior art do not adequately account for the dynamically changing loads that can be placed on a streaming server. The static load tests may identify one scenario in which the streaming server may become overutilized, but fail to identify several other scenarios that may also occur as traffic conditions and streaming server usage change. Consequently, the static load tests may fail to identify when the streaming server is on the verge of overutilization, which would then cause actual performance degradation or failure.
There is therefore a need for dynamic load testing of a streaming server. Specifically, there is a need to automate different load scenarios in which the streaming server can become overutilized and intelligently increase and modify the loads to achieve a state of automated testing. There is further a need to leverage the dynamic load test results for real-time health check applications. In other words, there is a need to correlate the dynamic load test results with actual real-time loads experienced by the servers in order to identify instances where server saturation is imminent. There is further a need to incorporate the dynamic load testing as part of a dynamic resource allocation methodology and system such that when any server within a distributed platform nears overutilization, the methodology or system can dynamically allocate more resources for servicing the loads on that server or restrict further load, thereby avoiding overutilization in the real world.