By the end of 2014, video traffic accounted for 55% of total mobile data traffic, and this share is projected to increase to 72% by 2019. Given the ubiquitous availability of portable mobile devices for image and video capture, there has been a dramatic shift towards over-the-top video streaming. On YouTube alone, half of the billions of daily video views are already on mobile devices. Wireless network capacity is limited, yet an increasingly knowledgeable base of consumers demands better-quality video display services. An end user's quality of experience (QoE) has therefore become an essential measure of network performance. QoE refers to a viewer's holistic perception of, and satisfaction with, a given communication network service.
Network impairments or bandwidth limitations can cause volatile network conditions, resulting in rebuffering or stalling events, which interrupt a video's playback. An example of a stalling event in a video is illustrated in FIG. 1, and various stalling events occurring intermittently in a video are illustrated in FIG. 2. Such network-induced stalling events can negatively impact a viewer's satisfaction with cellular network service quality and can lead to user attrition.
Designing an objective model that can accurately predict an end user's QoE can help cellular network providers better understand the effect of rebuffering events on viewer behavior. Such models can also help reduce network operational costs by encouraging the design and deployment of efficient “quality-aware” network solutions. These solutions can control and monitor the quality of streaming video content to help yield optimal user QoE. However, an end user's QoE is highly subjective. It is the combined effect of diverse, realistic stalling patterns, which occur at different locations in a video and can vary in frequency and length across different types of video content. Therefore, designing a QoE model that accounts for such subjective factors but still correlates well with human opinion scores is highly challenging. Most existing models are designed on ad hoc video datasets and extract global stall-informative features, such as the sum of the lengths of all the stalls, the number of stalls, and so on. These models may use simple learning engines to train a QoE predictor. However, the generalizability of these simple models becomes questionable with regard to their performance on other video datasets, which contain more diverse video content and more realistic stall patterns. Moreover, the details of many of these models and the corresponding training data are not publicly available, which poses additional challenges for benchmarking these models on videos afflicted by realistic stalling patterns.
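As an illustration of what "global stall-informative features" can look like, the following is a minimal sketch in Python. The `Stall` record, the function name `global_stall_features`, and the particular feature set (count, total length, mean length, and stall-to-video ratio) are assumptions chosen for illustration; they are not taken from any specific existing model.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Stall:
    """Hypothetical record of one stalling event (names are illustrative)."""
    start: float     # playback position, in seconds, where the stall begins
    duration: float  # length of the stall, in seconds


def global_stall_features(stalls: List[Stall], video_length: float) -> Dict[str, float]:
    """Summarize a whole video's stalls with a few global statistics.

    The stall positions (`start`) are deliberately ignored here, which is
    what makes these features "global".
    """
    if video_length <= 0:
        raise ValueError("video_length must be positive")
    count = len(stalls)
    total = sum(s.duration for s in stalls)
    return {
        "num_stalls": float(count),
        "total_stall_time": total,
        "mean_stall_time": total / count if count else 0.0,
        "stall_ratio": total / video_length,
    }


# Example: two stalls (3 s and 2 s) in a 60-second video.
features = global_stall_features([Stall(5.0, 3.0), Stall(20.0, 2.0)], 60.0)
print(features)
```

Because these statistics discard where each stall occurs, two videos with the same stalls at different positions map to identical feature vectors.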
Another crucial disadvantage of all existing QoE prediction models based on global stall-informative features is that they fail to capture the time-varying subjective quality of a viewer's opinion of a given video as it is being played and viewed. Aside from factors such as the number and frequency of stalls and spatial and temporal distortions, QoE also depends on a behavioral hysteresis or recency “after effect,” whereby an end user's QoE at a particular moment while watching a video depends on the viewing experience before that moment. For example, a previous unpleasant viewing experience caused by a stalling event tends to penalize the QoE in the future and thus affects the overall QoE. A stalling event occurring towards the end of a video sequence has a more negative impact than a stall of the same length occurring at a different position in the same video sequence. This dependency on the previous viewing experience is often nonlinear in nature and can be crucial in determining the overall quality of experience of users, but it is not captured by the state-of-the-art QoE prediction models.
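The recency effect described above can be made concrete with a toy sketch. The code below is an assumption for illustration only, not a model from this document or from the literature: it keeps a running "annoyance" state that rises while playback is stalled and decays exponentially afterwards. The function names and the `decay` value of 0.9 are hypothetical, and the update is linear, whereas the actual dependency is described as often nonlinear.

```python
from typing import List, Tuple


def stall_indicator(stalls: List[Tuple[int, int]], total_seconds: int) -> List[int]:
    """Per-second 0/1 trace: 1 while playback is stalled.

    Each stall is given as (start_second, length_in_seconds).
    """
    trace = [0] * total_seconds
    for start, length in stalls:
        for t in range(start, min(start + length, total_seconds)):
            trace[t] = 1
    return trace


def memory_trace(indicator: List[int], decay: float = 0.9) -> List[float]:
    """Leaky accumulator: state[t] = decay * state[t-1] + indicator[t].

    A larger state stands for a worse momentary experience. The decay
    factor is an illustrative assumption, not a fitted parameter.
    """
    state = 0.0
    out = []
    for x in indicator:
        state = decay * state + x
        out.append(state)
    return out


# The same 3-second stall, placed early versus late in a 60-second timeline.
early = memory_trace(stall_indicator([(5, 3)], 60))
late = memory_trace(stall_indicator([(55, 3)], 60))

# The residual penalty at the end of viewing is larger for the late stall,
# even though the number of stalls and the total stall time are identical.
print(early[-1] < late[-1])
```

In this sketch the two patterns are indistinguishable by stall count or total stall time, yet the state at the end of the timeline differs, which is the kind of position dependence that global features cannot express.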