In network environments such as the Internet, video data is typically transmitted by a web server (or other server, such as a streaming server) upon request by a personal computer. The software component on the personal computer that requests the video data is often a video player (e.g., Adobe Flash Player) embedded in a web browser (e.g., Internet Explorer), and the web server typically transmits the video data to the video player in a known encoded (compressed) video format such as MPEG-4 or H.263. Once the video player receives the transmitted video data from the network, it decodes (e.g., decompresses) the video data for rendering on the display of the personal computer. Often, the video player leverages specialized graphics hardware on the personal computer, such as a graphics processing unit (GPU), to accelerate the processing required to decode and render the video data on the display.
In many cases, to enhance a user's viewing experience, the video player additionally incorporates, or “composites,” a user interface (UI) of video player controls or other graphical elements as transparent or semi-transparent overlays on top of the video frames rendered from the received video data. The video player performs such compositing after decoding the received video data, for example, by utilizing “alpha blending” techniques to combine pixel values of the UI elements with pixel values of a decoded frame of video data in order to construct final composite video frames for display.
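The alpha blending step described above can be illustrated with a minimal sketch. It assumes 8-bit color channels, a UI overlay that carries a per-pixel alpha value, and a decoded frame of the same dimensions; the function names `alpha_blend_pixel` and `composite_frame` are hypothetical and are not taken from any particular video player.

```python
def alpha_blend_pixel(ui_rgba, video_rgb):
    """Blend one UI pixel (R, G, B, A) over one decoded video pixel (R, G, B).

    Channels are integers in 0..255. An alpha of 0 is fully transparent
    and an alpha of 255 is fully opaque (an assumption of this sketch).
    """
    r, g, b, a = ui_rgba
    alpha = a / 255.0
    # Standard "over" blend: result = alpha * overlay + (1 - alpha) * background
    return tuple(
        int(round(alpha * ui + (1.0 - alpha) * vid))
        for ui, vid in zip((r, g, b), video_rgb)
    )


def composite_frame(ui_overlay, video_frame):
    """Composite a UI overlay onto a decoded frame of the same dimensions.

    Both arguments are lists of rows; each row is a list of pixel tuples.
    """
    return [
        [alpha_blend_pixel(ui_px, vid_px) for ui_px, vid_px in zip(ui_row, vid_row)]
        for ui_row, vid_row in zip(ui_overlay, video_frame)
    ]


# A 1x2 frame: the first UI pixel is fully transparent, the second fully opaque.
frame = [[(10, 20, 30), (10, 20, 30)]]
overlay = [[(255, 255, 255, 0), (255, 0, 0, 255)]]
composited = composite_frame(overlay, frame)
```

Note that the blend operates on individual pixel values of the frame, which is why the sketch takes an already decoded frame as input.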
With the rise of technologies such as server-based computing (SBC) and virtual desktop infrastructure (VDI), organizations are able to replace the traditional personal computers described above with instances of desktops that are hosted on remote desktop servers (or virtual machines running thereon) in a datacenter. A thin client application installed on a user's end terminal (e.g., laptop, PC, or thin client device) connects to a remote desktop server that transmits a graphical user interface (GUI) of an operating system session for rendering on the display of the end terminal. One example of such a remote desktop server system is VMware View, in which each user desktop operating system (e.g., Windows) is implemented in a separate virtual machine hosted on a server residing in an organization's datacenter. A remote display protocol such as Remote Desktop Protocol (RDP) or PC-over-IP (PCoIP) is implemented within the thin client application on the user's end terminal as well as within the corresponding virtual machine running the user's desktop (e.g., as a service running in the operating system of the virtual machine), and enables the virtual machine to transmit the desktop's GUI display for rendering on the user's end terminal.
In such “desktop virtualization” environments, a video player playing a video as previously discussed (e.g., Adobe Flash Player in a web browser) executes within a virtual machine hosted on a server in the organization's datacenter, even though the video itself is ultimately displayed on the user's end terminal (i.e., the video data must additionally be transmitted via the remote display protocol to the user's end terminal). However, decoding the received encoded video data, as is typically performed by the video player, and then transmitting the decoded video data over the network from the virtual machine to the user's end terminal for display can consume significant network bandwidth and computing resources that could otherwise have been allocated to other virtual machines in the datacenter. In order to alleviate such network and computing resource pressure on a datacenter server, the remote display protocol service running within a virtual machine may be configured to intercept the video player's requests (e.g., to the operating system and/or specialized graphics hardware of the server) to decompress and display video data and, in turn, to transmit the still-encoded video data to the thin client application on the user's end terminal. Upon receipt of the still-encoded video data, the thin client application on the user's end terminal may be configured to decode the video data so that it can be rendered on the end terminal display. Furthermore, the end terminal may include its own GPU to assist the thin client application in decoding the received video data.
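As a rough illustration of the bandwidth difference at stake, the following sketch compares the bit rate of fully decoded (uncompressed) video with that of a still-encoded stream. All figures are illustrative assumptions rather than values from the discussion above: a 1280x720 video at 30 frames per second with 24-bit color, delivered as a 2 Mbit/s encoded stream. The helper name `raw_video_bitrate` is hypothetical. Remote display protocols typically apply their own compression, so the decoded figure is best read as an upper bound.

```python
def raw_video_bitrate(width, height, fps, bits_per_pixel=24):
    """Return the bit rate, in bits per second, of uncompressed video."""
    return width * height * bits_per_pixel * fps


# Illustrative assumptions (not taken from the text):
# 1280x720 at 30 frames per second, 24 bits per pixel, 2 Mbit/s encoded stream.
decoded_bps = raw_video_bitrate(1280, 720, 30)
encoded_bps = 2_000_000

print(f"decoded: {decoded_bps / 1e6:.1f} Mbit/s")
print(f"encoded: {encoded_bps / 1e6:.1f} Mbit/s")
print(f"ratio:   {decoded_bps / encoded_bps:.0f}x")
```

Under these assumptions the decoded stream is several hundred times larger than the encoded one, which is the motivation for forwarding the still-encoded data to the end terminal.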
Although the foregoing technique alleviates pressure on the network and computing resources of the virtual machine by passing responsibility for decoding video data from the video player running in the virtual machine to the thin client application on the user's end terminal, it also prevents the video player from compositing transparent or semi-transparent UI elements into the video data (for example, by using alpha blending techniques), since such alpha blending requires the video player to have access to decoded video data.