Remote computing gives a computing system the capability to serve operating system-based applications from the computing system to remote devices. FIG. 1 generally illustrates how remote computing operates between a server and a client device. Server 10 and client device 20 (sometimes called an “endpoint”) communicate over any network connection 30, whether wired or wireless. In one embodiment, this “remoting” communication is provided through a specific protocol, such as Microsoft Corporation's Remote Desktop Protocol (RDP) or another protocol, running over a connection, such as a Transmission Control Protocol (TCP) connection, utilizing Microsoft Corporation's Terminal Services to manage the connection. The connection in this embodiment may be referred to as the RDP connection. Microsoft Corporation's Terminal Services creates a virtual environment that represents all resources needed to present user interface data to a client device 20 and process user input from the client device 20. This virtual environment is also known as session 11.
When using RDP and Terminal Services, in order for display information to be displayed on the client device 20, RDP on the server 10 uses its own video driver to render the display output by packaging the rendering information into network packets using the RDP protocol. These packets are then sent over the network connection 30 to the client device 20. On the client device 20, RDP receives the rendering data and translates the packets into corresponding graphics device interface (GDI) API calls. For the input path, client mouse and keyboard events are redirected from the client device 20 to the server 10.
Thus, more generally, application 15 executes on the server 10 in a session 11. User interface data 40, representing data to be presented on client device 20 in connection with session 11 representing application 15, is transmitted to the client device 20. This user interface data 40 can include media data (e.g. a recorded video presentation) and/or user control data (e.g. a menu for controlling a recorded video presentation).
Furthermore, a session 11 can represent the set of applications that will be present on the client device 20. For any server 10, there can be multiple sessions presenting user interface data to client devices. In this embodiment, the session which represents the set of user experiences to be rendered on the client is managed by the server; however, one can imagine an embodiment in which this management is done on the client.
The user interface data 40 is then rendered or displayed, e.g., on display 25 on client device 20. While a display is discussed and shown in FIG. 1, any presentation of a user interface, e.g. by visual or audio displays or otherwise, may be used.
The user of client device 20 who is viewing a menu, for example, displayed as part of user interface data 40, can respond (e.g. in order to perform operations in connection with server 10 as if the application 15 were running locally). This is done via input 45 to client device 20 respecting application 15, which is transmitted back to server 10.
The input 45 is received by the remote computing server software on the server 10, and the operation is performed on server 10 on behalf of client device 20, possibly changing the user interface data 40 which is to be displayed or otherwise presented on client device 20.
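The round trip described above (user interface data 40 sent to the client, input 45 returned to the server, the operation performed on the server, and refreshed user interface data produced) can be sketched as follows. This is only an illustration under stated assumptions: the `Session` and `Server` classes, the menu model, and the key names "up", "down" and "select" are hypothetical and are not part of any remoting product.

```python
from dataclasses import dataclass


@dataclass
class Session:
    """Server-side state for one client device (session 11 in the text)."""
    menu: list
    selected: int = 0
    playing: bool = False

    def ui_data(self):
        """Build the user interface data (40) to transmit to the client."""
        return {
            "menu": list(self.menu),
            "selected": self.selected,
            "playing": self.playing,
        }

    def handle_input(self, key):
        """Apply one input event (45) on the server and return refreshed UI data."""
        if key == "down":
            self.selected = (self.selected + 1) % len(self.menu)
        elif key == "up":
            self.selected = (self.selected - 1) % len(self.menu)
        elif key == "select":
            # Hypothetical rule: only the "Play" entry starts playback.
            self.playing = self.menu[self.selected] == "Play"
        return self.ui_data()


class Server:
    """Holds one independent session per connected client device."""

    def __init__(self):
        self.sessions = {}

    def open_session(self, client_id, menu):
        self.sessions[client_id] = Session(menu=list(menu))
        return self.sessions[client_id].ui_data()

    def on_input(self, client_id, key):
        return self.sessions[client_id].handle_input(key)
```

Because each client identifier maps to its own `Session`, input from one endpoint changes only the user interface data for that endpoint, which mirrors the statement that a server can host multiple sessions.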
In this way, the user input helps control the transmission and presentation of the user interface data 40. As discussed above, media data is transmitted and presented as part of the user interface data 40 on client device 20. The user interface data 40 presented on client device 20 creates what is termed a media experience. The media experience unifies the different information (e.g. media data, such as video and audio displays, and user control data to control the presentation of media data) presented on client device 20. Thus, for example, menus used to control different types of media data may be coordinated in order to increase usability and the general aesthetic appeal of the media experience.
Multiple media experiences can each be instantiated and received by respective endpoint client devices. Each media experience is controlled by at least one server. User interface data presented on client device 20 can include graphics that typically compose a user control interface. Other non-graphical control data may also be presented, e.g. audio data dealing with user control. In order to control the media experience, typical actions that a remote user may desire to carry out via the user interface include commands over media data, such as stop, fast forward, and rewind. In addition, the user may be provided with controls to perform conventional computer commands to enable actions such as resizing replay windows, adjusting volume, and adjusting picture quality. User input may be provided via, e.g., a keyboard connected to the client device 20, via a remote control associated with client device 20, or via any other input means.
As discussed, media data is also presented as part of a media experience. Media data consists of presentation data for presentation on the client device 20. The following is a non-exhaustive list of exemplary media data which may be included in a media experience: a streaming media presentation, including video and/or audio presentation(s); a television program, including a cable television (CATV), satellite, pay-per-view, or broadcast program; a digitally compressed media experience; a radio program; a recorded media event (sourced by a VCR, DVD player, CD player, personal video recorder or the like); a real-time media event; a camera feed; etc. The media data may be in any format or of any type which can be presented on client device 20, such as music (formatted as MP3s, WMAs, etc.), streaming audio/video, photos (formatted as JPEGs, GIFs, etc.), movie files (formatted as MOVs, MPEGs, etc.), advertisements, broadcast media (radio, TV, cable, etc.), graphics data, etc.
Thus, a user with a local PC located in a home office could use that PC to watch a streaming video program from the Internet on a television (a first remote endpoint device) in the family room. Moreover, using the same PC, a second user could simultaneously watch on another television set (a second remote endpoint device presenting a second media experience) a video stored on the local PC. It is noted that these scenarios can be further extended to a myriad of circumstances. For instance, a third user could simultaneously observe a camera feed that is input into the local PC and remoted to a third remote endpoint device. A fourth user could use the local PC to remote a fourth instantiation of a media experience to watch a remoted television program on a monitor (also an endpoint device) that does not include a TV tuner.
Because, as discussed above, the media experience is intended to enable a simple, rich user interface that integrates media data along with the user control functionality necessary to control the media data presentation, it is important that the media experience be protected from unauthorized presentations of user interface data. Such unauthorized presentations may be derived from an attack by a hacker or other adversary attempting to interfere with or preempt all or part of the media experience, either via the server 10 or via the network connection 30. Additionally, such unauthorized presentations may be a result of rogue software on server 10. While the software application(s) which are intended to control the presentation of the media experience on the client device 20 can be programmed to provide the media experience according to some predetermined plan, providing the aforementioned simple, rich user interface, there may be other software on server 10 which attempts to provide user interface data 40 for display on the client device 20. Where such displays do not conform to the intended unified media experience, they will interfere with the aforementioned goals of consistency, usability and aesthetic appeal of the media experience.
Generally, where remoting is not being performed, one method by which unauthorized processes can be prevented from performing unauthorized activity is to examine each process and verify that it is authorized. One such verification technique is to have authorized applications be verifiable through a digital signature. Thus, for example, before a process is allowed to perform an activity, the image of the executable associated with the process is examined to determine if it is digitally signed by an acceptable authority. Only if it is so signed is the process allowed to perform the activity. Alternatively, when the determination is made that a process is not properly signed, that process may be terminated.
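The verification technique described above can be sketched as a gate placed in front of an activity. The sketch makes a significant simplifying assumption: a keyed HMAC-SHA256 tag stands in for a true public-key digital signature, which would be used in practice but requires facilities outside the Python standard library. The names `sign_image`, `is_authorized` and `gate_activity`, and the idea of a list of trusted keys, are hypothetical.

```python
import hashlib
import hmac


def sign_image(image: bytes, key: bytes) -> bytes:
    """Produce a tag over an executable image (HMAC stands in for a signature)."""
    return hmac.new(key, image, hashlib.sha256).digest()


def is_authorized(image: bytes, signature: bytes, trusted_keys) -> bool:
    """Return True if the tag was produced by any accepted authority's key."""
    return any(
        hmac.compare_digest(sign_image(image, key), signature)
        for key in trusted_keys
    )


def gate_activity(image: bytes, signature: bytes, trusted_keys, activity):
    """Run the activity only if the process image verifies; otherwise refuse."""
    if not is_authorized(image, signature, trusted_keys):
        raise PermissionError("process image is not signed by an accepted authority")
    return activity()
```

Here refusal is modeled by raising `PermissionError`; the alternative mentioned in the text, terminating the unverified process, would replace that branch.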
However, using this technique to prevent unauthorized activity on a remote client device session 11 presents several disadvantages. Firstly, the server 10 may not have the processing power to analyze every process and determine whether it is verifiable. If many processes produce traffic which is detected by the server 10, this may cause performance problems. Secondly, where the technique requires that unverifiable processes be terminated, and where it is possible to allow each remote client device session 11 to terminate processes, permitting them to do so may lead to instability if verification in a remote client device session 11 terminates, as unauthorized, processes which are used by the server 10 under some alternate policy.
It would thus be desirable to have a technique to restrict the presentation of user interface data on a remote device to authorized processes, while overcoming drawbacks such as those described above. The present invention addresses the aforementioned needs and solves them with additional advantages as expressed herein.