The level of security built into a graphics system has become an important feature, since many new applications require that the graphics system be able to protect confidential data. Examples of such applications include medical imaging (where confidential information regarding a patient's health is displayed on the screen), video editing, video playback, digital signage and even general-purpose processing (the graphics processing unit (GPU) is a highly parallel processor and may be used to efficiently execute general mathematical code).
In these new applications, access to the data processed by the graphics processor and stored in the video memory, as well as to any intermediate data used during the processing and stored in the video memory, may need to be protected. In order to address this problem, various techniques have been devised to protect data during its transfer to or from the graphics system of a computing platform, as well as within the graphics system itself. Unfortunately, these protection techniques are often very complex and may be associated with high computational requirements and/or excessive use of the resources of the graphics system, which can ultimately degrade the performance of the graphics system, as will be discussed below.
FIG. 1 shows an exemplary prior art graphics system 100 that includes a graphics processor 102 (also referred to as a graphics pipeline), a memory ring bus 104, a memory controller 106, a video memory 108 and a video processing engine 116, among many other components. A bus interface 110 supports an external bus 112 that is used by a host computer (not shown) to connect with the graphics system 100. The data processed by the graphics processor 102 is stored in the video memory 108. In operation, data exchanges often occur between the host computer's system memory and the video memory 108, where the data is sent over the bus 112. During a data transfer over the bus 112, it is possible for devices connected to the bus 112 to “listen” to and copy the data as it is being transferred.
The configuration of the graphics system 100 shown in FIG. 1 is one in which a memory ring bus 104 interconnects, and thus is shared by, the various components of the graphics system 100, including the bus interface 110 and the memory controller 106. Another common configuration of the graphics system 100 is one in which the various components of the graphics system 100 are directly connected to both the bus interface 110 and the memory controller 106, thus precluding the need for a memory ring bus 104.
An optional component of the prior art graphics system 100 is the copy engine 114, which is responsible for copying data between the system memory of the host computer and the video memory 108. More specifically, the copy engine 114 executes copy instructions, each of which specifies a range of data to be copied from system memory to video memory 108 or from video memory 108 to system memory. In doing so, the copy engine 114 requests a read of the range of data from the source memory (system memory or video memory 108) and then requests a write of this range of data to the destination memory (video memory 108 or system memory, respectively). Since the copy engine 114 is dedicated to this copying functionality, the speed and efficiency of the graphics system 100 are increased without placing any undue extra burden on the other components of the graphics system 100.
In a graphics system 100 that does not include a copy engine, the graphics processor 102 may be responsible for copying data from the system memory of the host computing system to the video memory 108. For example, the graphics processor 102 may: a) execute a BLIT operation resulting in the transfer of a surface from system memory to video memory 108; b) execute a copy instruction resulting in reading a surface in system memory and writing it into video memory 108; or c) execute an instruction for rendering a rectangle with a texture stored in system memory and storing the processed rectangle in video memory 108. Note that these are but a few examples; other techniques to copy data from system memory to video memory 108 may also be used by the graphics processor 102. Regardless of the particular technique used, in each case the graphics processor 102 is busy transferring data from system memory to video memory 108, rather than performing its principal task of processing an image or a primitive. By contrast, when the graphics system 100 includes a copy engine 114, the graphics processor 102 and the copy engine 114 may operate in parallel, the graphics processor 102 processing data while the copy engine 114 is copying data to video memory 108.
For the purpose of clarifying the standard functionality of a copy engine within a graphics system, FIG. 4 is a flowchart illustrating an example of the prior art memory copy process implemented by copy engine 114. Note that, in this example, the copy engine 114 is copying data from the system memory to the video memory 108; however, a similar process is implemented by the copy engine 114 when copying data from video memory 108 to system memory. At step 402, the copy engine 114 receives a copy instruction including a range of system memory to be copied to video memory 108. At step 404, the copy engine 114 reads data from the specified range into an input buffer, whenever the external bus 112 and the memory ring bus 104 are free and available for use by the copy engine 114. At step 406, whenever the memory ring bus 104 is free, the copy engine 114 transfers the read data from the input buffer to an output buffer, for transmission on the memory ring bus 104 to the memory controller 106 and storage in the video memory 108. At step 408, the copy engine 114 checks whether the entire range of system memory specified in the copy instruction has been copied. If so, the copy engine 114 awaits receipt of another copy instruction. If not, the copy engine 114 returns to step 404 and continues reading data from the specified range of system memory.
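By way of illustration only, the loop formed by steps 402 to 408 may be sketched in software as follows. This is a simplified model under stated assumptions, not the prior art hardware itself: the function name `copy_engine_run`, the `burst` size and the use of byte arrays to stand in for system memory and video memory 108 are all hypothetical, and arbitration for the external bus 112 and the memory ring bus 104 is not modeled.

```python
def copy_engine_run(system_memory, video_memory, src, dst, length, burst=4):
    """Copy `length` bytes from system memory to video memory in bursts.

    Hypothetical model of FIG. 4. Returns the number of bursts, i.e. the
    number of passes through steps 404 to 408. Bus availability is assumed.
    """
    # Step 402: the copy instruction (src, dst, length) has been received.
    copied = 0
    bursts = 0
    while copied < length:  # step 408: has the entire range been copied?
        n = min(burst, length - copied)
        # Step 404: read the next part of the range into the input buffer.
        input_buffer = bytes(system_memory[src + copied:src + copied + n])
        # Step 406: pass the data to the output buffer and write it out.
        output_buffer = input_buffer
        video_memory[dst + copied:dst + copied + n] = output_buffer
        copied += n
        bursts += 1
    return bursts


system_memory = bytearray(range(16))
video_memory = bytearray(16)
bursts = copy_engine_run(system_memory, video_memory, src=2, dst=0, length=10)
print(bursts, list(video_memory[:10]))
```

In this sketch the graphics processor plays no part in the transfer, which reflects the reason given above for dedicating a separate engine to copying.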
Prior art solutions for preventing pirate attacks on secure data are typically based on some form of cryptographic protection of the data and/or of the graphics system itself. In one such solution, data is stored in the video memory in an encrypted form so that it is unreadable to rogue devices and applications. While this prevents the data from being read, it also requires that the data be continually maintained in an encrypted form. In order to process the data, the graphics system must decrypt the data on every read, process it, and re-encrypt it on every write back to the video memory. This leads to the impractical and undesirable scenario where several decryptor/encryptor pairs within the graphics system have to operate simultaneously at very high data rates.
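The cost of keeping data encrypted at all times in the video memory may be illustrated by the following sketch. It is an assumption-laden model rather than any particular prior art implementation: `toy_cipher` is a placeholder XOR function with no cryptographic strength, and `EncryptedSurface`, `brighten` and the `cipher_ops` counter are hypothetical names introduced only to count how many cipher passes a single processing operation incurs.

```python
def toy_cipher(key: bytes, data: bytes) -> bytes:
    """Placeholder XOR cipher (NOT secure); applying it twice restores data."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


class EncryptedSurface:
    """Hypothetical surface that exists in video memory only as ciphertext."""

    def __init__(self, key: bytes, pixels: bytes):
        self.key = key
        self.cipher_ops = 0  # counts every encrypt or decrypt pass
        self.stored = b""
        self.write(pixels)

    def read(self) -> bytes:
        # Every read from video memory requires a decryption pass.
        self.cipher_ops += 1
        return toy_cipher(self.key, self.stored)

    def write(self, pixels: bytes) -> None:
        # Every write back to video memory requires an encryption pass.
        self.cipher_ops += 1
        self.stored = toy_cipher(self.key, pixels)


def brighten(surface: EncryptedSurface, amount: int) -> None:
    """One processing step: decrypt on read, process, re-encrypt on write."""
    pixels = surface.read()
    processed = bytes(min(255, p + amount) for p in pixels)
    surface.write(processed)


surface = EncryptedSurface(b"\x5a\xa5", bytes([10, 20, 250]))
brighten(surface, 10)
print(surface.cipher_ops)
```

Each processing step adds one decryption and one encryption on top of the useful work, which is the overhead that multiplies when several units access the video memory concurrently.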
Another prior art solution is described by Glenn F. Evans in U.S. Pat. No. 7,065,651, issued Jun. 20, 2006. Evans discloses that data intended for use by a video card is selectively encrypted such that anytime the data is provided onto a bus between the video card and the computer system, the data is encrypted. Video memory is divided into protected and unprotected portions, where a respective pair of encryption/decryption keys is associated with each protected memory portion. When encrypted data is received onto the video card, the data is automatically decrypted with a decryption key associated with a protected memory portion into which the decrypted data is written. The GPU of the video card can then freely operate upon the decrypted data. If the data is to be moved to an unprotected portion of video memory or to memory remote from the video card, the data is encrypted with an associated encryption key before being moved. Evans also discloses variations in terms of the level of security afforded by the solution. For example, a tamper detection mechanism may be added to the video card, so that there is awareness when data has been altered in some fashion, while contents of overlay surfaces and/or command buffers may be encrypted. Furthermore, the GPU may be enabled to operate on encrypted content, all the while preventing its availability to untrusted parties, devices or software.
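The data flow described above, in which data is decrypted on entry into a protected memory portion and encrypted before leaving it, may be pictured with the following sketch. It is only a loose illustration and is not taken from Evans: the class `VideoCard`, its method names and the region names are hypothetical, and `toy_cipher` is a placeholder symmetric XOR function with no cryptographic strength, so the encryption/decryption key pair of each protected portion collapses here to a single key.

```python
def toy_cipher(key: bytes, data: bytes) -> bytes:
    """Placeholder XOR cipher (NOT secure); applying it twice restores data."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


class VideoCard:
    """Hypothetical model of video memory split into protected and
    unprotected portions, with one key per protected portion."""

    def __init__(self):
        self.keys = {}         # protected portion name -> key
        self.protected = {}    # protected portion name -> decrypted data
        self.unprotected = {}  # unprotected portion name -> data as stored

    def add_protected_portion(self, name: str, key: bytes) -> None:
        self.keys[name] = key
        self.protected[name] = b""

    def receive_from_bus(self, portion: str, encrypted: bytes) -> None:
        # Data arrives encrypted and is decrypted with the key of the
        # protected portion into which it is written.
        self.protected[portion] = toy_cipher(self.keys[portion], encrypted)

    def gpu_process(self, portion: str, operation) -> None:
        # The GPU operates freely on the decrypted data.
        self.protected[portion] = operation(self.protected[portion])

    def move_to_unprotected(self, portion: str, destination: str) -> None:
        # Data is encrypted before it leaves the protected portion.
        self.unprotected[destination] = toy_cipher(
            self.keys[portion], self.protected[portion]
        )


key = b"\x5a\xa5"
card = VideoCard()
card.add_protected_portion("A", key)
card.receive_from_bus("A", toy_cipher(key, b"frame"))
card.gpu_process("A", bytes.upper)
card.move_to_unprotected("A", "scratch")
```

In this model the plaintext exists only inside the protected portion; anything observed on the bus or in the unprotected portion is ciphertext.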
In the case of the prior art solution taught by Evans, the memory controller of the video card is fundamental to the operation of the video card, since it manages the memory on the video card. However, this memory controller is also critical to the success of the cryptographic protection scheme, since it implements the primary decryption functionality of the video card, decrypting received encrypted data into protected portions of the video memory and ensuring that any data transfers on the video card take place in a manner that ensures the protection of the unencrypted data. In another embodiment described by Evans, the memory controller enforces memory protection by controlling access to the protected portions of the video memory via an access control list, while it is the GPU that implements the decryption functionality of the cryptographic protection scheme.
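The access control list variant mentioned above may be sketched as follows. This is a minimal model under stated assumptions and does not reproduce Evans: the class `AccessControlledMemory`, the requester labels and the single protected address range are hypothetical, and the decryption performed by the GPU in that embodiment is not modeled.

```python
class AccessControlledMemory:
    """Hypothetical memory controller guarding one protected address range."""

    def __init__(self, size: int, protected: range, acl):
        self.cells = bytearray(size)
        self.protected = protected  # addresses in the protected portion
        self.acl = set(acl)         # requesters allowed into that portion

    def _check(self, requester: str, start: int, length: int) -> None:
        # An access is checked against the list only if it overlaps the
        # protected portion.
        touches = (start < self.protected.stop
                   and start + length > self.protected.start)
        if touches and requester not in self.acl:
            raise PermissionError(f"{requester} may not access protected memory")

    def read(self, requester: str, start: int, length: int) -> bytes:
        self._check(requester, start, length)
        return bytes(self.cells[start:start + length])

    def write(self, requester: str, start: int, data: bytes) -> None:
        self._check(requester, start, len(data))
        self.cells[start:start + len(data)] = data


memory = AccessControlledMemory(16, protected=range(8, 16), acl={"gpu"})
memory.write("gpu", 8, b"secret")
try:
    memory.read("rogue", 10, 2)
except PermissionError as error:
    print(error)
```

Because every access passes through the check, the memory controller in this model does protection work on each request in addition to its ordinary memory management.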
Unfortunately, in all the prior art implementations described above, key components of the graphics system, notably the memory controller and the GPU, are modified and/or used in order to implement the decryption functionality of the cryptographic protection scheme(s). Thus, the resources of the graphics system must be shared between the normal, desired graphics operations of the graphics system and the functionality designed to prevent pirate attacks on the secure data being processed by the graphics system. This leads not only to a more complicated graphics system, but also to a deterioration of the performance and speed of the graphics system.
Consequently, there exists a need in the industry for an improved method and system for cryptographically securing a graphics system in order to prevent pirating of secure data.