Protecting secure data stored or used by the processors of a data processing system is of critical importance in many data processing applications. Encryption algorithms are typically applied to secure data to render it unintelligible without application of a decryption algorithm, and secure data is typically stored in mass storage and other non-volatile storage media in an encrypted format, requiring decryption to be performed before the secure data can be read and/or manipulated by a processor in a data processing system. In many instances, however, the decryption of encrypted secure data results in the secure data being stored in an unencrypted form in various types of volatile memory in a data processing system, e.g., within a main memory or within various levels of cache memories that are used to accelerate accesses to frequently-used data. Any time data is stored in an unsecured form in any memory of a data processing system, that data may be subject to unauthorized access, potentially compromising the confidential nature of the data.
Encrypting and decrypting data, however, typically requires some amount of processing overhead, and as such, even in applications where secure data is being processed, it is also desirable to retain other, non-secure data in a data processing system so that processing of that other data is not subject to the same processing overhead associated with encryption and decryption.
In addition, as semiconductor technology approaches practical limits on increases in clock speed, architects are increasingly focusing on parallelism in processor architectures to obtain performance improvements. At the chip level, multiple processing cores are often disposed on the same chip, functioning in much the same manner as separate processor chips, or to some extent, as completely separate computers. In addition, even within cores, parallelism is employed through the use of multiple execution units that are specialized to handle certain types of operations. Pipelining is also employed in many instances so that certain operations that may take multiple clock cycles to perform are broken up into stages, enabling other operations to be started prior to completion of earlier operations. Multithreading is also employed to enable multiple instruction streams to be processed in parallel, enabling more overall work to be performed in any given clock cycle.
Due to this increased parallelism, the challenges of maintaining secure data in a data processing system are more significant than in prior, non-parallel data processing systems. In a data processing system that only includes a single processor with a single thread, for example, secure data may be stored in an encrypted form outside of the processor, and decrypted as necessary by that single thread once the data is loaded into the processor. When additional threads, and even additional processing cores are disposed on the same processor chip, however, it may be necessary to limit access to secure data to only certain threads or processing cores on the chip. Thus, for example, if multiple threads or processing cores share a common cache memory, storing any secure data in an unencrypted form in that cache memory may present a risk that an unauthorized party may obtain access to that data via a thread or processing core other than that which is authorized to access the secure data. Furthermore, as modern system on chip (SOC) processor designs grow to hundreds of processing cores on a processor chip, it becomes increasingly important to protect unencrypted data from even other processes on the same processor chip.
Furthermore, even from the standpoint of individual threads in a given processor or processing core, a risk may exist that secure data may be compromised as a result of virtualization. Virtualization may be used at different levels of a data processing system to support the concurrent execution of multiple user processes or applications. A processor hosting a single operating system, for example, may support the concurrent execution of multiple processes in a single operating environment, and may perform context switches to switch between the different processes at relatively frequent intervals such that the multiple processes appear to run in parallel. During a context switch, the internal architected state, or “context,” of a processor when executing one process is stored and a previously-stored state for another process is loaded into the processor so that when the processor begins to execute the other process, the internal architected state of the processor is the same as it was when a context switch was made away from that other process.
Likewise, when a processor hosts multiple operating systems within multiple virtual machines or operating environments, a hypervisor may transition between these different virtual operating environments using a process that is similar to a context switch, and as such, the term “context switch” is used hereinafter to include not only context switches performed by an operating system, but also hypervisor-initiated transitions between virtual operating environments, or any other instances where the internal architected state of a processor is temporarily saved and later restored such that program code executing when the internal state of the processor is saved can be resumed when that state is restored as if execution of the program code had never been interrupted.
When a processor transitions between different contexts or virtual machines, however, a risk exists that some data and portions of the architected state may be left behind from a previous context or virtual machine. For example, where a hypervisor controls a data processing system and manages different operating systems running under virtual machines, there may be a danger that one operating system could access data or other state information from a previously-executed virtual machine. Conventional cache invalidate instructions, as just one example, invalidate a cache line by setting an invalidate bit, and otherwise leave the data in the invalidated cache line intact until a new cache line is loaded into the same physical storage. A subsequent operating system could therefore potentially access debug control registers and read the data left in a cache by a prior operating system.
While this risk is not a significant concern for many applications, in some high-security applications the risk that data and/or architected state information associated with one context or virtual machine may be accessed after a context switch precludes the use of some virtualization techniques in those applications. In many government applications, for example, virtual machines may not be permitted as a result of this risk, and it is believed that this risk could be even greater in cloud computing applications, where processes owned by completely different entities are virtualized to execute on the same physical hardware.
Therefore, a significant need continues to exist in the art for a manner of securing data and architected state information utilized by multiple processes running on a processor or processing core.