Protecting secure data stored or used by the processors of a data processing system is of critical importance in many data processing applications. Encryption algorithms are typically applied to secure data to render it unintelligible without application of a decryption algorithm, and secure data is typically stored in mass storage and other non-volatile storage media in an encrypted format, requiring decryption to be performed before the secure data can be read and/or manipulated by a processor in a data processing system. However, in many instances the decryption of encrypted secure data results in the secure data being stored in an unencrypted form in various types of volatile memory in a data processing system, e.g., within a main memory or within various levels of cache memories that are used to accelerate accesses to frequently-used data. Any time that data is stored in an unsecured form in any memory of a data processing system, however, that data may be subject to unauthorized access, potentially compromising the confidential nature of the data.
Encrypting and decrypting data, however, typically requires some amount of processing overhead, and as such, even in applications where secure data is being processed, it is also desirable to retain other, non-secure data in a data processing system so that processing of that other data is not subject to the same processing overhead associated with encryption and decryption.
In addition, as semiconductor technology continues to inch closer to practical limitations in terms of increases in clock speed, architects are increasingly focusing on parallelism in processor architectures to obtain performance improvements. At the chip level, multiple processing cores are often disposed on the same chip, functioning in much the same manner as separate processor chips, or to some extent, as completely separate computers. In addition, even within cores, parallelism is employed through the use of multiple execution units that are specialized to handle certain types of operations. Pipelining is also employed in many instances so that certain operations that may take multiple clock cycles to perform are broken up into stages, enabling other operations to be started prior to completion of earlier operations. Multithreading is also employed to enable multiple instruction streams to be processed in parallel, enabling more overall work to be performed in any given clock cycle.
Due to this increased parallelism, the challenges of maintaining secure data in a data processing system are more significant than in prior, non-parallel data processing systems. In a data processing system that only includes a single processor with a single thread, for example, secure data may be stored in an encrypted form outside of the processor, and decrypted as necessary by that single thread once the data is loaded into the processor. When additional threads, and even additional processing cores are disposed on the same processor chip, however, it may be necessary to limit access to secure data to only certain threads or processing cores on the chip. Thus, for example, if multiple threads or processing cores share a common cache memory, storing any secure data in an unencrypted form in that cache memory may present a risk that an unauthorized party may obtain access to that data via a thread or processing core other than that which is authorized to access the secure data. Furthermore, as modern system on chip (SOC) processor designs grow to hundreds of processing cores on a processor chip, it becomes increasingly important to protect unencrypted data from even other processes on the same processor chip.
Conventionally, encryption and decryption have been handled by software executing on a processor. Encryption and decryption, however, are processor-intensive tasks, and as a result, dedicated hardware-based encryption engines have been developed to perform encryption/decryption of secure data in a faster and more efficient manner than can typically be achieved by software, thereby reducing the overhead associated with such operations. Conventional encryption engines are typically disposed external to a processing core, e.g., between the processing core and a memory controller, or otherwise coupled to a memory bus that is external to any processing core. Furthermore, to facilitate determining what data is and is not encrypted, secure data may be stored in specific memory address regions, such that filtering may be used to control an encryption engine to encrypt/decrypt only data that is stored in identified ranges of memory addresses.
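By way of illustration only, the address-range filtering described above may be sketched in software as follows. The sketch is a simplified model under stated assumptions, not a description of any particular conventional engine: the names AddressRange and RangeFilteredEngine, the example address range, and the key are hypothetical, and a single-byte XOR stands in for a real cipher purely so that the example is self-contained.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AddressRange:
    """Hypothetical descriptor for a memory region holding secure data."""
    start: int  # inclusive
    end: int    # exclusive

    def contains(self, addr: int) -> bool:
        return self.start <= addr < self.end


class RangeFilteredEngine:
    """Toy model of an engine sitting between a core and memory.

    Assumption: a one-byte XOR is used as a placeholder transform.
    It is NOT a secure cipher; it only marks where a real one would run.
    """

    def __init__(self, secure_ranges, key: int):
        self.secure_ranges = list(secure_ranges)
        self.key = key & 0xFF

    def is_secure(self, addr: int) -> bool:
        # The filter: only addresses inside an identified range are transformed.
        return any(r.contains(addr) for r in self.secure_ranges)

    def _transform(self, data: bytes) -> bytes:
        # XOR is its own inverse, so one routine serves both directions.
        return bytes(b ^ self.key for b in data)

    def on_write(self, addr: int, data: bytes) -> bytes:
        """Core -> memory: encrypt only if the address is in a secure range."""
        return self._transform(data) if self.is_secure(addr) else data

    def on_read(self, addr: int, data: bytes) -> bytes:
        """Memory -> core: decrypt only if the address is in a secure range."""
        return self._transform(data) if self.is_secure(addr) else data


# Usage: one hypothetical secure region at 0x1000-0x1FFF.
engine = RangeFilteredEngine([AddressRange(0x1000, 0x2000)], key=0x5A)
stored_secure = engine.on_write(0x1800, b"secret")  # transformed before storage
stored_plain = engine.on_write(0x3000, b"plain")    # passes through unchanged
```

In this model, whatever on_read returns for a secure address is plaintext, and anything on the core side of the engine, including any shared cache, holds the data in that unencrypted form.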
Such an architecture, however, can lead to unencrypted data existing in caches and being accessible by other threads and/or processing cores in a chip. Furthermore, a memory controller, resident outside of a processor chip, is typically required to establish and manage ranges of memory addresses in which secure data is stored so that an encryption engine can be selectively activated for memory transactions involving secure data, resulting in inefficient throughput for secure data.
Similar challenges also exist with respect to data compression. Data compression may be used to reduce the amount of memory required to store data; however, compressed data must be decompressed prior to use by a processing core or thread. Compression and decompression of data involve processing overhead, and as such, implementing such functions in software often comes with a performance penalty. Dedicated compression engines have also been developed to reduce the processing overhead associated with compressing and decompressing data; however, such engines are typically disposed external to a processing core, e.g., within a memory controller, and as a result, compressed data may be required to be stored in various levels of cache memory in a decompressed format, which limits the amount of room for storing other data in such cache memories, thereby reducing memory system performance. In addition, as with encrypted data, a memory controller may be required to establish and manage ranges of memory addresses in which compressed data is stored so that a compression engine can be selectively activated for memory transactions involving compressed data, resulting in inefficient throughput for compressed data.
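The effect on cache capacity noted above may be illustrated with a rough, software-only calculation. The sketch below is an assumption-laden approximation: the 128-byte cache line size and the helper lines_needed are hypothetical, the data block is deliberately repetitive so that it compresses well, and zlib is used only as a convenient stand-in for whatever algorithm a hardware compression engine might implement.

```python
import zlib

CACHE_LINE_BYTES = 128  # hypothetical cache line size, for illustration only


def lines_needed(n_bytes: int, line_bytes: int = CACHE_LINE_BYTES) -> int:
    """Number of whole cache lines required to hold n_bytes (ceiling division)."""
    return -(-n_bytes // line_bytes)


# A highly repetitive 1024-byte block, chosen so that it compresses well.
block = b"ABCD" * 256
compressed = zlib.compress(block)

# If the engine sits outside the core, the cache must hold the decompressed form.
decompressed_lines = lines_needed(len(block))    # 1024 / 128 = 8 lines
# If the data could remain compressed in the cache, fewer lines would be occupied.
compressed_lines = lines_needed(len(compressed))

print(f"decompressed: {decompressed_lines} lines, compressed: {compressed_lines} lines")
```

The difference between the two line counts approximates the cache space that becomes unavailable for other data when compressed data must be held in decompressed form; the actual ratio depends on the data and the compression algorithm.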
Therefore, a significant need continues to exist in the art for a manner of minimizing the performance overhead associated with accessing and managing encrypted and/or compressed data in a data processing system, as well as providing further protection of encrypted data within a multithreaded and/or multi-core processor chip and a data processing system incorporating the same.