Memory protection is a way to control memory access rights on a computer, and is a part of most modern operating systems. The main purpose of memory protection is to prevent a process from accessing memory that has not been allocated to it. This prevents a bug or malicious code within one process from affecting other processes, or the operating system itself. Memory protection for computer security includes additional techniques such as address space layout randomization and executable space protection.
Modern processors may provide a key protection mechanism for accessing memory, wherein virtual or physical memory may be divided into blocks of a particular size or sizes (e.g., 4 KB), each of which has an associated numerical value called a protection key. Each process also has one or more protection key values associated with it. On a memory access, the hardware checks that one of the current process's protection keys matches the value associated with the memory block being accessed. If none matches, an exception or fault occurs. The protection key or keys associated with a process may be stored or cached in protection key registers. The protection key associated with virtual or physical memory may be stored or cached in operating system storage, page tables, a translation lookaside buffer (TLB), etc.
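The key check described above can be sketched as a small software model. This is only an illustration under stated assumptions, not a description of any particular processor: the names `Process`, `Memory`, `check_access`, and `ProtectionKeyFault` are hypothetical, a Python set stands in for the protection key registers, and a dictionary stands in for the per-block key storage.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096  # 4 KB blocks, matching the example block size in the text


class ProtectionKeyFault(Exception):
    """Hypothetical fault raised when no process key matches the block key."""


@dataclass
class Process:
    # Stands in for the protection key registers of the current process.
    keys: set


@dataclass
class Memory:
    # Maps block number -> protection key (stands in for page tables or a TLB).
    block_keys: dict = field(default_factory=dict)

    def check_access(self, process, address):
        """Return the block number if access is allowed, else raise a fault."""
        block = address // PAGE_SIZE
        key = self.block_keys[block]
        if key not in process.keys:
            raise ProtectionKeyFault(f"key {key} not held for block {block}")
        return block


# Block 0 is tagged with key 1, block 1 with key 2; the process holds key 1 only.
memory = Memory(block_keys={0: 1, 1: 2})
proc = Process(keys={1})

memory.check_access(proc, 0x0FFF)      # allowed: address falls in block 0
try:
    memory.check_access(proc, 0x1000)  # block 1 carries key 2, so this faults
except ProtectionKeyFault as fault:
    print("fault:", fault)
```

In this model the check is a single set-membership test per access, which is the part real hardware would perform alongside address translation.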
Modern processors may also include instructions for operations that are computationally intensive but offer a high level of data parallelism, which can be exploited through an efficient implementation using various data storage devices such as, for example, single-instruction multiple-data (SIMD) vector registers. In SIMD execution, a single instruction operates on multiple data elements concurrently or simultaneously. This is typically implemented by extending the width of various resources such as registers and arithmetic logic units (ALUs), allowing them to hold or operate on multiple data elements, respectively. The central processing unit (CPU) may provide such parallel hardware to support the SIMD processing of vectors. A vector is a data structure that holds a number of consecutive data elements. A vector register of size L may contain N vector elements of size M, where N = L/M. For instance, a 64-byte vector register may be partitioned into (a) 64 vector elements, with each element holding a data item that occupies 1 byte, (b) 32 vector elements to hold data items that occupy 2 bytes (or one “word”) each, (c) 16 vector elements to hold data items that occupy 4 bytes (or one “doubleword”) each, or (d) 8 vector elements to hold data items that occupy 8 bytes (or one “quadword”) each. Examples of SIMD vector registers may also include registers of various sizes, including one or more of the following: 64 bits, 128 bits, 256 bits, 512 bits, etc. Some processor architectures may include various SIMD instructions for loading and/or storing multiple data elements concurrently or simultaneously from and/or to locations in memory, wherein the locations in memory may be either sequential and/or contiguous, or non-sequential and/or non-contiguous, and the order in which these locations are accessed may vary somewhat unpredictably.
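The relation N = L/M can be checked with a few lines of arithmetic. The helper name `element_count` below is an assumption introduced only for this illustration; the sizes are the ones given in the 64-byte example.

```python
def element_count(register_bytes, element_bytes):
    """Return N = L / M for a register of L bytes and elements of M bytes."""
    if register_bytes % element_bytes != 0:
        raise ValueError("element size must divide the register size evenly")
    return register_bytes // element_bytes


REGISTER_BYTES = 64  # the 64-byte vector register from the example

# Element widths in bytes for the four partitionings named in the text.
ELEMENT_SIZES = {"byte": 1, "word": 2, "doubleword": 4, "quadword": 8}

for name, size in ELEMENT_SIZES.items():
    print(f"{name}: {element_count(REGISTER_BYTES, size)} elements")
# prints:
# byte: 64 elements
# word: 32 elements
# doubleword: 16 elements
# quadword: 8 elements
```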
In a processor that does not have a key protection mechanism for accessing memory, there may be data structures that users or the operating system would like to protect from being corrupted by a bug or by malicious code within some other process. Take, for example, a memory-accessing mechanism known as a global descriptor table (GDT). An operating system may need to access and/or modify the GDT, through use of instructions such as load global descriptor table register (LGDT) and store global descriptor table register (SGDT), frequently enough that changing the access rights of the GDT from read-only to read-write for each modification, and then back to read-only upon completion, presents an unacceptable performance barrier.
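The cost pattern described above can be made concrete with a toy software model. This is a sketch under loud assumptions: `ProtectedTable` and its methods are hypothetical, the table is an ordinary Python list rather than a real GDT, and the counter only tallies access-rights changes without modeling their actual latency.

```python
class ProtectedTable:
    """Toy model of a table that is kept read-only between modifications."""

    def __init__(self, entries):
        self.entries = list(entries)
        self.writable = False          # the table starts out read-only
        self.permission_changes = 0    # counts every access-rights switch

    def _set_writable(self, writable):
        # Stands in for an access-rights change on the table's memory.
        self.writable = writable
        self.permission_changes += 1

    def write_entry(self, index, value):
        if not self.writable:
            raise PermissionError("table is read-only")
        self.entries[index] = value

    def update(self, index, value):
        # Read-only -> read-write, modify, then back to read-only.
        self._set_writable(True)
        try:
            self.write_entry(index, value)
        finally:
            self._set_writable(False)


table = ProtectedTable([0] * 8)
for i in range(8):
    table.update(i, i + 1)

print(table.permission_changes)  # 16: two access-rights changes per update
```

Each modification pays for two access-rights changes in this model, so the overhead grows linearly with the modification rate, which is the bottleneck the paragraph above points to.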
To date, potential solutions to such performance-limiting issues, to the microarchitectural costs of supporting a key protection mechanism, and to access-rights and memory-protection bottlenecks have not been adequately explored.