Cloud computing allows clients to outsource their computations to untrusted cloud service providers. Ensuring privacy of code and data while executing software on a computer physically owned and maintained by an untrusted party is challenging. A potential attacker may have physical access to the datacenter, leaving its hardware vulnerable to physical attacks such as probing of the memory bus.
A common solution is to reduce the attack surface by minimizing the trusted computing base (TCB) to a secure processor and a small portion of the client's application. INTEL SOFTWARE GUARD EXTENSIONS (SGX) is a recent form of hardware support for building trusted computing systems and provides hardware primitives for this purpose. An SGX-enabled secure processor seeks to isolate the code and data of an application's private enclave functions from the rest of the system, including the application's own public functions, system software, and hardware peripherals.
An enclave is a secure container that contains both private data and the code that operates on the private data. The application is responsible for specifying the parameters of the enclaves and for invoking the enclaves through special CPU instructions. When an enclave is invoked, the untrusted system software loads the enclave contents into the portion of the protected memory allocated for the enclave's execution. The secure processor computes the enclave's measurement hash over the initial data and code, which the remote client uses for software attestation. Thereafter, the enclave is executed in a protected mode in which hardware checks ensure that every memory access to protected memory is from its enclave.
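The measurement and attestation step described above can be illustrated with a minimal sketch. It is not the SGX measurement procedure; the function names `measure_enclave` and `client_accepts`, the region labels, and the choice of SHA-256 with length prefixes are assumptions made for illustration only.

```python
import hashlib
import hmac


def measure_enclave(code: bytes, data: bytes) -> str:
    """Hash the initial enclave contents (illustrative, not the SGX format)."""
    h = hashlib.sha256()
    for label, blob in ((b"code", code), (b"data", data)):
        # Label and length-prefix each region so that moving bytes between
        # regions changes the digest.
        h.update(label)
        h.update(len(blob).to_bytes(8, "little"))
        h.update(blob)
    return h.hexdigest()


def client_accepts(reported: str, expected_code: bytes, expected_data: bytes) -> bool:
    """Remote client recomputes the measurement it expects and compares."""
    expected = measure_enclave(expected_code, expected_data)
    return hmac.compare_digest(reported, expected)
```

In this sketch the client accepts only if the reported measurement matches the one it computes from the code and data it intended to run, so any change to the initial enclave contents made by the untrusted loader is detected.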
A significant challenge with secure processors such as SGX is providing defenses against memory bus side channel attacks and cold reboot attacks. While the secure processor is trusted, the memory and the memory bus are not. A conventional secure processor guarantees confidentiality by encrypting data before sending it to memory. In addition, by storing a hash-based message authentication code (HMAC) along with the encrypted data in memory, a secure processor can check the integrity and freshness of data when it is read back. To guarantee the freshness of data, an adversary must be prevented from rolling back the state of a memory block by recording and replaying older packets (either by manipulating values in memory or by tampering with them while they are transmitted over the bus). To defeat such replay attacks, a conventional secure processor uses Merkle trees to maintain the current versions of memory blocks and to verify that read responses return the latest versions. However, Merkle trees impose severe memory space and bandwidth requirements.
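The integrity and freshness checks described above can be sketched as follows. This is a simplified model under stated assumptions: encryption is omitted (payloads are treated as already encrypted), and the per-block version counters are assumed to fit in trusted on-chip storage, which is the role a Merkle tree plays when they do not. The class names `UntrustedMemory` and `SecureProcessor` are hypothetical.

```python
import hashlib
import hmac


class UntrustedMemory:
    """Off-chip store; an adversary may read or overwrite any entry."""

    def __init__(self):
        self.blocks = {}  # address -> (payload, tag)


class SecureProcessor:
    def __init__(self, key: bytes, memory: UntrustedMemory):
        self._key = key
        self._mem = memory
        self._versions = {}  # trusted, on-chip version counter per address

    def _tag(self, addr: int, version: int, payload: bytes) -> bytes:
        # Binding the address and version into the MAC is what lets a
        # stale (replayed) or relocated block be rejected.
        msg = addr.to_bytes(8, "little") + version.to_bytes(8, "little") + payload
        return hmac.new(self._key, msg, hashlib.sha256).digest()

    def write(self, addr: int, payload: bytes) -> None:
        version = self._versions.get(addr, 0) + 1
        self._versions[addr] = version
        self._mem.blocks[addr] = (payload, self._tag(addr, version, payload))

    def read(self, addr: int) -> bytes:
        payload, tag = self._mem.blocks[addr]
        expected = self._tag(addr, self._versions[addr], payload)
        if not hmac.compare_digest(tag, expected):
            raise ValueError("integrity or freshness check failed")
        return payload
```

If an adversary restores an older (payload, tag) pair for an address, the tag no longer matches the current on-chip version, so the read fails even though the older pair was once valid.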
These solutions, however, do not prevent an adversary from observing memory addresses, access types (e.g., read/write), trace length, and access times by probing the memory bus. Researchers have shown that, just by observing memory addresses, an adversary can infer an execution's control flow, and thereby infer sensitive program inputs and cryptographic keys. Defending against memory bus side channel attacks requires solutions to at least three problems: data and address confidentiality, data integrity and freshness, and timing channel leaks.
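As a toy illustration of how an address trace alone can reveal secret-dependent control flow, consider a routine that touches one of two tables depending on each secret bit. The base addresses, the function names `lookup_trace` and `infer_bits`, and the one-access-per-bit model are all assumptions for illustration; real attacks work on far noisier traces.

```python
def lookup_trace(secret_bits, base_a=0x1000, base_b=0x2000):
    """Return the addresses a secret-dependent branch would put on the bus."""
    trace = []
    for i, bit in enumerate(secret_bits):
        # The branch taken decides which table is read, so the address
        # itself encodes the secret bit even if the data is encrypted.
        base = base_b if bit else base_a
        trace.append(base + i)
    return trace


def infer_bits(trace, base_b=0x2000):
    """Bus observer: recover each bit from which table was touched."""
    return [1 if addr >= base_b else 0 for addr in trace]
```

In this model the observer recovers every secret bit without decrypting any data, which is why encrypting and authenticating memory contents is not sufficient on its own.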
To protect the confidentiality of addresses (as well as access types and write sets), prior solutions employ expensive oblivious RAM (ORAM) schemes. To obfuscate the address pattern, an ORAM access may require, depending on the memory size, one to two orders of magnitude more memory accesses than a normal dynamic random-access memory (DRAM) access. Recent hardware innovations have brought the performance cost down to about four times that of non-ORAM memory. However, this reduction in performance cost comes with a significant increase in hardware complexity and space overhead.
In addition, ORAM does not protect either memory access times or the total number of memory accesses from leaking. To protect this information from leaking, a technique called memory-trace obliviousness (MTO) may be used. To guarantee MTO, the number and type of instructions executed, as well as their execution time, must be independent of all sensitive inputs to a program. This technique requires a deterministic compiler and a hardware solution that prohibits almost all commonly used optimizations (e.g., caches, instruction re-ordering, and speculation). Also, the input program needs to obey non-trivial constraints. For example, loop guards need to be independent of sensitive input.
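The loop-guard constraint can be illustrated with a small sketch that contrasts a loop whose trip count depends on a secret with one bounded by a public length. The helper `TracedArray` and the functions `leaky_sum` and `oblivious_sum` are hypothetical names. The sketch only shows the program structure: Python itself gives no timing guarantees, and a real MTO system would also need the deterministic compiler and hardware support described above.

```python
class TracedArray:
    """Array wrapper that records every index read (models the bus trace)."""

    def __init__(self, values):
        self._values = list(values)
        self.trace = []

    def __len__(self):
        return len(self._values)

    def __getitem__(self, i):
        self.trace.append(i)
        return self._values[i]


def leaky_sum(values, secret_n):
    total = 0
    for i in range(secret_n):  # loop guard depends on the secret
        total += values[i]
    return total


def oblivious_sum(values, secret_n):
    total = 0
    for i in range(len(values)):  # loop guard depends only on a public bound
        take = int(i < secret_n)  # 1 or 0; a select rather than a branch
        total += take * values[i]  # every element is read regardless
    return total
```

Both functions return the same sum, but only `oblivious_sum` reads the same sequence of indices for every value of `secret_n`, so the recorded trace length and contents do not depend on the secret.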
Innovations in the field of 3D integration have led to the rise of 3D-DRAM devices such as the HYBRID MEMORY CUBE (HMC). A typical 3D-DRAM consists of several layers of DRAM dies stacked on top of each other, with a logic layer at the bottom, all internally connected using through-silicon vias. It is almost impossible to physically probe the through-silicon vias without destroying the 3D package. The layers of DRAM are partitioned vertically into vaults. Each vault consists of several DRAM banks, and the vaults can be accessed in parallel. A 3D-DRAM device is connected to a host processor through a conventional memory bus, such as a Serializer/Deserializer (SerDes) link. Unlike the double data rate (DDR) interface of traditional DRAM, which uses low-level commands, a 3D-DRAM device is exposed through a more flexible packet interface.
The background description provided here is for the purpose of generally presenting the context of the disclosure. Work of the presently named inventors, to the extent it is described in this background section, as well as aspects of the description that may not otherwise qualify as prior art at the time of filing, are neither expressly nor impliedly admitted as prior art against the present disclosure.