Cryptography is directed to secure communication techniques in the presence of third parties, known as adversaries. More generally, cryptography includes constructing and analyzing protocols that block adversaries to ensure data confidentiality, data integrity, authentication, and non-repudiation.
Completeness results in cryptography provide general transformations from arbitrary functionalities described in a particular computational model to solutions for executing the functionality securely within a desired adversarial model. Certain previous results modeled computation as Boolean circuits and showed how to emulate the circuit securely, gate by gate.
As the complexity of modern computing tasks scales at tremendous rates, it has become clear that the circuit model is not appropriate. In particular, first converting “lightweight” optimized programs into circuits in order to obtain security is not a viable option. Considerable effort has recently been focused on enabling direct support of functionalities modeled as Turing machines or random-access machines (RAMs). This approach avoids several sources of expensive overhead in converting modern programs into circuit representations. However, it introduces a different dimension of inefficiency: RAMs and single-tape Turing machines do not support parallelism. Thus, even if an insecure program can be heavily parallelized, its secure version will be inherently sequential.
Modern computing architectures are better captured by the notion of a Parallel RAM (PRAM). In the PRAM model of computation, several (polynomially many) Central Processing Units (CPUs) run simultaneously, accessing the same shared “external” memory. It should be noted that PRAM CPUs can model physical processors within a single multicore system, as well as distinct computing entities within a distributed computing environment.
A machine is said to be memory oblivious, or simply oblivious, if the sequences of memory accesses made by the machine on two inputs with the same running time are identically (or close to identically) distributed. It has been previously shown that a Turing machine can be compiled into an oblivious one with only a logarithmic slowdown in running time. Roughly ten years later, the notion of Oblivious RAM (ORAM) was proposed, together with a similar transformation result with polylogarithmic slowdown. In recent years, ORAM compilers have become a central tool in developing cryptography for RAM programs, and a great deal of research has gone toward improving both the asymptotic and concrete efficiency of ORAM compilers. However, for all such compilers, the resulting program is inherently sequential.
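The definition of obliviousness above can be illustrated with a minimal Python sketch. It is not an ORAM compiler and is not drawn from any particular construction; the function names `direct_read` and `linear_scan_read` are hypothetical, and only the sequence of addresses touched is modeled. A direct read exposes the requested index in its access trace, whereas a linear scan touches every cell in the same order regardless of the index, so its trace is identical for any two inputs of the same size (at the cost of a linear rather than polylogarithmic slowdown).

```python
from typing import List, Tuple


def direct_read(memory: List[int], index: int) -> Tuple[int, List[int]]:
    """Read memory[index] directly. The access trace is just [index],
    so an observer of the trace learns which cell was requested."""
    trace = [index]
    return memory[index], trace


def linear_scan_read(memory: List[int], index: int) -> Tuple[int, List[int]]:
    """Read memory[index] by touching every cell in order.
    The trace is always [0, 1, ..., len(memory) - 1], independent of index.
    Only the address trace is modeled here; timing and other side
    channels are outside the scope of this sketch."""
    trace: List[int] = []
    result = 0
    for address, value in enumerate(memory):
        trace.append(address)      # every address is accessed
        if address == index:       # selection happens in local state only
            result = value
    return result, trace


memory = [10, 20, 30, 40]
_, direct_a = direct_read(memory, 1)
_, direct_b = direct_read(memory, 3)
_, scan_a = linear_scan_read(memory, 1)
_, scan_b = linear_scan_read(memory, 3)
print(direct_a != direct_b)  # traces differ: the index leaks
print(scan_a == scan_b)      # traces match: the access pattern is oblivious
```

The linear scan is the simplest oblivious strategy; the compilers discussed above aim for the same trace-independence with far less overhead.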
ORAM lies at the base of a wide range of cryptographic applications in which parallelism within the corresponding secure application is desired. Hiding correlated lookups while maintaining efficiency is perhaps the core challenge in building oblivious RAMs. To bypass this problem, ORAM compilers may depend heavily on the ability of the CPU to move data around and to update its secret state after each memory access. However, in the parallel setting, having all processors attempt to perform a lookup directly within a standard ORAM construction corresponds to running the ORAM several times without moving data or updating state, which immediately breaks security in all existing ORAM compiler constructions. Furthermore, having the CPUs take turns accessing and updating the data sequentially is generally not affordable, since doing so gives up the benefit of parallel execution.
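The leakage caused by performing several lookups against the same unchanged state can be sketched with a toy example. This is an illustration under simplifying assumptions, not any existing ORAM construction: the secret state is reduced to a random permutation from logical to physical addresses, and the helper names `make_position_map` and `observed_addresses` are hypothetical. Because the permutation is not refreshed between lookups, two CPUs that request the same logical address touch the same physical address, so an observer can tell correlated requests apart from distinct ones.

```python
import random
from typing import Dict, List


def make_position_map(size: int, seed: int) -> Dict[int, int]:
    """Hypothetical secret state: a random permutation that maps each
    logical address to a physical address."""
    rng = random.Random(seed)
    slots = list(range(size))
    rng.shuffle(slots)
    return dict(enumerate(slots))


def observed_addresses(position_map: Dict[int, int],
                       requests: List[int]) -> List[int]:
    """Physical addresses an observer sees when every CPU performs its
    lookup against the same position map, with no data movement and no
    state update between lookups."""
    return [position_map[logical] for logical in requests]


position_map = make_position_map(size=8, seed=1)

# CPUs 0 and 1 request the same logical address.
correlated = observed_addresses(position_map, [2, 2, 5])
# All three CPUs request distinct logical addresses.
distinct = observed_addresses(position_map, [2, 3, 5])

# A repeated physical address reveals that two requests were equal.
print(len(set(correlated)), len(set(distinct)))
```

In a sequential ORAM, the accessed block would be moved and the state updated after each lookup, so the repetition would not be visible; the sketch only shows what is lost when that step is skipped.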
Therefore, there is a need to formulate cryptographic primitives that directly support PRAM computations while ensuring that secret information is not leaked via the memory access patterns of the resulting program execution.