Data breaches are an increasing concern as more and more data is stored digitally. For example, data breaches are arguably the main deterrent to the adoption of cloud services for applications that manage sensitive, business-critical information. On a public cloud, applications must guard against potentially malicious cloud administrators, malicious co-tenants, and other entities that can obtain access to data through various legal means. Since the compute and storage platform itself cannot be trusted, any data that appears in cleartext anywhere on the cloud platform (on disk, in memory, or over the wire) must be considered susceptible to leakage or malicious corruption. In verticals such as finance, banking, and healthcare, compliance requirements mandate strong protection against these types of threats. However, existing security solutions such as Transparent Data Encryption (TDE) and Transport Layer Security (TLS) protect data only at rest and in transit, respectively, and leave data vulnerable during computation (data in use).
One way of addressing this issue is the use of partially homomorphic encryption (PHE) schemes. These are encryption schemes that permit a restricted class of computation directly on encrypted data. Other mechanisms, such as secure hardware, also exist that allow some computation to be performed on encrypted data. These solutions are not general purpose, however, because they do not enable all types of operations to be performed on encrypted data. Therefore, users of platforms that employ these mechanisms must analyze their applications and decide whether it is possible to encrypt some part of their data while preserving application semantics. Making such a determination can be extraordinarily difficult as each data element referenced by an application may be subject to a complex set of constraints and dependencies that can limit the type of encryption that may be applied thereto. Moreover, inaccurate determinations in this regard can result in a failure to apply the strongest possible encryption to certain data elements as well as a failure of application logic due to an inability to perform certain operations on encrypted data.
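To make the notion of a "restricted class of computation" concrete, the following is a minimal sketch of one well-known additively homomorphic scheme, Paillier encryption. It is offered only as an illustration and is not drawn from the systems discussed here. The tiny fixed primes and the function names (encrypt, decrypt, add_encrypted) are assumptions chosen for readability, and the parameters are far too small to be secure.

```python
import math
import secrets

# Toy Paillier parameters. The primes are tiny and fixed purely for
# illustration (an assumption of this sketch); they provide no security.
P, Q = 1789, 1861
N = P * Q                      # public modulus
N_SQ = N * N
G = N + 1                      # standard simple choice of generator
LAMBDA = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)  # lcm(P-1, Q-1)
MU = pow(LAMBDA, -1, N)        # modular inverse; valid because G = N + 1


def encrypt(m: int) -> int:
    """Encrypt an integer 0 <= m < N with fresh randomness."""
    while True:
        r = secrets.randbelow(N - 1) + 1   # r in [1, N-1]
        if math.gcd(r, N) == 1:
            break
    return (pow(G, m, N_SQ) * pow(r, N, N_SQ)) % N_SQ


def decrypt(c: int) -> int:
    """Recover the plaintext using the private values LAMBDA and MU."""
    x = pow(c, LAMBDA, N_SQ)
    return ((x - 1) // N * MU) % N


def add_encrypted(c1: int, c2: int) -> int:
    """Multiplying ciphertexts yields an encryption of the plaintext sum."""
    return (c1 * c2) % N_SQ


if __name__ == "__main__":
    total = add_encrypted(encrypt(120), encrypt(45))
    print(decrypt(total))
```

In this sketch, an untrusted party holding only ciphertexts can compute an encrypted sum, but it cannot compare two encrypted values or multiply their plaintexts. That gap is what makes such schemes partially, rather than fully, homomorphic, and it is why an application must be analyzed to determine which operations each data element actually requires.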
A related system that addresses the problem of data security is CryptDB (see Raluca Ada Popa et al., “CryptDB: Protecting Confidentiality with Encrypted Query Processing,” in Proceedings of the 23rd ACM Symposium on Operating Systems Principles (SOSP), Cascais, Portugal, October 2011). Generally speaking, CryptDB is a database that uses PHE schemes to execute queries against encrypted data. CryptDB requires developers to specify the strongest encryption scheme that can be applied to each database column. In the absence of any such specification, CryptDB assumes that a column can be maintained in cleartext. For the reasons discussed above, determining the strongest encryption scheme that can be applied to each database column can be extremely difficult, and errors in this regard can result in under-secured columns as well as failures in application code that operates on encrypted columns.
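The two failure modes described above can be sketched as follows. This is a hypothetical illustration only and does not reflect CryptDB's actual interface: the scheme names, the table of supported operations, and the functions effective_scheme and broken_operations are all assumptions introduced for this example.

```python
# Hypothetical table of which operations each kind of scheme permits on
# ciphertext. The names and groupings are assumptions for illustration.
SUPPORTED_OPS = {
    "randomized": frozenset(),                           # no computation
    "deterministic": frozenset({"equals"}),              # equality only
    "order_preserving": frozenset({"equals", "less_than"}),
    "cleartext": frozenset({"equals", "less_than", "add", "like"}),
}


def effective_scheme(column: str, annotations: dict) -> str:
    """Columns without a developer annotation fall back to cleartext."""
    return annotations.get(column, "cleartext")


def broken_operations(column_ops: dict, annotations: dict) -> dict:
    """Return, per column, the operations its chosen scheme cannot support."""
    problems = {}
    for column, ops in column_ops.items():
        scheme = effective_scheme(column, annotations)
        missing = set(ops) - SUPPORTED_OPS[scheme]
        if missing:
            problems[column] = missing
    return problems


if __name__ == "__main__":
    # Operations the application performs on each column (hypothetical).
    column_ops = {
        "ssn": {"equals"},
        "salary": {"less_than"},
        "notes": {"like"},
    }
    # The developer annotated two columns and forgot the third.
    annotations = {"ssn": "deterministic", "salary": "randomized"}

    print(effective_scheme("notes", annotations))
    print(broken_operations(column_ops, annotations))
```

Here the unannotated notes column silently remains in cleartext (an under-secured column), while the salary column is annotated with a scheme too strong for the range comparison the application performs (an application failure). Both errors stem from the same manual annotation step.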