A cloud service provider stores data in the cloud, such that data storage is made available as a service via a network. For example, data logically stored in a virtual storage pool may be physically distributed across multiple servers and across multiple geographic locations. Such physical distribution promotes massive scalability based on commodity infrastructure, but leads to increased risk of unauthorized physical access to data. Under a traditional cloud service provider environment, data in the cloud is encrypted to protect against unauthorized data access. For example, encryption of data at rest protects data that is stored at a cloud service provider and encryption of data in transit protects data that is transmitted to and from the cloud service provider. As can be imagined, the vast diversity and quantity of information stored in the cloud is a valuable target for data vandals. In order to promote economic growth and user adoption of cloud services, cloud service providers must be continually vigilant to protect data against breaches.
Cryptographic mechanisms provide security services for electronic data storage, such as data stored in the cloud. Cryptography is the process of encrypting ordinary information, called plaintext, into unintelligible data, called cyphertext, and decrypting the cyphertext back to plaintext. An algorithm is used with auxiliary cryptographic information, such as a cryptographic key, to perform the encryption and decryption. While it may be difficult to keep an algorithm secret, a key is often easier to protect as it is typically a small piece of information that can be changed as needed to prevent security breaches. Keeping keys secret is part of the process of key management, which also includes the secure generation, exchange, storage, distribution, use, replacement, and destruction of keys. Just as a strong safe is only secure if the safe's combination is kept secret, a strong algorithm is directly dependent on secure key management to keep a cryptographic key secret.
In a traditional cryptographic service environment, a key management service provides cryptographic keys. For example, to encrypt plaintext or plain data into cyphertext or cypher data, first a master key is created and identified with a master key identifier. Next, a key can be generated from the master key to encrypt data, and the key management service can return the key both as a plain (unencrypted) key and as a cypher (encrypted) key, which was encrypted by the master key. The plain key can be used to encrypt plain data into cypher data, after which the cypher data and the cypher key can be stored together and the plain key can be erased from memory. Later, to decrypt the cypher data, the cypher key can be transmitted to the key management service, which can decrypt the cypher key using the master key and return the plain key. The plain key can then be used to decrypt the cypher data, yielding the original plain data, and the plain key can be erased from memory. This traditional cryptographic service environment is advantageous because, should a data vandal obtain the cypher data without authorization, the cypher data is useless to the vandal, who does not possess the applicable key in plain form: only a cypher key in encrypted form is stored with the cypher data. In other words, with a key management system, the key material needed to recover a plain key is kept in a different location than the cypher data, and users must authenticate themselves with the key management service provider to gain access to the plain key in order to decrypt the cypher data.
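The key flow described above can be sketched as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the class ToyKeyManagementService, its method names, and the helpers toy_encrypt and toy_decrypt are hypothetical names introduced here, the data key is assumed to be generated randomly and then encrypted under the master key, and the cipher is a toy SHA-256 keystream standing in for a real authenticated cipher such as AES-GCM.

```python
import hashlib
import secrets


def _keystream_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with SHA-256(key || nonce || counter) blocks.

    Toy stand-in for a real cipher: unauthenticated, illustration only.
    """
    out = bytearray()
    for offset in range(0, len(data), 32):
        counter = (offset // 32).to_bytes(8, "big")
        block = hashlib.sha256(key + nonce + counter).digest()
        chunk = data[offset:offset + 32]
        out.extend(b ^ k for b, k in zip(chunk, block))
    return bytes(out)


def toy_encrypt(key: bytes, plaintext: bytes) -> bytes:
    nonce = secrets.token_bytes(16)  # fresh nonce per message
    return nonce + _keystream_xor(key, nonce, plaintext)


def toy_decrypt(key: bytes, blob: bytes) -> bytes:
    nonce, body = blob[:16], blob[16:]
    return _keystream_xor(key, nonce, body)


class ToyKeyManagementService:
    """Hypothetical in-memory key management service; master keys never leave it."""

    def __init__(self) -> None:
        self._master_keys: dict[str, bytes] = {}

    def create_master_key(self) -> str:
        master_key_id = secrets.token_hex(8)
        self._master_keys[master_key_id] = secrets.token_bytes(32)
        return master_key_id

    def generate_data_key(self, master_key_id: str) -> tuple[bytes, bytes]:
        # Return the key in plain form and in cypher form (encrypted by the master key).
        plain_key = secrets.token_bytes(32)
        cypher_key = toy_encrypt(self._master_keys[master_key_id], plain_key)
        return plain_key, cypher_key

    def decrypt_data_key(self, master_key_id: str, cypher_key: bytes) -> bytes:
        return toy_decrypt(self._master_keys[master_key_id], cypher_key)


# Encrypt: obtain a key, encrypt the data, keep only the cypher key with the data.
kms = ToyKeyManagementService()
master_key_id = kms.create_master_key()
plain_key, cypher_key = kms.generate_data_key(master_key_id)
cypher_data = toy_encrypt(plain_key, b"plain data")
del plain_key  # drop the reference; stands in for erasing the plain key from memory
stored = (cypher_data, cypher_key)

# Decrypt: ask the service to unwrap the cypher key, then decrypt the data.
recovered_key = kms.decrypt_data_key(master_key_id, stored[1])
recovered = toy_decrypt(recovered_key, stored[0])
del recovered_key
```

In this sketch the stored pair contains only cypher data and a cypher key, so a party holding the stored pair but lacking access to the key management service cannot recover the plain data.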
However, should a data breach of cypher data include access to an applicable plain key, then a vandal can decrypt the cypher data. For example, should a file such as a text document be encrypted, a vandal with the applicable plain key can decrypt the entire file and have access to the entire contents of the file. Similarly, if an entire database is encrypted with a single key, and the database consists of cells organized in columns and rows, then a security exploit by a vandal could expose all the cells in the database. On the other hand, if each cell of a database is encrypted with its own key, then the compromise of one key would expose only a single cell rather than the entire database. Thus, there is a need in the traditional cryptographic service environment to decide the granularity of the data to be protected, such as perimeter security of an entire database or interior security of, for example, tables, columns, rows, and/or cells. This decision of granularity is traditionally based on balancing competing requirements of security and performance.
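The granularity trade-off can be made concrete with a small sketch. The function names key_id and exposure, the table name "users", and the column names are hypothetical and chosen only for illustration. For each granularity the sketch reports how many keys must be managed and how many cells a single compromised key would expose.

```python
from collections import Counter
from itertools import product


def key_id(granularity: str, table: str, row: int, column: str) -> tuple:
    """Map a cell to the identifier of the key that would protect it."""
    if granularity == "database":
        return ("db",)
    if granularity == "table":
        return (table,)
    if granularity == "column":
        return (table, "col", column)
    if granularity == "row":
        return (table, "row", row)
    if granularity == "cell":
        return (table, row, column)
    raise ValueError(f"unknown granularity: {granularity}")


def exposure(granularity: str, cells: list) -> tuple:
    """Return (number of keys, worst-case cells exposed by one leaked key)."""
    cells_per_key = Counter(key_id(granularity, *cell) for cell in cells)
    return len(cells_per_key), max(cells_per_key.values())


# A hypothetical table of 3 rows and 2 columns, i.e. 6 cells.
cells = [("users", row, col) for row, col in product(range(3), ["name", "ssn"])]
summary = {g: exposure(g, cells) for g in ("database", "column", "row", "cell")}
# summary == {"database": (1, 6), "column": (2, 3), "row": (3, 2), "cell": (6, 1)}
```

Moving from database-level to cell-level keys shrinks the worst-case exposure from six cells to one, while the number of keys to generate, store, and rotate grows from one to six.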
In a traditional cryptographic service environment, computational performance is degraded by increased security overhead such as additional encryption operations. It is to be understood that sensitive data must be protected, or else an organization may face legal and brand consequences. Yet, if real-world performance is impacted by an encryption strategy, then the customer environment can be degraded, for example by restricting the flow of authorized information, with a resulting market loss in business. In other words, an encryption solution in a traditional environment is not considered successful unless it includes acceptable performance. In a traditional balancing analysis, the encryption of data at a granular level, such as every cell of a database, is considered to impose too large a performance workload and is thus considered an unsuccessful solution.
Accordingly, traditional balancing analysis includes an encryption overhead penalty that applies to the balancing of granular security with the cost of performance workload. For example, one form of an overhead penalty that applies to performance workload is responding to a data breach. In a traditional cryptographic service environment, a response team comprising information technology, security, and forensic professionals is typically available on-call for a rapid response to a breach. Despite the response team's best effort to be swift, a response traditionally takes time due to the numerous decisions that must be made collectively. A response could take a week or more of activities including understanding how an intruder got into the system, analyzing the scope of the breach, and determining the appropriate remediation plan. To aid in its investigation, the response team may collect digital clues left behind by vandals and interview staff to understand the sensitivity of the breached data and ascertain the location of such sensitive data.
With a traditional response, the response team must react to each unique breach and determine what data has been compromised. Such determination includes ascertaining the encryption granularity of the compromised data in order to create a remedial plan to re-encrypt the breached data. One traditional balancing approach to encryption granularity is to encrypt an entire database with a single key. This approach has the benefit of having a low overhead penalty for re-encryption because, after decrypting the database, a single new key can be used to re-encrypt the entire database. However, the low overhead penalty is generally considered unbalanced with respect to security because a vandal who steals the single key would have the opportunity to decrypt and access the entire database that could contain, for example, millions of user profiles.
Nevertheless, strengthening security through increasing data granularity can significantly increase the overhead penalty with respect to responding to a data breach with, for example, re-encryption of breached data. This overhead penalty is represented in part by the effort that a response team must expend in order to determine which of the breached data must be re-encrypted with new keys. Under a more balanced approach, some groups of data may be encrypted, such as encryption at the column level. For example, a column of social security numbers may be encrypted into cyphertext, while a column of cities may be left in plaintext or in clear form that is unencrypted. While initially encrypting a group of columns with respective encryption keys strengthens security with an initially low performance overhead, the manual and technical effort to determine which groups of data should be re-encrypted after a breach is significant. Furthermore, after this determination is made, the onerous task of implementing re-encryption processes to re-encrypt selected groups of data further adds to the performance overhead of data re-encryption, which increases the breach response time.
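The column-level re-encryption described above can be sketched as follows. This is a hedged illustration of the traditional procedure, not a definitive implementation: the names toy_cipher, crypt_column, and reencrypt_breached, the column names, and the sample values are all hypothetical, and the cipher is a toy SHA-256 keystream XOR used in place of a real authenticated cipher.

```python
import hashlib
import secrets


def toy_cipher(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256-derived keystream; applying it twice restores the input.

    Toy stand-in for a real cipher: unauthenticated, illustration only.
    """
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


def crypt_column(name: str, values: list, key: bytes) -> list:
    # The same call encrypts and decrypts; each cell gets its own nonce.
    return [toy_cipher(key, f"{name}:{i}".encode(), v) for i, v in enumerate(values)]


# Hypothetical table: two encrypted columns, one column left in clear form.
columns = {
    "ssn": [b"123-45-6789", b"987-65-4321"],
    "email": [b"a@example.com", b"b@example.com"],
    "city": [b"Austin", b"Boston"],
}
keys = {name: secrets.token_bytes(32) for name in ("ssn", "email")}
store = {
    name: crypt_column(name, values, keys[name]) if name in keys else list(values)
    for name, values in columns.items()
}


def reencrypt_breached(store: dict, keys: dict, breached: set) -> list:
    """Rotate the key of every encrypted column touched by the breach."""
    rotated = []
    for name in sorted(breached):
        if name not in keys:
            continue  # plaintext column: there is no key to rotate
        plain = crypt_column(name, store[name], keys[name])  # decrypt with the old key
        keys[name] = secrets.token_bytes(32)  # issue a new key
        store[name] = crypt_column(name, plain, keys[name])  # re-encrypt
        rotated.append(name)
    return rotated


old_ssn, old_email = list(store["ssn"]), list(store["email"])
rotated = reencrypt_breached(store, keys, {"ssn", "city"})
```

The code itself is short; the costly step in practice is producing the breached set, which in this sketch is simply given as input but traditionally requires the manual investigation described above.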
Consequently, there is a long-standing technical problem in the data management arts in the form of a need to provide more effective data re-encryption procedures that address computational performance requirements so that the re-encryption of granular data can scale.