Industries such as finance, healthcare, and retail generally maintain large amounts of data. Because of this volume, the electronic file systems used in these industries are typically distributed file systems that store data on multiple storage arrays located on remote storage systems separate from the application servers. These distributed file systems typically employ data encryption to protect sensitive information from unauthorized users and to comply with various regulatory requirements.
Current methods used to encrypt data on electronic file systems include application-level encryption, controller-managed encryption, and self-encrypting storage drives. Application-level encryption is the process of encrypting data at the application level or virtual machine level before writing the data to the storage system. One benefit of encrypting at the application level is that the data is encrypted before it is sent over the network to the storage devices, which prevents unauthorized access to the data while it travels from the application server to the storage device. However, storing application-level encrypted data requires a large amount of storage space because techniques such as compression and deduplication are difficult to apply. Encryption generally makes data appear random, so compression is ineffective on the encrypted data. Encryption also makes it unlikely that two identical units of data remain identical in encrypted form, which makes deduplication of the encrypted data difficult as well.
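The effect of encryption on compression and deduplication described above can be illustrated with a minimal sketch. It assumes a toy stream cipher built from SHA-256 in counter mode, which stands in for a real cipher and is not secure. The helper name toy_encrypt, the sample record, and the use of SHA-256 digests as deduplication fingerprints are all hypothetical choices made for illustration.

```python
import hashlib
import os
import zlib


def toy_encrypt(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter-mode keystream.

    Illustration only: a stand-in for a real cipher, not secure.
    Applying it twice with the same key and nonce restores the input.
    """
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


key = os.urandom(32)
block = b"patient-record: normal\n" * 200  # highly repetitive plaintext

# Compression: compare compressed size to original size for both forms.
nonce_a = os.urandom(16)
cipher_a = toy_encrypt(key, nonce_a, block)
plain_ratio = len(zlib.compress(block)) / len(block)
cipher_ratio = len(zlib.compress(cipher_a)) / len(cipher_a)

# Deduplication: two copies of the same plaintext share a fingerprint,
# but each copy encrypted under a fresh nonce yields a different ciphertext.
cipher_b = toy_encrypt(key, os.urandom(16), block)
plain_dupes = hashlib.sha256(block).digest() == hashlib.sha256(block).digest()
cipher_dupes = hashlib.sha256(cipher_a).digest() == hashlib.sha256(cipher_b).digest()

print(f"plaintext compression ratio:  {plain_ratio:.3f}")
print(f"ciphertext compression ratio: {cipher_ratio:.3f}")
print(f"plaintext copies match:  {plain_dupes}")
print(f"ciphertext copies match: {cipher_dupes}")
```

In this sketch the repetitive plaintext compresses to a small fraction of its size while the ciphertext does not shrink, and a fingerprint-based deduplicator would treat the two encrypted copies as unrelated blocks.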
Self-encrypting storage drives are storage devices that encrypt data as they receive it. One advantage of using self-encrypting drives is that the storage controller can reduce the size of the data through compression and deduplication before storing the data in the self-encrypting drives. However, a major disadvantage of using self-encrypting drives is that the data sent over the network from the application server to the self-encrypting drive is not encrypted and may be susceptible to unauthorized access while travelling over the network. Additionally, users of self-encrypting drives are limited to the encryption capability of the drive itself. A user who wishes to update or change an encryption strategy may be forced to buy new hardware that supports the changed strategy.
Controller-managed encryption is encryption that is performed by the storage controller. For example, a request to write data is sent from the application server to the storage controller. The storage controller then determines on which storage device to store the data and encrypts the data before sending it to that storage device. This technique gives users the flexibility to change or upgrade storage devices without being constrained by the encryption limits of the storage devices, as is the case with self-encrypting drives. However, a security risk still exists when receiving data from the application servers because the data sent from the application servers to the storage controller is not encrypted. Therefore, an efficient approach to encrypting data that protects the data during transit between application servers and storage systems and that may benefit from compression and deduplication techniques is desired.
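The controller-managed write path described above can be sketched as follows. This is an illustration under stated assumptions, not an implementation of any particular storage controller. The StorageController class, its device-selection rule (a hash of a hypothetical block identifier), the in-memory dictionaries standing in for storage devices, and the toy_encrypt helper (an insecure SHA-256 keystream used in place of a real cipher) are all invented for this example.

```python
import hashlib
import os


def toy_encrypt(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with a SHA-256 counter-mode keystream (illustration only, not secure)."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


class StorageController:
    """Hypothetical controller that encrypts data before it reaches a device."""

    def __init__(self, key: bytes, device_count: int) -> None:
        self.key = key
        # Each dict stands in for one storage device.
        self.devices = [dict() for _ in range(device_count)]

    def _device_for(self, block_id: str) -> int:
        # Assumed placement rule: hash the block identifier onto a device index.
        digest = hashlib.sha256(block_id.encode()).digest()
        return int.from_bytes(digest[:4], "big") % len(self.devices)

    def write(self, block_id: str, data: bytes) -> int:
        # The data argument arrives unencrypted, which mirrors the exposure
        # on the hop between application server and controller.
        index = self._device_for(block_id)
        nonce = os.urandom(16)
        self.devices[index][block_id] = (nonce, toy_encrypt(self.key, nonce, data))
        return index

    def read(self, block_id: str) -> bytes:
        index = self._device_for(block_id)
        nonce, ciphertext = self.devices[index][block_id]
        return toy_encrypt(self.key, nonce, ciphertext)


controller = StorageController(os.urandom(32), device_count=4)
payload = b"account=1234;balance=100"
device_index = controller.write("blk-1", payload)
stored_nonce, stored_ciphertext = controller.devices[device_index]["blk-1"]
restored = controller.read("blk-1")
```

Because encryption happens inside the controller, the simulated devices hold only ciphertext and can be swapped without regard to their own encryption features, but the payload passed to write is still plaintext when it arrives.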
The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section.