1. Technical Field
The present invention relates generally to the field of encrypted file storage and, more particularly, to a general-purpose distributed encrypted file system using centralized keystores and extended attributes.
2. Description of the Related Art
In an enterprise environment having a centralized encrypted file system (EFS), it is desirable to provide distributed access to the EFS to all clients connected to the server hosting the EFS. Generally, a distributed EFS may be implemented using either a stackable EFS or an embedded EFS.
A stackable EFS can be mounted on top of any file system to provide distributed access to the EFS across a network. A stackable EFS provides strong encryption using a cipher-block chaining (CBC) cryptographic mode. Cryptographic modes are used in conjunction with cryptographic algorithms such as the Advanced Encryption Standard (AES). Cryptographic modes, referred to simply as modes from this point forward, greatly increase the overall strength of the cryptographic algorithm.
In CBC, the file is encrypted block-wise starting with an initialization vector (IV), which may be a 128-bit number. To simplify this description, assume a 128-bit encryption key. The encryption process starts with an XOR of the IV and the encryption key. The result of the XOR is used to encrypt the first 128 bits of plain text. The IV may be a well-known or agreed-upon number such as zero or, to strengthen the encryption, the IV may be a random number or the hash of the previous block. However, if the IV is not discoverable or is out of sync with the encrypted data, it will not be possible to decrypt the data.
The plain-text file is encrypted sequentially, 128 bits at a time. Each 128-bit block of plain text is encrypted and then XORed with the encryption key. The result of the XOR is then used as the new key to encrypt the next 128 bits of plain text. CBC mode essentially provides each 128 bits of plain text with its own unique encryption key. However, the decryption of any 128-bit block relies on the successful and completely accurate decryption of the previous 128-bit block. If the preceding 128 bits cannot be decrypted accurately to the bit, then none of the remaining bits in the file, or none up to the next IV reset, can be decrypted.
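For illustration only, the chaining idea can be sketched in a few lines of Python. This is a sketch of textbook CBC, in which the chaining value is XORed into each plain-text block before encryption under a fixed key, so it is a simplification relative to the key-chaining wording above. The block cipher is a toy Feistel construction built on SHA-256 that stands in for AES and is not secure. All function names here are hypothetical.

```python
import hashlib

BLOCK = 16  # 128-bit blocks, matching the assumption in the text


def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def _f(key: bytes, rnd: int, half: bytes) -> bytes:
    # Toy Feistel round function (assumption: stand-in for AES, NOT secure).
    return hashlib.sha256(key + bytes([rnd]) + half).digest()[:BLOCK // 2]


def toy_encrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for rnd in range(4):
        left, right = right, xor(left, _f(key, rnd, right))
    return left + right


def toy_decrypt_block(key: bytes, block: bytes) -> bytes:
    left, right = block[:8], block[8:]
    for rnd in reversed(range(4)):
        left, right = xor(right, _f(key, rnd, left)), left
    return left + right


def cbc_encrypt(key: bytes, iv: bytes, plaintext: bytes) -> bytes:
    """Encrypt block by block, chaining each block to the previous ciphertext."""
    assert len(plaintext) % BLOCK == 0, "sketch handles whole blocks only"
    prev, out = iv, []
    for i in range(0, len(plaintext), BLOCK):
        prev = toy_encrypt_block(key, xor(plaintext[i:i + BLOCK], prev))
        out.append(prev)
    return b"".join(out)


def cbc_decrypt(key: bytes, iv: bytes, ciphertext: bytes) -> bytes:
    """Reverse the chain; requires the same IV that was used to encrypt."""
    prev, out = iv, []
    for i in range(0, len(ciphertext), BLOCK):
        block = ciphertext[i:i + BLOCK]
        out.append(xor(toy_decrypt_block(key, block), prev))
        prev = block
    return b"".join(out)
```

With a matching key and IV, `cbc_decrypt(key, iv, cbc_encrypt(key, iv, data))` returns `data`. Because of the chaining, two identical plain-text blocks encrypt to different ciphertext blocks, and decrypting with an IV that does not match the one used for encryption does not reproduce the original data.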
The initialization vector and encryption keys for each block of plain text are stored as crypto metadata. As shown in FIG. 3A, each data file block is associated with crypto metadata, which may comprise a hash block. In the event of a network or server outage between the time of a data block write and its corresponding crypto metadata block write, data sitting on the disk will be lost through strong encryption and cannot be recovered. This is because the “chaining” of the cipher-block chaining will become out of sync, as illustrated in FIG. 3B, wherein an outage occurred between the writing of data block n (new) and its associated new hash block n. Not only is the particular write that was interrupted by the network or server outage lost, but all remaining data in the file is also lost through strong encryption, because the CBC algorithm can never be re-synchronized.
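The failure window described above can be modeled with a small simulation. The following is a minimal sketch under stated assumptions, not the layout of any particular EFS: the crypto metadata for each data block is modeled as a SHA-256 hash block, and the class `BlockStore` and its `crash_before_meta` flag are hypothetical names introduced only for illustration.

```python
import hashlib


def digest(block: bytes) -> bytes:
    # Assumption: the per-block crypto metadata is modeled as a hash block.
    return hashlib.sha256(block).digest()


class BlockStore:
    """Keeps data blocks and their metadata blocks as two separate writes."""

    def __init__(self, blocks):
        self.data = list(blocks)
        self.meta = [digest(b) for b in blocks]

    def write(self, n: int, new_block: bytes, crash_before_meta: bool = False) -> None:
        self.data[n] = new_block          # step 1: data block n (new) hits the disk
        if crash_before_meta:
            return                        # simulated outage between the two writes
        self.meta[n] = digest(new_block)  # step 2: matching metadata block n

    def in_sync(self, n: int) -> bool:
        """True when metadata block n still corresponds to data block n."""
        return self.meta[n] == digest(self.data[n])
```

In this model, a call such as `store.write(1, new_block, crash_before_meta=True)` leaves data block 1 updated while its metadata still describes the old contents, so `store.in_sync(1)` returns `False`. This is the out-of-sync state illustrated in FIG. 3B.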
This data loss problem is particularly acute in EFS environments. For performance reasons, EFS systems will reset the IVs throughout the file. Large files are logically divided into data blocks, each synchronized with its own IV. This means that when a file is edited and data is replaced, the entire remaining file does not need to be re-encrypted to accommodate the chaining requirements of CBC. Only the data within the logical data block up to the IV reset needs to be re-encrypted. This avoids the performance cost of re-encrypting the entire file, but multiplies the synchronization requirement, because every IV must remain aligned with its data block.
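The performance benefit of IV resets can be made concrete with a short calculation. The sketch below assumes an in-place edit at a single byte offset, a 16-byte (128-bit) cipher block, and logical data blocks of a fixed size; the function name `reencrypt_span` and the sizes used in the example are hypothetical.

```python
def reencrypt_span(edit_offset: int, file_size: int,
                   logical_block_size: int, cipher_block_size: int = 16):
    """Return the (start, end) byte range that must be re-encrypted after an edit.

    Re-encryption starts at the cipher block containing the edit and runs to
    the next IV reset, i.e. the end of the enclosing logical data block.
    """
    start = (edit_offset // cipher_block_size) * cipher_block_size
    next_reset = (edit_offset // logical_block_size + 1) * logical_block_size
    return start, min(next_reset, file_size)


file_size = 1024 * 1024  # a 1 MiB file (illustrative value)

# IV reset every 4096 bytes: only the tail of one logical block is affected.
with_resets = reencrypt_span(5000, file_size, logical_block_size=4096)

# No IV resets: the whole file is one chain, so everything after the edit is affected.
without_resets = reencrypt_span(5000, file_size, logical_block_size=file_size)
```

For an edit at byte 5000, `with_resets` is `(4992, 8192)`, which is 3,200 bytes, whereas `without_resets` is `(4992, 1048576)`, which is nearly the entire file. The trade-off noted above is that a 1 MiB file with 4096-byte logical blocks now carries 256 IVs, each of which must stay aligned with its data.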
The data loss problem associated with stackable EFS can be avoided by using embedded EFS technology. However, embedded EFS cannot simply be mounted on any native file system. Instead, the entire file system must be modified substantially.