This invention relates to methods of encrypting data and to systems for storing and retrieving encrypted data, and in particular to a method of creating and storing keys for use in encrypting data.
It is known to encrypt data using various encryption algorithms such as the block cipher known as Advanced Encryption Standard (AES). Such algorithms apply a series of steps to the data to encrypt them using a key to determine the manner of the steps or the order in which the steps are undertaken. With a symmetric key system the same key is then used to decrypt the data. It is also known to have asymmetric key (public-key) systems whereby a different key is used to encrypt and to decrypt.
On a basic level a key could be used, for instance, to refer to a series of different substitution ciphers. A substitution cipher is one in which each letter in the original unencrypted text, often called ‘plaintext’, is replaced by a different letter or number in a pre-determined manner. It is possible to use several different substitution ciphers. If each different substitution cipher is given a letter as an identifier, a key word of several letters can then specify which cipher should be used, and in which order, on the plaintext to create encrypted text, often known as ‘ciphertext’. With such a system part of the security of the encryption lies in the secrecy of the substitution ciphers as well as of the key itself. Today it has become common to use publicly available algorithms such as AES, whereby the security lies entirely in keeping the key secret since the algorithm itself is widely available. AES is frequently used due to its compliance with Federal Information Processing Standards.
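By way of illustration only, the key word scheme described above can be sketched as follows. The sketch assumes, purely for simplicity, that the cipher identified by each letter is an alphabet shift by that letter's position (so the whole scheme reduces to the classical Vigenère cipher); the function names make_cipher, encrypt and decrypt are hypothetical and are not taken from any particular system.

```python
import string

ALPHABET = string.ascii_uppercase

def make_cipher(identifier):
    """Return the substitution table identified by one key letter.

    Assumption for illustration: the table is a simple alphabet shift
    by the position of the identifying letter (A = 0, B = 1, ...).
    """
    shift = ALPHABET.index(identifier)
    return {p: ALPHABET[(i + shift) % 26] for i, p in enumerate(ALPHABET)}

def encrypt(plaintext, key_word):
    """Apply the ciphers named by key_word, in order, cycling as needed.

    Only upper-case letters A-Z are handled in this sketch.
    """
    tables = [make_cipher(k) for k in key_word]
    return "".join(tables[i % len(tables)][ch] for i, ch in enumerate(plaintext))

def decrypt(ciphertext, key_word):
    """Invert each substitution table and apply the inverses in the same order."""
    inverses = [{c: p for p, c in make_cipher(k).items()} for k in key_word]
    return "".join(inverses[i % len(inverses)][ch] for i, ch in enumerate(ciphertext))

ciphertext = encrypt("ATTACKATDAWN", "LEMON")
print(ciphertext)                    # LXFOPVEFRNHR
print(decrypt(ciphertext, "LEMON"))  # ATTACKATDAWN
```

In this sketch the substitution tables are trivially guessable, so all of the security rests on the key word; in the scheme described above the tables themselves would also be kept secret.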
Provided the key is kept secret, such algorithms can be effective in encrypting one particular large set of data. There are various methods of attempting to decrypt the data without knowledge of the key, sometimes called a “hack” or “cryptanalysis attack”. One method is a so-called brute force attack, whereby each possible key is tried in turn. Provided the key is sufficiently long, the computing power needed to perform such a brute force attack makes it unworkable. For this reason it is common to use AES with at least a 128-bit key.
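The effect of key length on a brute force attack can be illustrated with a rough calculation. The following sketch assumes a hypothetical attacker able to test 10^12 keys per second; that rate, and the function name years_to_exhaust, are assumptions chosen for illustration and do not describe any real hardware.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

def years_to_exhaust(key_bits, keys_per_second):
    """Years needed to try every key of the given length at the given rate."""
    return (2 ** key_bits) / keys_per_second / SECONDS_PER_YEAR

ASSUMED_RATE = 1e12  # hypothetical: keys tested per second

for bits in (40, 56, 128):
    years = years_to_exhaust(bits, ASSUMED_RATE)
    print(f"{bits}-bit key: about {years:.3g} years to try every key")
```

Under this assumed rate a 40-bit key space is exhausted in about a second, whereas a 128-bit key space would take on the order of 10^19 years.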
However, all such systems suffer from the difficulty that once the key is known all of the data can be decrypted. This is particularly problematic with symmetric key systems if several people are required to encrypt data, since all of these people have access to the key, which can also be used to decrypt. In asymmetric key systems the encrypting parties will not have knowledge of the whole of the decrypting key, but any compromise of the secrecy of the decrypting key will leave the system very vulnerable.
Additionally, where such systems are used to repeatedly encrypt relatively small pieces of data using the same algorithm and key, the system can be compromised by similarities in the changes from plaintext to ciphertext. Data in relational databases is often created, entered and stored at the so-called “atomic level” rather than at a macroscopic level, and in data warehouses creating and storing data at the most elemental level possible is often considered an important design philosophy to maximise flexibility. To maintain this flexibility, data also tends to be encrypted at the “atomic level”, so that small pieces of data are individually and independently encrypted.
For example, a database can be created with a large number of customers. Each new customer will require a new database entry to be created. When the entry is created, the encryption algorithm is applied using the key. If the same data is entered in the same location and the same key is used, the result will be the same as for a previous entry, and so the same entry will appear in the ciphertext. This can allow third parties attacking the security system to look for these repeated patterns in an attempt to break the encryption. If a third party is in a position to supply new customer details for encryption, then every time they place a chosen word, such as Jones, in a particular place, such as the surname column, of a new entry, a corresponding entry will be made in the ciphertext. The third party can then look for the repeated pattern corresponding to Jones and from this find every entry for a person named Jones previously placed into the system. This can be used as a first line of attack to decrypt the whole system, or could be used just to gain specific limited information.
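The repeated pattern described above can be demonstrated with a minimal sketch. The function toy_encrypt below is a hypothetical and deliberately insecure stand-in for a deterministic cipher; it is not AES, and it is used only because it needs nothing beyond the standard library. The matching step would work in the same way for any scheme in which the same key, location and plaintext always give the same ciphertext.

```python
import hashlib

def toy_encrypt(key, column, value):
    """Toy deterministic cipher (NOT AES, not secure; illustration only).

    XORs the value with a keystream derived from the key and the column
    name, so the same value in the same column always encrypts identically.
    """
    data = value.encode("utf-8")
    keystream = hashlib.sha256(key + column.encode("utf-8")).digest()
    if len(data) > len(keystream):
        raise ValueError("value too long for this toy cipher")
    return bytes(b ^ k for b, k in zip(data, keystream))

# The key is known only to the database system, never to the attacker.
key = b"key held by the database system"

# Existing encrypted surname column, as an attacker might see it.
stored = [toy_encrypt(key, "surname", name)
          for name in ("Smith", "Jones", "Patel", "Jones")]

# The attacker registers a new customer named Jones; the system encrypts
# the entry on the attacker's behalf and the attacker sees the ciphertext.
probe = toy_encrypt(key, "surname", "Jones")

# Without knowing the key, the attacker finds every earlier Jones entry.
matches = [i for i, c in enumerate(stored) if c == probe]
print(matches)  # [1, 3]
```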
For example, in a database of health records each person's medical history is deemed highly confidential. By finding the encrypted form, or by entering data to find the encrypted form, of a particular common first name, surname and health condition, it may be possible to find that particular confidential information.