When computer systems were composed of a mainframe central processing unit (CPU) and a number of dumb terminals, data file protection consisted of protecting against unauthorized access to the CPU, since all sensitive information resided in CPU memory. With the introduction of the personal computer (PC), a migration to local computing through the use of centralized host/server systems began. Again, the conventional wisdom was that sensitive information could be protected by guarding against unauthorized access to the host/server system.
Over the past few years, both desktop and laptop PCs have rapidly increased in computing power and in local storage capacity, the latter due to the falling cost per megabyte of hard disk memory. The mobility of PCs through use of cellular as well as cable networks, the shift from centralized host/server systems to distributed systems, and the interconnection of LANs (local area networks), WANs (wide area networks), and the Internet have further exacerbated the problem of protecting sensitive information in such a decentralized environment.
The most widely accepted method of protecting information stored in a computer system or communicated over networks is data encryption. Data encryption technology falls into two types: symmetric and asymmetric. An example of a symmetric encryption algorithm is provided in the Data Encryption Standard, FIPS PUB 46-2; DATA ENCRYPTION STANDARD (DES), Dec. 30, 1993. The RSA encryption technology, named for its inventors, Rivest, Shamir, and Adleman, is an example of asymmetric or public key encryption.
Symmetric encryption uses the same key to both encrypt and decrypt an information file. Asymmetric encryption uses two keys which share a relationship such that information encrypted with one key can be decrypted only with the second key. Symmetric encryption is much faster than asymmetric encryption, and is therefore better suited for bulk encryption of data files.
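The difference between the two key arrangements can be sketched in a few lines of Python. This is a toy illustration only, and neither scheme below is secure: the symmetric side uses a repeating-key XOR as a stand-in for a real cipher such as DES, and the asymmetric side uses textbook RSA with tiny primes. The names `xor_cipher` and `rsa_apply`, the primes, and the sample values are all hypothetical choices made for this sketch.

```python
# Toy illustration only: neither scheme below is secure.

def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Symmetric stand-in: the same key and the same function encrypt and decrypt."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

# Textbook RSA with tiny, hypothetical primes (real keys use very large primes).
P, Q = 61, 53
N = P * Q                    # modulus shared by both keys
PHI = (P - 1) * (Q - 1)
E = 17                       # first key (public exponent)
D = pow(E, -1, PHI)          # second key (private exponent); needs Python 3.8+

def rsa_apply(value: int, exponent: int) -> int:
    """Asymmetric stand-in: apply one key of the pair to an integer below N."""
    return pow(value, exponent, N)

# Symmetric: one shared key is used in both directions.
message = b"sensitive"
shared_key = b"k3y"
ciphertext = xor_cipher(message, shared_key)
recovered = xor_cipher(ciphertext, shared_key)

# Asymmetric: what is encrypted with E is recovered with D.
m = 65
c = rsa_apply(m, E)
m_back = rsa_apply(c, D)

print(recovered == message, m_back == m)
```

The asymmetric path relies on modular exponentiation with large numbers, which is one reason it is slower than symmetric encryption for bulk data.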
Encryption algorithms are characterized as being either reversible or irreversible. Symmetric and asymmetric encryption algorithms are reversible. A reversible algorithm is one where data is recoverable from its encrypted state back to its cleartext state. An example of an irreversible algorithm is the secure hash algorithm as defined in FIPS PUB 180-1, SECURE HASH STANDARD (SHS), Apr. 17, 1995. Secure hash algorithms were originally used to detect alterations to an information file, whether intentional or unintentional. It is not surprising, therefore, that the output of the algorithm is called a message integrity code (MIC) or message digest (MD). Another characteristic of hash algorithms is that the output is always the same binary length regardless of the size of the input. Thus, an input having a large binary length may be mapped to an output having a shorter binary length. Further, if only one bit in a message or file is changed, approximately 50% of the bits in the output change. There is no known relationship between the input and output of a hash algorithm that may be used to recover the input from the output. Thus, even brute force trial-and-error attacks become prohibitive in time and cost.
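The fixed output length and the sensitivity to a single changed bit can be observed directly with SHA-1 as provided by Python's standard `hashlib` module. The following is a minimal sketch; the helper names `sha1_bits`, `flip_first_bit`, and `differing_bits`, and the sample inputs, are assumptions introduced for illustration.

```python
import hashlib

def sha1_bits(data: bytes) -> str:
    """Return the SHA-1 digest of data as a string of '0' and '1' characters."""
    digest = hashlib.sha1(data).digest()
    return "".join(f"{byte:08b}" for byte in digest)

def flip_first_bit(data: bytes) -> bytes:
    """Return a copy of data with only its most significant first bit inverted."""
    return bytes([data[0] ^ 0x80]) + data[1:]

def differing_bits(a: str, b: str) -> int:
    """Count the positions at which two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))

# Inputs of very different sizes map to outputs of the same binary length.
short_input = b"a"
long_input = b"a" * 100_000
print(len(sha1_bits(short_input)), len(sha1_bits(long_input)))

# Changing one input bit changes roughly half of the output bits.
original = b"An information file whose integrity matters."
altered = flip_first_bit(original)
changed = differing_bits(sha1_bits(original), sha1_bits(altered))
print(f"{changed} of {len(sha1_bits(original))} output bits changed")
```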
Encryption keys may in addition be classified as deterministic or non-deterministic. A deterministic encryption key is one which is repeatable each time a specific input is applied to the encryption key generator. Different inputs produce different outputs. A non-deterministic encryption key is one which cannot be repeated by applying the same input to the encryption key generator. For example, a random number generator provides a non-deterministic result.
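The distinction can be illustrated with two hypothetical key generators. In this sketch, which is an assumption made for illustration and not a generator described in any of the cited references, the deterministic generator derives the key by hashing its input, while the non-deterministic generator draws from the operating system's random source and ignores its input.

```python
import hashlib
import secrets

def deterministic_key(generator_input: bytes, length: int = 16) -> bytes:
    """Same input always yields the same key (here: a truncated SHA-256 digest)."""
    return hashlib.sha256(generator_input).digest()[:length]

def non_deterministic_key(generator_input: bytes, length: int = 16) -> bytes:
    """Input is ignored; each call returns fresh random bytes that cannot be repeated."""
    return secrets.token_bytes(length)

same_a = deterministic_key(b"input-1")
same_b = deterministic_key(b"input-1")
other = deterministic_key(b"input-2")

random_a = non_deterministic_key(b"input-1")
random_b = non_deterministic_key(b"input-1")

print(same_a == same_b, same_a == other, random_a == random_b)
```

A deterministic key can be regenerated later from its input, whereas a non-deterministic key must be stored somewhere if the data is ever to be decrypted.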
File encryption methods and systems are disclosed in U.S. Pat. Nos. 5,421,006; 5,065,429; 5,309,516 and 5,495,533. U.S. Pat. No. 5,421,006 discloses the use of an integrity verification system, but does not disclose the generation of a substantially irreversible and deterministic encryption key, the use of many-to-few bit mapping, or the recovery of constant value headers. U.S. Pat. No. 5,065,429 does not disclose the generation of a substantially irreversible and deterministic encryption key, a message integrity code, or the use of many-to-few mapping of bits. U.S. Pat. No. 5,309,516 does not employ file headers, provide for the checking of the integrity of the encrypted files and headers, or use a many-to-few bit mapping in its key generation to frustrate brute force attacks. U.S. Pat. No. 5,495,533 discloses the use of file headers, file trailers, and a message authentication check field in the header to protect against any modifications to the header fields. The patent does not disclose the use of the file header in the generation of an encryption key, the generation of a deterministic but non-predictable symmetric encryption key, or the use of file trailers at the end of an encrypted message file to authenticate the encrypted message file header.
General information related to file encryption techniques may be found in "Applied Cryptography", by Bruce Schneier, John Wiley & Sons, Inc., 1996; and "Cryptography: A New Dimension In Computer Data Security", by Meyer and Matyas, John Wiley & Sons, Inc., 1982.
In the present invention, a constant value associated with an information file and a secret E-Key Seed are used to generate a deterministic but non-predictable, pseudo-random, symmetric encryption key which obviates the need for key directories or other key records to recover the key. The key generation method which is used employs two many-to-few bit mappings to make the encryption key highly resistant to brute force trial-and-error attacks, and a secure hash function which produces a message digest of constant binary length (no matter the binary length of the input) to defeat any attempt to discover the inputs necessary to regenerate the key. The information file thereafter is encrypted with the deterministic encryption key, which is destroyed upon use, and the encrypted information file and constant value are concatenated to place the constant value in the header at the beginning of the encrypted information file. The concatenation is operated upon by a secure hash function to produce a message integrity code (MIC). The MIC, a redundant constant value, and a redundant checksum are stored as trailers to the encrypted information file, and are used to verify the integrity of the encrypted information file and file header, and to recover a constant value in the event the encrypted information file has been corrupted.
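The flow summarized above might be sketched as follows. This is a rough illustration under several assumptions, not the method as claimed: the text does not specify the two many-to-few bit mappings, so byte-wise modular addition and truncation are used here as stand-ins; the checksum is assumed to be a CRC-32 over the constant value; the field lengths are arbitrary; and the encryption step itself is omitted, with the ciphertext treated as opaque bytes. All function and constant names are hypothetical.

```python
import hashlib
import zlib

KEY_LEN = 16     # assumed key length in bytes
CONST_LEN = 8    # assumed fixed length of the constant value
MIC_LEN = 20     # SHA-1 digest length in bytes
CRC_LEN = 4      # CRC-32 length in bytes

def derive_key(constant: bytes, seed: bytes) -> bytes:
    """Regenerate the same key from the constant value and the secret seed."""
    # Assumed first many-to-few mapping: fold the constant into the seed
    # by byte-wise modular addition (many input pairs give the same result).
    mixed = bytes((seed[i] + constant[i % len(constant)]) % 256
                  for i in range(len(seed)))
    # Secure hash: fixed-length digest regardless of input length.
    digest = hashlib.sha1(mixed).digest()
    # Assumed second many-to-few mapping: truncate the digest to the key length.
    return digest[:KEY_LEN]

def package(constant: bytes, ciphertext: bytes) -> bytes:
    """Place the constant in the header, then append MIC and redundant trailers."""
    if len(constant) != CONST_LEN:
        raise ValueError("constant value must be CONST_LEN bytes")
    body = constant + ciphertext                    # header + encrypted file
    mic = hashlib.sha1(body).digest()               # MIC over the concatenation
    checksum = zlib.crc32(constant).to_bytes(CRC_LEN, "big")
    return body + mic + constant + checksum         # trailers follow the body

def verify(packaged: bytes):
    """Return (intact, constant); constant is None if it cannot be recovered."""
    trailer_len = MIC_LEN + CONST_LEN + CRC_LEN
    body, trailer = packaged[:-trailer_len], packaged[-trailer_len:]
    mic = trailer[:MIC_LEN]
    redundant = trailer[MIC_LEN:MIC_LEN + CONST_LEN]
    checksum = trailer[MIC_LEN + CONST_LEN:]
    if hashlib.sha1(body).digest() == mic:
        return True, body[:CONST_LEN]
    # Header or ciphertext was altered: fall back to the redundant constant
    # if its checksum still matches.
    if zlib.crc32(redundant).to_bytes(CRC_LEN, "big") == checksum:
        return False, redundant
    return False, None

constant = b"FILE0001"
seed = b"secret-seed-1234"
key = derive_key(constant, seed)           # no key directory is consulted
packaged = package(constant, b"opaque ciphertext bytes")

corrupted = bytearray(packaged)
corrupted[0] ^= 0xFF                        # damage the header
print(verify(packaged), verify(bytes(corrupted)))
```

Because the key is a function only of the constant value and the seed, recovering the constant from the trailer is enough to regenerate the key even when the header has been damaged.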