The need for sending messages secretly has led to the development of the art and science of encryption, which has been used for millennia. Excellent surveys of the field and the state of the art can be found in Menezes et al. (Handbook of Applied Cryptography, CRC Press, (1996)), Stallings (Cryptography & Network Security: Principles & Practice, Prentice Hall, (1998)), and Stinson (Cryptography: Theory and Practice, CRC Press, (1995)).
The hallmark of a perfect cryptosystem is the fundamental property called Perfect Secrecy. Informally, this property means that for every input data stream, the probability of yielding any given output data stream is the same, and independent of the input. Consequently, there is no statistical information in the output data stream or ciphertext, about the identity and distribution of the input data or plaintext.
The problem of attaining Perfect Secrecy was originally formalized by C. Shannon in 1949 (see Shannon (Communication Theory of Secrecy Systems, Bell System Technical Journal, 28:656-715, (1949))), who showed that if a cryptosystem possesses Perfect Secrecy, then the secret key must be at least as long as the plaintext. This restriction makes Perfect Secrecy impractical in real-life cryptosystems. An example of a system providing Perfect Secrecy is the Vernam One-time Pad.
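The key-length restriction can be seen in a minimal sketch of a one-time pad over bytes. The function names `otp_encrypt` and `otp_decrypt` and the sample message are illustrative assumptions, not part of the cited works; the sketch only shows that the key must cover every plaintext symbol and that XOR with the same key recovers the plaintext.

```python
import secrets


def otp_encrypt(plaintext: bytes, key: bytes) -> bytes:
    """XOR each plaintext byte with the corresponding key byte."""
    # The key must be at least as long as the plaintext (Shannon's bound).
    if len(key) < len(plaintext):
        raise ValueError("key must be at least as long as the plaintext")
    return bytes(p ^ k for p, k in zip(plaintext, key))


# XOR is its own inverse, so decryption is the same operation.
otp_decrypt = otp_encrypt

message = b"ATTACK AT DAWN"                # hypothetical sample plaintext
key = secrets.token_bytes(len(message))    # fresh random key, used only once
ciphertext = otp_encrypt(message, key)
recovered = otp_decrypt(ciphertext, key)
```

The key here is exactly as long as the message and must never be reused, which is what makes the scheme impractical for large volumes of data.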
Two related open problems in the fields of data encoding and cryptography are:
1. Optimizing the Output Probabilities
Numerous schemes have been devised for data compression/encoding. The problem of obtaining arbitrary encodings of the output symbols has been studied by researchers for at least five decades. Many encoding algorithms (such as those of Huffman, Fano, and Shannon, arithmetic coding, and others) have been developed using different statistical and structure models (e.g. dictionary structures, higher-order statistical models and others). They are all intended to compress data, but their major drawback is that they cannot control the probabilities of the output symbols. Surveys of the field are found in Hankerson et al. (Introduction to Information Theory and Data Compression, CRC Press, (1998)), Sayood (Introduction to Data Compression, Morgan Kaufmann, 2nd. edition, (2000)), and Witten et al. (Managing Gigabytes: Compressing and Indexing Documents and Images, Morgan Kaufmann, 2nd. edition, (1999)).
Previous schemes have a drawback, namely that once the data compression/encoding method has been specified, the user loses control of the contents and the statistical properties of the compressed/encoded file. In other words, the statistical properties of the output compressed file are outside the control of the user: they take on their values as a consequence of the statistical properties of the uncompressed file and the data compression method in question.
A problem that has been open for many decades (see Hankerson et al. (Introduction to Information Theory and Data Compression, CRC Press, (1998) pp.75-79)) consists of devising a compression scheme which, when applied to a data file, compresses the file and simultaneously makes the file appear to be random noise. This problem will be referred to herein as the Distribution Optimizing Data Compression (or “DODC”) problem or, in the more general context of encoding rather than just compressing the plaintext, as the Distribution Optimizing Data Encoding (or “DODE”) problem. The formal definition of this problem is found in Appendix A. If the input alphabet is binary, the input probabilities of ‘0’ and ‘1’ can be arbitrary, and are fully dictated by the plaintext. The problem of the user specifying the output probabilities of ‘0’ and ‘1’ in the compressed file has been considered an open problem. Indeed, if the user could impose the stringent constraint that the output probabilities of ‘0’ and ‘1’ be arbitrarily close to 0.5, the consequences would be far-reaching, resulting in the erasure of statistical information.
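As an illustration of this loss of control, the following sketch builds a conventional static Huffman code and measures the fraction of ‘0’ bits in the encoded output. The helper names `huffman_code` and `fraction_of_zeros`, the tie-breaking rule, and the two sample inputs are assumptions made for the example; the point is only that the output bit probabilities follow from the input statistics and the coding method, not from any choice made by the user.

```python
import heapq
from collections import Counter


def huffman_code(text: str) -> dict:
    """Build a binary Huffman code {symbol: codeword} for the symbols of text."""
    freq = Counter(text)
    if len(freq) == 1:
        # Degenerate case: a single distinct symbol gets the codeword "0".
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, tie_breaker, partial_code); the integer
    # tie_breaker guarantees that the dictionaries are never compared.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(freq.items()))]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w0, _, code0 = heapq.heappop(heap)   # lightest subtree gets prefix "0"
        w1, _, code1 = heapq.heappop(heap)   # next lightest gets prefix "1"
        merged = {s: "0" + c for s, c in code0.items()}
        merged.update({s: "1" + c for s, c in code1.items()})
        heapq.heappush(heap, (w0 + w1, next_id, merged))
        next_id += 1
    return heap[0][2]


def fraction_of_zeros(text: str) -> float:
    """Encode text with its own Huffman code and return the share of '0' bits."""
    code = huffman_code(text)
    bits = "".join(code[s] for s in text)
    return bits.count("0") / len(bits)


skewed = fraction_of_zeros("a" * 8 + "b" + "c")   # heavily skewed source
uniform = fraction_of_zeros("aabbccdd")           # uniform source
print(skewed, uniform)
```

The two sources yield different proportions of ‘0’ bits in the output, and nothing in the encoder lets the user request a particular proportion.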
2. Achieving Statistical Perfect Secrecy
The problem of erasing the statistical distribution from the input data stream and therefore the output data stream, has fundamental significance in cryptographic applications. It is well known that any good cryptosystem should generate an output that has random characteristics (see Menezes et al. (Handbook of Applied Cryptography, CRC Press, (1996)), Stallings (Cryptography & Network Security: Principles & Practice, Prentice Hall, (1998)), Stinson (Cryptography: Theory and Practice, CRC Press, (1995)), and Shannon (Communication Theory of Secrecy Systems, Bell System Technical Journal, 28:656-715, (1949))).
A fundamental goal in cryptography is to attain Perfect Secrecy (see Stinson (Cryptography: Theory and Practice, CRC Press, (1995))).
Developing a pragmatic encoding system that satisfies this property is a problem that has remained open for many decades. Shannon (see Menezes et al. (Handbook of Applied Cryptography, CRC Press, (1996)), Stallings (Cryptography & Network Security: Principles & Practice, Prentice Hall, (1998)), Stinson (Cryptography: Theory and Practice, CRC Press, (1995)), and Shannon (Communication Theory of Secrecy Systems, Bell System Technical Journal, 28:656-715, (1949))) showed that if a cryptosystem possesses Perfect Secrecy, then the secret key must be at least as long as the plaintext. This makes a realistic cryptosystem with Perfect Secrecy impractical, as demonstrated by the Vernam One-time Pad.
Consider a system in which X=x[1] . . . x[M] is the plaintext data stream, where each x[k] is drawn from a plaintext alphabet S={s1, . . . , sm}, and Y=y[1] . . . y[R] is the ciphertext data stream, where each y[k] ∈ A, an output alphabet of cardinality r.
Informally speaking, a system (including cryptosystems, compression systems, and in general, encoding systems) is said to possess Statistical Perfect Secrecy if all its contiguous output sequences of length k are equally likely, for all values of k, independent of X. Thus, a scheme that removes all statistical properties of the input stream also has the property of Statistical Perfect Secrecy. A system possessing Statistical Perfect Secrecy maximizes the entropy of the output computed on a symbol-wise basis.
More formally, a system is said to possess Statistical Perfect Secrecy if for every input X there exists some integer j0 ≥ 0 and an arbitrarily small positive real number δ0 such that for all j > j0,

$$\Pr\left[\, y_{j+1} \ldots y_{j+k} \mid X \,\right] = \frac{1}{r^{k}} \pm \delta_0 \quad \text{for all } k,\ 0 < k < R - j_0 .$$
A system possessing this property can correctly be said to display Statistical Perfect Secrecy. This is because, for all practical purposes and for all finite-length subsequences, the system behaves statistically as if it indeed possessed the stronger property of Perfect Secrecy.
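The definition can be probed empirically with a rough sketch such as the one below. It counts the contiguous length-k blocks of a single output stream and reports the largest deviation of any block frequency from 1/r^k. The function name `max_block_deviation` and the sample stream are assumptions for illustration, and a frequency count over one stream is only a crude stand-in for the conditional probability in the definition, not a proof of the property.

```python
from collections import Counter
from itertools import product


def max_block_deviation(y, alphabet, k):
    """Largest |empirical frequency - 1/r**k| over all length-k blocks of y."""
    r = len(alphabet)
    n_windows = len(y) - k + 1
    if n_windows <= 0:
        raise ValueError("stream is shorter than the block length k")
    # Count every contiguous (overlapping) block of length k.
    counts = Counter(tuple(y[i:i + k]) for i in range(n_windows))
    target = 1 / r ** k
    # Blocks that never occur have count 0 and so deviate by the full target.
    return max(
        abs(counts[block] / n_windows - target)
        for block in product(alphabet, repeat=k)
    )


stream = "01" * 50                                # hypothetical output stream
dev_k1 = max_block_deviation(stream, "01", 1)     # single symbols
dev_k2 = max_block_deviation(stream, "01", 2)     # pairs of symbols
print(dev_k1, dev_k2)
```

The alternating stream has perfectly balanced single-symbol frequencies, yet the blocks "00" and "11" never occur, so it fails the condition for k = 2. This is why the definition requires equal likelihood for all block lengths k and not only for individual symbols.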
It is interesting to note that Statistical Perfect Secrecy is related to the concept of Perfect Secrecy. However, since the property of Statistical Perfect Secrecy can characterize any system and not just a cryptosystem, there is no requirement of relating the size of the key to the size of the input, as required by Shannon's theorem.