1. Field
The present invention relates generally to cryptography and, more specifically, to using linguistic steganography in a cryptographic system.
2. Description
Cryptography is a discipline of mathematics concerned with information security and related issues, such as encryption, authentication, and access control. Early techniques for cryptography include Baudot codes and the Vernam cipher. A cryptographic system is typically used to transform cleartext into ciphertext and vice versa.
Steganography is the art and science of writing hidden messages in such a way that no one apart from the intended recipient knows of the existence of the message. This is in contrast to cryptography, where the existence of the message itself is not disguised, but the content is obscured. Linguistic steganography is the art of using written natural language to conceal secret messages.
Several automated techniques exist to transform ciphertext into text that looks like natural language text while retaining the ability to recover the original ciphertext. This transformation changes the ciphertext so that it does not attract undue attention from, for example, hackers, agencies, or organizations that might want to detect or censor encrypted communications. Although it may be relatively easy to generate a small sample of quality text, it is challenging to generate large texts that are meaningful to a human reader and that appear innocuous.
One system for text steganography is called NICETEXT (available on the Internet at http:\www.nicetext.com\, with “\” replacing “/”) and is described in the paper (publicly available at that web site) entitled “A Practical and Effective Approach to Large-Scale Automated Linguistic Steganography,” by Mark Chapman, George Davida, and Marc Rennhard. The Chapman et al. paper describes two methods. The first method is to use a set of grammatical rules to generate models for output text on-the-fly. The second method is to generate sentence models for output text by parsing known documents. The NICETEXT system largely ignores the grammar approach in favor of static sentence models that are automatically generated from sample text. According to the Chapman et al. paper, the static sentence model requires that the sentence structure be created from known bodies of text. From a security perspective, this is deficient. Hence, better techniques are desired.
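For illustration only, the general idea of a static sentence model may be sketched as follows. This is a toy example written for this description and is not the NICETEXT implementation: the word lists, the single sentence model, and the names WORDS, MODEL, encode, and decode are all hypothetical. Each slot in the model is filled with a word of the required type, and the choice of word carries a few bits of the input, so the receiver can recover the bits by looking each word up again.

```python
# Toy sketch of sentence-model steganography (illustrative assumptions only).
# Every word list and type name below is made up. Each list has 2**k entries,
# so choosing one word from it encodes exactly k bits.
WORDS = {
    "name": ["Alice", "Bob", "Carol", "Dave"],                    # 2 bits
    "verb": ["likes", "sees"],                                    # 1 bit
    "adj":  ["red", "old", "tiny", "quiet"],                      # 2 bits
    "noun": ["cats", "boats", "maps", "trees",
             "books", "hats", "lamps", "coins"],                  # 3 bits
}

# A single static sentence model: the sequence of word types in one sentence.
# With the lists above, one sentence carries 2 + 1 + 2 + 3 = 8 bits.
MODEL = ["name", "verb", "adj", "noun"]

# Reverse lookup: word -> (type, index). Words are unique across all lists.
LOOKUP = {w: (t, i) for t, ws in WORDS.items() for i, w in enumerate(ws)}


def bits_per(word_type: str) -> int:
    """Number of bits encoded by one word of the given type."""
    return len(WORDS[word_type]).bit_length() - 1


def encode(data: bytes) -> str:
    """Turn bytes (e.g. ciphertext) into sentences that follow MODEL."""
    bits = "".join(f"{b:08b}" for b in data)
    sentences = []
    pos = 0
    while pos < len(bits):
        words = []
        for word_type in MODEL:
            k = bits_per(word_type)
            # Pad with zeros if the input runs out in the middle of a sentence.
            chunk = bits[pos:pos + k].ljust(k, "0")
            words.append(WORDS[word_type][int(chunk, 2)])
            pos += k
        sentences.append(" ".join(words) + ".")
    return " ".join(sentences)


def decode(text: str, n_bytes: int) -> bytes:
    """Recover the first n_bytes bytes hidden in text produced by encode."""
    bits = ""
    for word in text.replace(".", "").split():
        word_type, index = LOOKUP[word]
        bits += format(index, f"0{bits_per(word_type)}b")
    bits = bits[: n_bytes * 8]  # drop any padding bits
    return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))
```

In this sketch the receiver must share the same word lists and model and must know the message length; a practical system would need far larger dictionaries and many sentence models for the output to appear natural.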