Steganography is a field of technology concerned with hiding secret messages inside other messages in such a way that a non-authorized person will not suspect at all that the message presented to him contains a secret message hidden therein. In contrast to cryptography, i.e. the encrypting of messages, it is thus possible to obtain efficient protection for secret messages, as a non-authorized person will not suspect at all that a message contains a secret message. With encrypted messages, by contrast, it can easily be noticed that they are encrypted. There are many techniques to “break” encryptions. It is generally accepted in the field that a message encrypted in any manner can be decrypted given sufficiently high expenditure. Thus, the endeavors in cryptography concentrate in particular on making the expenditure for a non-authorized decipherer as high as possible, such that, deterred by the high expenditure, he will refrain from non-authorized decrypting of the encrypted messages. However, under specific circumstances, an expenditure of any degree will be accepted in order to be able to decrypt especially important messages. It is assumed that, for many of the known methods of encryption, there are more intelligent, but less complex, methods of “breaking”. Such efficient “breaking” cannot be excluded for any of the methods known so far. Steganography is a supplement in this respect. Steganography, which originally means hidden writing, tries to hide secret information in a message in such a manner that nobody will suspect at all that secret information is hidden therein. In this event, not even the highest expenditure will be of assistance, since a non-authorized person will not know at all which message contains a secret message, especially when he is supposed to monitor large quantities of messages.
Most recently, there has been a great demand for steganographic techniques, as “email” has found ever increasing use, with applications no longer being confined to the military field. In particular, there is a need in companies to electronically transmit information that is to be kept secret. It is self-evident that no unauthorized person should be able to gain access to such secret business data by tapping a data line, which may e.g. be part of the Internet. Thus, there is a multiplicity of mail programs that encrypt a text prior to mailing. However, as has already been pointed out, there is no safe encryption.
This is why modern steganographic concepts have recently come into existence. One of these steganographic concepts consists in using, in image files, the least significant bit of the pixels for storing the information to be hidden. Such methods are described in detail by Joshua R. Smith et al., “Modulation and Information Hiding in Images”, First International Workshop, Cambridge, UK, May 30 to Jun. 1, 1996, pp. 207–225. Although large amounts of secret information can be hidden in images, this method has the disadvantage that image files in general are very large, so that their transmission via electronic mail takes a relatively long time. Furthermore, frequent transmission of very large files between the same sender and the same receiver is relatively conspicuous, which is contrary to the steganographic idea as such.
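The least-significant-bit concept can be illustrated by the following minimal sketch. It is not taken from the cited publication; it assumes, for illustration only, that the cover image is available as a flat list of 8-bit pixel values, and the function names `embed_lsb` and `extract_lsb` are hypothetical.

```python
def embed_lsb(pixels, payload):
    """Return a copy of `pixels` whose least significant bits carry `payload`.

    `pixels` is assumed to be a flat list of integers in the range 0..255,
    `payload` a bytes object. One payload bit is stored per pixel value.
    """
    # Unpack the payload into single bits, most significant bit first.
    bits = [(byte >> shift) & 1 for byte in payload for shift in range(7, -1, -1)]
    if len(bits) > len(pixels):
        raise ValueError("payload does not fit into the cover pixels")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        # Clear the least significant bit, then set it to the payload bit.
        stego[i] = (stego[i] & ~1) | bit
    return stego


def extract_lsb(pixels, n_bytes):
    """Read `n_bytes` bytes back from the least significant bits of `pixels`."""
    out = bytearray()
    for i in range(n_bytes):
        byte = 0
        for value in pixels[8 * i : 8 * i + 8]:
            byte = (byte << 1) | (value & 1)
        out.append(byte)
    return bytes(out)
```

In this sketch each pixel value changes by at most one, which is why the alteration is hard to see, while eight pixel values are needed per hidden byte, which is why the cover files tend to be large. The receiver is assumed to know the payload length in advance.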
Known methods for hiding information in texts consist in generating specific simple predefined sentence structures, with the grammatical composition of a specific sentence reflecting the information to be hidden, which usually is binary. These methods are described in detail by Peter Wayner, “Disappearing Cryptography”, Academic Press Inc., 1996, pp. 91–121. Such predefined grammars have the disadvantage that a sender and a receiver, if they wish to communicate secret information frequently, constantly send texts having substantially the same contents or only slightly modified meaning, giving rise to the suspicion that secret information is hidden therein.
Known methods of hiding information in texts thus either utilize predefined grammars, which can generate only simple predefined sentence structures, or are based solely on the alteration of control characters, spaces and tabs. Both methods are relatively conspicuous, can be used to a very limited extent only, and offer only a small bandwidth, i.e. the amount of information that can be hidden in a specific text is relatively small. Moreover, they are not robust with respect to minor changes, such as reformatting of the text or slight reformulation thereof. Such methods thus are also relatively unsuited for hand-written notes or passages in print media.
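The lack of robustness of whitespace-based methods can be illustrated by the following minimal sketch. The particular coding, one bit per line with a trailing space for 0 and a trailing tab for 1, is an assumption made for illustration only, and the function names are hypothetical.

```python
def embed_in_whitespace(lines, bits):
    """Hide one bit per line as trailing whitespace (space = 0, tab = 1)."""
    if len(bits) > len(lines):
        raise ValueError("cover text has too few lines for the given bits")
    out = []
    for i, line in enumerate(lines):
        # Remove existing trailing whitespace so that it cannot be misread.
        stripped = line.rstrip(" \t")
        if i < len(bits):
            stripped += "\t" if bits[i] else " "
        out.append(stripped)
    return out


def extract_from_whitespace(lines):
    """Recover the hidden bits from the trailing whitespace of each line."""
    bits = []
    for line in lines:
        if line.endswith("\t"):
            bits.append(1)
        elif line.endswith(" "):
            bits.append(0)
    return bits
```

Under these assumptions the bandwidth is one bit per line only, and any reformatting step that trims trailing whitespace, such as `[line.rstrip() for line in stego]`, removes the hidden information completely.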
In particular, there is a need to distribute secret information to one or more receivers via a newspaper article. It would, however, be particularly conspicuous if a passage in the newspaper suddenly contained text from a predefined grammar that stands out solely by its contents, unless the grammar happens to have been matched to the current events of the day.
The technical publication “Techniques for data hiding”, W. Bender et al., IBM Systems Journal, vol. 35, Nos. 3 and 4, 1996, pp. 313–336, describes various steganographic concepts. Among other things, possibilities of hiding data in a text are shown, comprising a method for hiding information via manipulation of unused space on the printed page, a syntactic method using e.g. the punctuation marks for hiding information, and a semantic method making use of a manipulation of the words themselves for hiding information. In the semantic method, the two synonyms of a pair are assigned a primary value and a secondary value, respectively. Where a word has many synonyms, more than one bit may be coded per synonym. It is deemed problematic in this respect that the desire to hide as much information as possible may conflict with the differences in meaning that still exist between the synonyms. In the syntactic method, the diction and structure of texts are altered without substantially altering the meaning or the tone. This is achieved in that, if there is a grammatical structure comprising a main clause and a subordinate clause, an information bit is hidden in the text by arranging the subordinate clause either in front of or after the main clause. It is deemed problematic in this method that the possibilities of hiding information are limited.
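The semantic method can be illustrated by the following minimal sketch. It is not taken from the cited publication; the synonym table, the function names and the convention that the first word of a pair carries the primary value 0 and the second the secondary value 1 are assumptions made for illustration only.

```python
# Hypothetical synonym table: (primary word -> bit 0, secondary word -> bit 1).
SYNONYM_PAIRS = [("big", "large"), ("quick", "fast"), ("begin", "start")]


def _build_lookup(pairs):
    """Map every word of every pair to the pair it belongs to."""
    lookup = {}
    for primary, secondary in pairs:
        lookup[primary] = (primary, secondary)
        lookup[secondary] = (primary, secondary)
    return lookup


def embed_bits(words, bits, pairs=SYNONYM_PAIRS):
    """Replace substitutable words so that each one encodes the next bit."""
    lookup = _build_lookup(pairs)
    remaining = list(bits)
    out = []
    for word in words:
        pair = lookup.get(word)
        if pair is not None and remaining:
            # Index 0 selects the primary word, index 1 the secondary word.
            out.append(pair[remaining.pop(0)])
        else:
            out.append(word)
    if remaining:
        raise ValueError("cover text has too few substitutable words")
    return out


def extract_bits(words, pairs=SYNONYM_PAIRS):
    """Return one bit for every substitutable word found in the text."""
    lookup = _build_lookup(pairs)
    return [lookup[word].index(word) for word in words if word in lookup]
```

In this sketch the cover text “we begin the big project with a quick review” with the bits 1, 0, 1 becomes “we start the big project with a fast review”. The bandwidth is limited to one bit per substitutable word, and the receiver is assumed to know how many bits were hidden, since every substitutable word in the text yields a bit on extraction.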
EP 0 268 367 relates to a natural language interface which is used for ensuring the semantic correctness of an inquiry. To this end, a text to be analyzed is input into a dictionary analyzer connected to a dictionary of synonyms in order to carry out a synonym substitution, so that a standard vocabulary that is as small as possible can be transferred to a parser connected downstream of the dictionary analyzer. The parser is furthermore connected to a main dictionary and a grammar stage for performing a syntactic analysis of the text input, which possibly contains synonyms.
The output of the parser is fed to a simplification stage which has the effect of increasing the recall, i.e. the number of hits or the number of documents returned for an inquiry. The simplified inquiry in turn is fed to a translation stage coupled with a database management system capable of producing an output that can function as an interface for a user.
U.S. Pat. No. 5,424,947 relates to a device and a method for analyzing natural language and to the construction of a knowledge database for natural language analysis. A sentence is syntactically analyzed by a parser in order to provide the phrase structure thereof, including any existing ambiguity. The phrase structure is fed to a dependency analyzer which produces on the output side a dependency structure without ambiguity. To this end, a knowledge database is accessed which comprises dependency/taxonym/synonym data and context dependency data. The dependency structure without ambiguity is fed to a system for automatic processing of natural language texts, such as a machine translation system.