1. Field of the Invention
The present invention relates to arrangements for the protection of documents against forgery or repudiation. The invention also relates to arrangements for the protection of electronically transmitted messages against forgery or repudiation.
2. State of the Art
It is common nowadays to provide security to documents through the use of holograms, watermarks, personal signatures, notary stamps and other physical means. These all increase the difficulty of making unauthorised imitations or changes; however, they all require physical inspection, often involving forensic equipment and expertise, in order to detect a counterfeit. It is also becoming increasingly necessary to provide security for electronically transmitted messages.
The present invention provides for the security of the text of a document or message by cryptographic techniques.
In accordance with the present invention, there is provided an apparatus which is arranged to process a selected part or selected parts of the text of a document or message to form a hash, the hash usually being of fewer characters than the selected part or parts of the text, the processing comprising retrieving numerical values which define the respective characters of the selected part or parts of the text and making a calculation using the numerical values of the successive characters.
The apparatus may be arranged to receive or create a text in electronic form, then process this text to derive the hash of the selected part or parts of the text. The apparatus may further be arranged to add the hash to the text: typically, the apparatus then outputs the text, with the added hash, either for printing as a document or for electronic transmission. Alternatively the apparatus may be arranged to output the text and the hash separately (or store one and output the other).
The practical value of the hash is that it is sensitive to any change or alteration in the selected part of the text from which it is derived: it is not feasible to make a desired alteration to that part of the text whilst preserving the same hash value.
The hash thus forms a cryptographic signature which makes forgery detectable on the basis of an assessment of the content of the text and without the need for any forensic examination of the document.
The hash algorithm is not applied to the whole text, only to a selected part, or to selected parts. The or each part is identified, or sealed, by predetermined characters or combinations of characters immediately preceding and immediately following it: for example, a series of tilde marks (~) may be used.
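By way of illustration only, the identification of the sealed parts might be sketched as follows. The seal is assumed here to be exactly three tilde marks, and the names `SEAL` and `sealed_parts` are hypothetical; the text above requires only predetermined characters immediately before and after each part.

```python
import re

# Assumed seal: three tilde marks. The specification only says
# "a series of tilde marks", so the exact length is hypothetical.
SEAL = "~~~"

def sealed_parts(text: str, seal: str = SEAL) -> list[str]:
    """Return every part of the text enclosed between two seals."""
    # Non-greedy match so that each opening seal pairs with the next seal.
    pattern = re.escape(seal) + r"(.*?)" + re.escape(seal)
    # DOTALL lets a sealed part span several lines.
    return re.findall(pattern, text, flags=re.DOTALL)

document = "Pay ~~~John Smith 100 pounds~~~ on ~~~1 May~~~."
print(sealed_parts(document))
```

Only the text returned by such a routine would then be passed to the hash calculation; the unsealed remainder of the document can be changed without affecting the hash.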
Preferably the numerical values of the respective characters of the selected text are their ASCII values: the characters preferably include all keystrokes (including space, return etc.); preferably the “alphabet” is restricted to all keystrokes having ASCII values in the range 32 to 125 inclusive and also including ASCII values for the “return”.
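A minimal sketch of the retrieval of numerical values under this restricted alphabet is given below. Two points are assumptions not stated in the text: the return is taken to be either ASCII 13 (carriage return) or ASCII 10 (line feed), and a character outside the alphabet is rejected rather than silently dropped. The names `RETURN_CODES` and `char_values` are hypothetical.

```python
# Assumption: "return" may arrive as carriage return (13) or line feed (10).
RETURN_CODES = {10, 13}

def char_values(text: str) -> list[int]:
    """Map each character to its ASCII value, enforcing the alphabet."""
    values = []
    for ch in text:
        code = ord(ch)
        if 32 <= code <= 125 or code in RETURN_CODES:
            values.append(code)
        else:
            # Assumption: out-of-alphabet characters are treated as an error.
            raise ValueError(f"character {ch!r} (value {code}) is outside the alphabet")
    return values

print(char_values("A b\n"))  # [65, 32, 98, 10]
```

It may be noted that the tilde has ASCII value 126, just outside the stated range, which is consistent with its use as a seal character rather than as part of the sealed text.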
Preferably the processing is recursive, in that the calculation in respect of each character uses the result of the calculation made in respect of at least one previous character.
Preferably the calculations for the first several (e.g. 10) characters use successive ones of a set of initial variables: preferably the calculation for each subsequent character uses, instead of an initial variable, the result of the calculation in respect of a previous character.
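The recursive scheme just described might be sketched as follows. The initial variables, the arithmetic, and the modulus are all hypothetical, chosen only to make the structure visible; the sketch also assumes that the "previous character" used is the one ten positions earlier, which the text does not specify.

```python
# Hypothetical set of 10 initial variables.
INITIAL_VARIABLES = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
MODULUS = 10**8  # hypothetical bound to keep results a fixed size

def recursive_results(values: list[int]) -> list[int]:
    """Compute one result per character value, feeding earlier results forward."""
    results = []
    lag = len(INITIAL_VARIABLES)
    for i, value in enumerate(values):
        if i < lag:
            # The first several characters use successive initial variables.
            previous = INITIAL_VARIABLES[i]
        else:
            # Later characters use the result obtained `lag` characters earlier.
            previous = results[i - lag]
        # Hypothetical calculation combining the carried value and the character.
        results.append((previous * 31 + value) % MODULUS)
    return results

print(recursive_results([1] * 11))
```

Because each result depends on an earlier one, a change to any character propagates into later results.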
Preferably each calculation also uses one of a predetermined set of prime numbers. Preferably each calculation uses an interim result to determine which of these prime numbers is used to complete the calculation.
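One possible reading of this prime-selection step is sketched below: an interim result is formed from the carried value and the character value, and its remainder selects the prime that completes the calculation. The prime set, the interim formula, and the name `step` are hypothetical.

```python
# Hypothetical predetermined set of primes.
PRIMES = [101, 103, 107, 109, 113, 127, 131, 137]
MODULUS = 10**10  # hypothetical bound on each result

def step(previous: int, value: int) -> int:
    """Perform one calculation for a single character value."""
    # Interim result from the carried value and the character value.
    interim = previous * 256 + value
    # The interim result itself determines which prime is used.
    prime = PRIMES[interim % len(PRIMES)]
    # The selected prime completes the calculation.
    return (interim * prime) % MODULUS

print(step(1, 65))  # interim 321 selects PRIMES[1] = 103, giving 33063
```

Letting the data choose the prime means that the arithmetic applied to one character depends on every character that influenced the carried value.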
Preferably the processing involves at least a second pass over the selected part or parts of the text: in other words, once the calculation for the last character is completed, a second series of successive calculations is carried out on the characters, typically starting with the first character, and using the results of the calculations of the first series.
At the end of the above-described processing, the hash is formed by taking selected digits from the results obtained in a final plurality of the calculations: for example the final two digits may be taken from each of the final 10 results, and a 20-digit hash formed by placing these 10 pairs of digits in a given order.
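The following toy sketch puts the two passes and the digit extraction together. It is not the algorithm of the invention and is not cryptographically secure: the constants, the arithmetic, and the name `toy_hash` are hypothetical, and it assumes that each calculation uses the result from ten characters earlier.

```python
# All constants below are hypothetical illustrations.
PRIMES = [101, 103, 107, 109, 113, 127, 131, 137]
INITIAL_VARIABLES = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3]
MODULUS = 10**10

def toy_hash(text: str, passes: int = 2) -> str:
    """Return a 20-digit hash from the last 10 results of a multi-pass run."""
    values = [ord(ch) for ch in text]
    if not values:
        raise ValueError("nothing to hash")
    # `window` always holds the 10 most recent values to be consumed:
    # initial variables at first, then results of earlier calculations.
    window = list(INITIAL_VARIABLES)
    for _ in range(passes):
        # A second pass restarts at the first character but keeps the
        # window, so it builds on the results of the first pass.
        for value in values:
            interim = window[0] * 256 + value
            prime = PRIMES[interim % len(PRIMES)]
            result = (interim * prime) % MODULUS
            window = window[1:] + [result]
    # Take the final two digits of each of the final 10 results.
    return "".join(f"{result % 100:02d}" for result in window)

print(toy_hash("Pay John 100"))
print(toy_hash("Pay John 900"))
```

The two printed hashes differ although the texts differ in a single character. For a text shorter than five characters, the final window of this sketch would still contain unused initial variables, so a practical implementation would need to handle very short inputs differently.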
One form of hash algorithm used in the invention is an Objective Linguistic Hash (OLH). This is linguistic in that it “reads” letters, numbers and other keys commonly used in the preparation of documents. It is objective in that the hash value produced can be verified by anyone using the algorithm. The OLH algorithm produces a final number by acting recursively one character at a time throughout the length of the message.
The variability of the message far exceeds the variability of the final hash, so inevitably many different messages would have the same hash value. However, it is infeasible to make a meaningful change to the message whilst retaining the same hash value.
It will be appreciated that the invention may be incorporated in a word processing apparatus. In this use, a document is created in electronic form on the apparatus, complete with the seal (e.g. series of tilde marks) at the beginning and end of the or each selected part of the text. A “sealing” command is then performed, whereupon the apparatus automatically processes the “sealed” part or parts of the text to create the hash, which is stored with the text. Subsequently, the document can be altered or corrected as necessary, then “re-sealed”, to process the sealed part or parts of the text again and create the hash afresh. Once the document is finalised, it can be printed out, complete with the hash.
The above-mentioned OLH algorithm may be modified to provide a Subjective Linguistic Hash (SLH). This differs from the OLH in that it is made subjective by being “seeded” with secret information known only to an accredited authority: thus, the processing of the selected or “sealed” part or parts of the text is carried out using secret initial variables. Preferably use is made of a seed, in the form of a very large secret number (typically having 50 to 200 digits) known as the Secret Primitive (SP). An algorithm is run, using the SP, to produce the initial variables: preferably this algorithm also uses a number of items of open information, known as Open Primitives (OPs), contained in the document or message being protected. The SLH algorithm may produce a plain hash initially, then encrypt this using the SP as secret key: this preserves the secrecy of the plain hash.
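The derivation of secret initial variables from the SP and the OPs might be sketched as below. The text does not disclose the derivation algorithm, so SHA-256 from the Python standard library is substituted here purely as a stand-in; the function name, the separator, and the example values are hypothetical.

```python
import hashlib

def initial_variables(secret_primitive: int, open_primitives: list[str],
                      count: int = 10) -> list[int]:
    """Derive `count` initial variables from a secret number and open items."""
    # Combine the secret with the open information found in the document.
    material = str(secret_primitive) + "|" + "|".join(open_primitives)
    variables = []
    for index in range(count):
        # Stand-in derivation: SHA-256 over the material plus a counter.
        digest = hashlib.sha256(f"{material}|{index}".encode("utf-8")).digest()
        # Use the first 4 bytes of each digest as one initial variable.
        variables.append(int.from_bytes(digest[:4], "big"))
    return variables

# Hypothetical 50-digit Secret Primitive and example Open Primitives.
sp = int("12345678901234567890123456789012345678901234567890")
ops = ["1 May", "Invoice 42"]
print(initial_variables(sp, ops))
```

Anyone without the SP cannot reproduce these variables, and therefore cannot recompute or forge the resulting hash, whereas the accredited authority can do so from the SP and the open information in the document.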
A further algorithm which can be used in accordance with the invention is a Subjective Encrypted Hash (SEH) algorithm. This involves encrypting an OLH hash, using secret primitive values known only to a witnessing party, together with open primitive values such as date and time. In this case, the witnessing party uses an apparatus into which the OLH of a document or message is keyed, together with the open primitive values, and which encrypts the OLH using the SEH algorithm, to create the SEH hash which is preferably printed on the document, or on a label for application to the document. Preferably the apparatus stores the initial OLH and the final SEH, together with the open primitive values.
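As an illustration only, the encryption of a keyed-in OLH by the witnessing party might be sketched as follows. The SEH algorithm itself is not given in the text; this sketch substitutes a digit-wise addition modulo 10 with a keystream derived by HMAC-SHA256 from the secret and the open primitive values. All names and example values are hypothetical.

```python
import hashlib
import hmac

def _keystream_digits(secret: bytes, open_values: str, length: int) -> list[int]:
    """Derive `length` decimal digits from the secret and the open values."""
    digits = []
    counter = 0
    while len(digits) < length:
        message = f"{open_values}|{counter}".encode("utf-8")
        block = hmac.new(secret, message, hashlib.sha256).digest()
        # Reducing bytes modulo 10 is slightly biased; acceptable for a sketch.
        digits.extend(byte % 10 for byte in block)
        counter += 1
    return digits[:length]

def seh_encrypt(olh: str, secret: bytes, open_values: str) -> str:
    """Encrypt a decimal OLH string digit by digit."""
    stream = _keystream_digits(secret, open_values, len(olh))
    return "".join(str((int(d) + k) % 10) for d, k in zip(olh, stream))

def seh_decrypt(seh: str, secret: bytes, open_values: str) -> str:
    """Recover the OLH from the SEH, given the same secret and open values."""
    stream = _keystream_digits(secret, open_values, len(seh))
    return "".join(str((int(d) - k) % 10) for d, k in zip(seh, stream))

olh = "12345678901234567890"            # hypothetical 20-digit OLH
secret = b"witness-secret-primitive"    # hypothetical secret
stamp = "2001-05-01 14:30"              # open primitive values: date and time
seh = seh_encrypt(olh, secret, stamp)
print(seh, seh_decrypt(seh, secret, stamp) == olh)
```

Storing the OLH, the SEH and the open primitive values together, as described above, would allow the witnessing party to confirm later that a given SEH corresponds to a given document and time.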