1. Field of the Invention
The present invention relates to processor architectures and, in particular, to a data processing circuit and to a method of transferring data, in which the safety against external attacks aimed at spying out data is increased.
2. Description of Related Art
Cryptographic algorithms are generally characterized by the fact that safety-relevant data is processed. Such safety-relevant data is, for example, a private key in an asymmetric cryptography algorithm such as the RSA algorithm. The private key is used to decrypt data that has been encrypted with the corresponding public key. Alternatively, the private key is used to generate a digital signature, which is then verified with the corresponding public key for purposes of authentication.
Processors executing such algorithms, however, not only process data using private or secret keys but typically also hold data relating to persons which has to be protected from attacks, such as personal data or, in the case of a payment card, the balance. The PIN of an ec payment card of course also belongs to such secret data, which must be protected from external attacks without exception if such a cryptographic system is to gain acceptance in the market.
A special field in which cryptographic algorithms are increasingly employed is that of chip cards or safety ICs. In chip cards in particular, a further requirement is that the space for the chip card processor system is limited. The available chip area, which is usually predetermined, must be utilized as well as possible to accommodate a calculating unit, a working memory and a non-volatile memory on the one hand and the peripheral elements belonging to a cryptography processor system, such as a crypto-coprocessor, a random number generator, an input/output port etc., on the other hand.
Well-known attacks on cryptographic systems are the so-called power analysis attacks. Since cryptography processors are typically realized in CMOS technology, such circuits have a strongly inhomogeneous power consumption when no special countermeasures are taken. As is well known, CMOS circuits consume hardly any power when the states on a bus or in a calculating unit do not change. If, however, the states in a calculating unit or on a bus change, a current, which has to be supplied by a power source, flows while the CMOS circuit switches from one state to another. This is particularly true for bus driver circuits, which, especially when the data buses are long, must supply not only the actual switching current of the CMOS circuit but also a current for recharging the line capacitances, which can take on considerable values in such long buses.
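The data dependence of CMOS power consumption described above can be illustrated with a simple toggle-count model. The following is a minimal sketch, assuming that the current drawn in a cycle is roughly proportional to the number of bus lines that change state; the function name `toggles_per_cycle` and the 8-bit default width are assumptions made only for this illustration.

```python
def toggles_per_cycle(bus_words, width=8):
    """Count how many bus lines change state between consecutive bus words.

    Idealized model: each changing line contributes one unit of switching
    current; lines that keep their state contribute nothing.
    """
    mask = (1 << width) - 1
    counts = []
    for prev, curr in zip(bus_words, bus_words[1:]):
        # XOR marks the lines whose state differs between the two cycles.
        counts.append(bin((prev ^ curr) & mask).count("1"))
    return counts


if __name__ == "__main__":
    words = [0x00, 0xFF, 0xFF, 0x0F, 0x0E]
    print(toggles_per_cycle(words))  # [8, 0, 4, 1]
```

In this model the second cycle, in which the bus word does not change, draws no switching current at all, which is the inhomogeneity an attacker can exploit.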
In addition, long number calculating units are employed in cryptography processors for reasons of safety on the one hand and for performance reasons on the other hand. Such long number calculating units sometimes have a data width of, for example, more than 1024 or, more recently, more than 2048 bits. Such a long number calculating unit includes a corresponding number of bit slices, wherein a bit slice, apart from the actual arithmetic unit usually including at least one full adder function, also has register cells for several registers required for executing a cryptographic operation, such as, for example, a modular multiplication.
In DE 3631992 C1, a long number calculating unit is disclosed which includes, as a central element, a long number 3-operand adder for executing the modular exponentiation required for the RSA algorithm. The modular exponentiation is divided into a plurality of modular multiplications which, in turn, are divided into a plurality of 3-operand additions. Using a multiplication look ahead algorithm and a reduction look ahead algorithm coupled thereto, a 3-operand operation results in which an intermediate result, the multiplicand and the modulus, possibly multiplied by shift values and look ahead parameters, are added to yield a new intermediate result.
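The decomposition of a modular exponentiation into modular multiplications, and of each modular multiplication into 3-operand additions, can be sketched as follows. This is a simplified bit-serial illustration and not the look ahead algorithm of DE 3631992 C1: the shift value is fixed at one bit per step, and the reduction factor is obtained by a plain integer division instead of a look ahead estimate. The function names `modmul` and `modexp` are assumptions made for this example.

```python
def modmul(c, m, n):
    """Return (c * m) mod n for 0 <= c < n using 3-operand additions only.

    Each step forms z_new = 2*z + a*c + b*n, where a is the current
    multiplier bit (0 or 1) and b is a reduction factor in {0, -1, -2}.
    """
    if not 0 <= c < n:
        raise ValueError("multiplicand must satisfy 0 <= c < n")
    z = 0
    for i in reversed(range(m.bit_length())):
        a = (m >> i) & 1
        # Simplification: the reduction factor is computed exactly here;
        # a look ahead scheme would estimate it without a full division.
        b = -((2 * z + a * c) // n)
        z = 2 * z + a * c + b * n  # the 3-operand addition
    return z


def modexp(base, exp, n):
    """Return (base ** exp) mod n by left-to-right square and multiply."""
    result = 1 % n
    base %= n
    for i in reversed(range(exp.bit_length())):
        result = modmul(result, result, n)    # squaring step
        if (exp >> i) & 1:
            result = modmul(result, base, n)  # multiplication step
    return result


if __name__ == "__main__":
    print(modexp(4, 13, 497) == pow(4, 13, 497))  # True
```

Because the intermediate result z stays below n after every step, the value 2*z + a*c is always below 3*n, so the reduction factor never needs to be smaller than -2 in this sketch.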
Within a bit slice, there is a so-called slice internal bus connecting the register positions within the bit slice and the slice calculating unit to one another. The bit slices of the calculating unit are connected to one another via a calculating unit internal bus, which usually has a data width of only, for example, eight bits, and to the other elements of the cryptography data processing system via, for example, an external bus.
Considering the fact that a long number calculating unit comprises very many bit slices, this calculating unit internal bus running outside the bit slices is a very long data bus which has a length of several millimeters and which can be recognized on the integrated circuit as a very regular structure. The same applies to the long number calculating unit itself, which comprises one or several stacks of bit slices.
In typical safety ICs the chip area itself is limited. In addition, power consumption plays a considerable role, particularly in contactless applications, in which the chip card has no power supply of its own but draws its power from the surrounding HF field. For the calculating unit internal bus on the one hand and the bit slices on the other hand, this results in the requirements that chip area be saved and that power consumption be kept low.
On the other hand, safety ICs are required to include measures against external attacks such as power attacks, of which simple power analysis (SPA) and differential power analysis (DPA) are the best known. Without such measures, an attacker could trace each switching process on, for example, the calculating unit bus or a slice internal bus by means of a power analysis. Knowing the algorithm executed and other boundary conditions, the attacker would then only have to find out the original state or an intermediate data state in order to record all the data processed and thus determine secret data such as secret keys, PINs, balance amounts etc.
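Why knowledge of a single state suffices once every switching process can be traced may be illustrated as follows. The sketch assumes an idealized attacker who can tell, for each cycle, exactly which lines of an unprotected bus switched; real measurements are noisy and rarely resolve individual lines. The names `observe_toggles` and `reconstruct` are hypothetical.

```python
def observe_toggles(bus_words):
    """Model the leakage: for each cycle, a bit mask of the lines that switched."""
    return [prev ^ curr for prev, curr in zip(bus_words, bus_words[1:])]


def reconstruct(initial_state, toggle_masks):
    """Rebuild the full data sequence from one known state and the toggles."""
    words = [initial_state]
    for mask in toggle_masks:
        # A line that switched now carries the complement of its old value.
        words.append(words[-1] ^ mask)
    return words


if __name__ == "__main__":
    secret = [0x3A, 0xC5, 0xC5, 0x10]
    leaked = observe_toggles(secret)
    print(reconstruct(secret[0], leaked) == secret)  # True
```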
A method that is optimal with regard to safety is to form each data bus, per bit line, no longer as a single data line but as two data lines. This so-called dual rail technology is based on the fact that complementary states are transmitted on the two data lines at any given time. If, for a certain time, a voltage state representing a logic “1” is present on the first dual rail line, the complementary state is present on the second dual rail line, that is, in this example, a voltage state corresponding to a logic “0”. Safety is thereby already increased in that, at each switch from one state to another, both lines switch, so that a power analysis can no longer reveal the direction in which a switch has been performed, since the two switching directions always occur simultaneously.
Although safety has thereby been increased, a power analysis can nevertheless reveal whether or not switches have been performed in subsequent cycles. If there are, for example, five successive logic “1” states, no power consumption is visible in the power profile, so that an attacker still obtains the information that the data on the dual rail bus has not changed during these five cycles.
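The remaining leak of dual rail technology without precharge can be made concrete with a small model of one dual rail bit line. This is a minimal sketch, assuming that each rising or falling edge is visible in the power profile; the names `dual_rail_encode` and `dual_rail_transitions` are chosen for this illustration only.

```python
def dual_rail_encode(bit):
    """Encode a logic bit as complementary states on the two rail lines."""
    return (bit, 1 - bit)


def dual_rail_transitions(bits):
    """Return (rising, falling) edge counts per cycle for one dual rail bit line."""
    states = [dual_rail_encode(b) for b in bits]
    profile = []
    for prev, curr in zip(states, states[1:]):
        rising = sum(1 for p, c in zip(prev, curr) if p == 0 and c == 1)
        falling = sum(1 for p, c in zip(prev, curr) if p == 1 and c == 0)
        profile.append((rising, falling))
    return profile


if __name__ == "__main__":
    # One logic 0, five successive logic 1 states, then a logic 0 again.
    print(dual_rail_transitions([0, 1, 1, 1, 1, 1, 0]))
    # [(1, 1), (0, 0), (0, 0), (0, 0), (0, 0), (1, 1)]
```

Whenever the data changes, one rising and one falling edge occur together, so the direction of the change is hidden. The four entries (0, 0), however, show that the cycles in which the data stays the same remain distinguishable.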
In order to eliminate this safety leak as well, the so-called dual rail technology with precharge is used. A so-called precharge clock is inserted between every two data clocks. In this precharge clock, both the first dual rail line and the second dual rail line are brought to a logically high state, so that exactly one switching event is always visible in the current profile, namely at the transition from a data clock to a precharge clock and at the transition from a precharge clock to a data clock, irrespective of whether the data changes from one data clock to the next.
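The effect of the precharge clock can be checked with the same kind of model. The following sketch assumes, as stated above, that both rail lines are high in every precharge clock and that every line changing state counts as one switching event; the names `PRECHARGE`, `precharged_sequence` and `switches_per_step` are hypothetical.

```python
PRECHARGE = (1, 1)  # both dual rail lines are driven high in a precharge clock


def precharged_sequence(bits):
    """Interleave a precharge clock before each data clock of one bit line."""
    states = []
    for b in bits:
        states.append(PRECHARGE)
        states.append((b, 1 - b))  # complementary data state
    return states


def switches_per_step(states):
    """Count the lines that change state at each clock transition."""
    return [
        sum(p != c for p, c in zip(prev, curr))
        for prev, curr in zip(states, states[1:])
    ]


if __name__ == "__main__":
    constant = switches_per_step(precharged_sequence([1, 1, 1, 1, 1]))
    changing = switches_per_step(precharged_sequence([1, 0, 1, 0, 1]))
    print(constant == changing)  # True
    print(set(constant))         # {1}
```

In this model exactly one line switches at every clock transition, so a constant data sequence and an alternating one produce the same switching profile. The cost is visible as well: five data bits occupy ten clocks.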
Although the dual rail technology with precharge provides maximum safety, this is paid for by maximum expenditure. Because each bit line has to be formed twice, the dual rail technology doubles the chip area consumed by the transmission buses. Since, additionally, a precharge clock is inserted after each data clock, this technology also halves the processing speed, since no payload data can be processed in the precharge clocks.
Since, in addition, two data lines must be recharged and thus two line drivers exist, instead of the one line driver in single rail, the power consumption is twice as large. The maximum safety thus comes at a high price, namely twice the chip area consumption, half the payload data throughput and twice the power consumption.
For these reasons, the dual rail technology with precharge, in spite of the superior safety provided against power attacks, is usually not employed in safety ICs.
Typically, alternative solutions are employed, such as dummy calculations for disguising the power profile, software algorithms which require the same number of cycles irrespective of the data processed, etc. Common to all those measures is that they do not provide maximum safety against more complex attack algorithms and that they require intervention in already existing routines, so that extended tests etc. must be performed for each of those routines. As a result, on the one hand the cost increases and on the other hand the time to market for a new product increases. In addition to certain safety requirements, those two factors are decisive for whether or not a cryptography processor chip can gain acceptance in the very competitive market.