1. Field of the Invention
The present invention relates to computer architectures and in particular to processor architectures with an arithmetic unit and a register memory.
2. Description of the Prior Art
FIG. 3 shows a known processor architecture, using a coprocessor as an example. The coprocessor comprises an arithmetic unit 300, a register memory 310 and a control part 320. The arithmetic unit 300, the register memory 310 for operands for the arithmetic unit and the control part 320 are physically disposed on the coprocessor 330. The coprocessor is connected to an external bus 350 via an interconnecting bus 340. An external memory 360, a host CPU (not shown) and input/output interfaces (not shown), etc. are also connected to the external bus 350.
The arithmetic unit is connected to the register memory via an internal bus 370. The control part 320 can communicate with the register memory 310 and the arithmetic unit 300 via control lines 380a and 380b, respectively. As is known, the arithmetic unit (AU) 300 is designed to carry out instructions on operands that are stored in the register memory 310. To this end, the control part 320 controls the register memory 310 so that the operands needed by a particular instruction are loaded into the AU 300, which can then carry out the instruction on those operands. At the instigation of the control part 320, the result of the arithmetic operation is written back into the register memory via the internal bus 370 to be available for a next instruction, or is brought to an external memory via the interconnecting bus 340. Typically, the register memory of a standard processor is designed to have the number of registers necessary for the common calculations to be carried out by the arithmetic unit. If the processor is a general-purpose processor, certain registers of the register memory 310 will be needed depending on the algorithm to be calculated, while other registers that are not needed by that algorithm remain unused.
If, however, a calculation needs more registers than are present in the register memory 310, the operands for which there is no space in the register memory 310 are stored in the external memory 360. If the arithmetic unit 300 needs data for its calculations that are not present in the register memory 310, those operands have to be loaded from the external memory 360 via the interconnecting bus 340. In contrast to the data traffic on the internal bus 370, which is very fast due to the configuration of the internal bus and not least due to its short physical length, data traffic between the external memory 360 and the arithmetic unit 300 requires considerable effort. This effort shows both in the longer transfer time of the data, due to the typically much greater physical length of the external bus and the interconnecting bus 340, and in the signaling required to initiate an operand transfer from the external memory 360 to the arithmetic unit 300 or from the arithmetic unit 300 back to the external memory 360.
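For illustration only, the effect described above can be sketched as a toy model in Python. The sketch is not part of the known architecture of FIG. 3: the class name ToyCoprocessor, the least-recently-used replacement of operands and the counting of one bus transfer per operand are assumptions made purely for this example.

```python
from collections import OrderedDict


class ToyCoprocessor:
    """Toy model: a small register memory backed by a slower external memory.

    Hypothetical illustration; the LRU replacement policy is an assumption.
    """

    def __init__(self, num_registers):
        self.num_registers = num_registers
        self.registers = OrderedDict()  # operand name -> value, oldest first
        self.external_memory = {}       # operands that do not fit in registers
        self.external_transfers = 0     # operand transfers over the external path
        self.internal_transfers = 0     # operand transfers over the internal bus

    def store(self, name, value):
        """Place an operand in the register memory, spilling one if it is full."""
        self.external_memory.pop(name, None)  # drop any stale external copy
        if name in self.registers:
            self.registers.move_to_end(name)
        elif len(self.registers) >= self.num_registers:
            # No free register: move the least recently used operand outside.
            victim, victim_value = self.registers.popitem(last=False)
            self.external_memory[victim] = victim_value
            self.external_transfers += 1
        self.registers[name] = value

    def load(self, name):
        """Fetch an operand for the arithmetic unit."""
        if name not in self.registers:
            # Operand was spilled: bring it back from the external memory.
            value = self.external_memory.pop(name)
            self.external_transfers += 1
            self.store(name, value)
        self.registers.move_to_end(name)
        self.internal_transfers += 1
        return self.registers[name]

    def add(self, dst, a, b):
        """Add two operands and write the result back to the register memory."""
        result = self.load(a) + self.load(b)
        self.store(dst, result)
        self.internal_transfers += 1
        return result


def run(num_registers):
    """Sum four operands and report the result and the external transfer count."""
    cp = ToyCoprocessor(num_registers)
    for i, name in enumerate(["a", "b", "c", "d"]):
        cp.store(name, i + 1)
    cp.add("t", "a", "b")
    cp.add("t", "t", "c")
    total = cp.add("t", "t", "d")
    return total, cp.external_transfers
```

Calling run(8) and run(2) yields the same sum in both cases, but only the run with two registers incurs transfers to and from the external memory, which in a real device would be the slow and exposed path.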
Especially in security-relevant applications, i.e. if the coprocessor 330 is a cryptocoprocessor and is, for example, implemented on a chip card or is part of a security IC, a security problem exists when operands have to be loaded from the external memory 360 via the external bus into the arithmetic unit 300 and back. It is easier for an attacker to locate the external bus on the chip and to “tap” it than to find and tap the internal bus 370. One reason for this is the typically regular dimensioning of the external bus 350 on the chip, as well as the significantly greater length of the external bus in comparison to the internal bus 370 of the coprocessor. Especially when the coprocessor itself is implemented as an integrated circuit, the physical length of the internal bus 370 is very small, so that tapping this bus is almost impossible. This is entirely different for an external bus 350, which has to be connected to the coprocessor chip via an I/O interface.
Regarding the sizing of the register memory 310, a register memory capacity that is not too large is usually chosen, since for algorithms that need only a small number of operands a large part of the register memory 310 would be unused, i.e. idle. Register memory cells are relatively space-intensive, especially if a large number of them have to be placed on a chip. To keep, for example, a cryptocoprocessor small, the number of register cells is kept small so that register memory space does not sit constantly unused while still taking up space on the chip. For the sake of space efficiency of the chip, it is therefore deliberately accepted that algorithms needing more operands than fit in the register memory require a high number of operand transfers from the external memory 360 to the coprocessor 330.
Especially in a chip card, where the working memory is very small anyway and, due to the size limits of the chip card, may be in the range of 2 to 8 kilobytes, the register memory 310 of a peripheral element, e.g. of a coprocessor 330, a random number generator, a hash module, a module for symmetric cryptography (DES, AES) or another peripheral device, is usually chosen to be very small so that enough working memory (XRAM) is available for the functionality of the chip card. It should further be noted that the chip card also has to comprise a read-only memory (ROM) as well as a non-volatile writable memory (EEPROM, flash, etc.), so that the register memory 310 is usually laid out as small as possible in order to meet the space requirements of the chip on the chip card.
As mentioned, however, this is paid for with security compromises and time losses due to the operand transfer between the external memory 360 and the arithmetic unit 300.