Many high-speed hardware implementations of cryptographic algorithms use pipelining and/or unrolling to speed up cryptographic processing. However, while pipelining and/or unrolling certain cryptographic algorithms can yield an easier-to-route, higher-performance hardware core with a small area, these techniques often make the interface timing very restrictive. In addition, the input and output words of such implementations must be transferred to and from the hardware core within a fixed time. This inflexibility makes these hardware cores very difficult to use and, in some cases, results in a larger system than if a discrete hardware core had been used for each individual encryption operation. Another limitation of many high-speed hardware implementations of cryptographic algorithms is their lack of scalability when higher throughputs, different speed grades, and/or different target devices are required.
Commonly-assigned U.S. patent application Ser. No. 12/650,248, which is hereby incorporated by reference herein in its entirety, describes a hardware implementation of an encryption core using pipelined registers to perform block encryption processing. The hardware core is pipelined to N levels rather than performing the processing in a single cycle. In addition, the hardware core can support N simultaneous encryption operations using logic resources comparable to those used when a separate core encrypts one block at a time.
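By way of illustration only, the following Python sketch models how a core pipelined to N levels might keep N independent encryption operations in flight at once. It is a behavioral simulation under stated assumptions, not the circuit of the referenced application: the `stage` function is a toy mixing step rather than a real cipher, the feedback of a block from the last stage to the first between rounds is an assumed structure, and the round count and all names (`stage`, `encrypt_reference`, `encrypt_pipelined`) are hypothetical.

```python
from collections import deque

MASK = 0xFFFFFFFF  # toy 32-bit data path (assumption)


def stage(value, key, stage_index):
    """Toy mixing step standing in for one pipeline level (NOT a real cipher)."""
    value = (value + key + stage_index) & MASK
    return ((value << 5) | (value >> 27)) & MASK  # rotate left by 5 bits


def encrypt_reference(block, key, n_stages, rounds):
    """Unpipelined reference: apply every stage of every round in order."""
    for _ in range(rounds):
        for s in range(n_stages):
            block = stage(block, key, s)
    return block


def encrypt_pipelined(blocks, keys, n_stages, rounds):
    """Simulate n_stages pipeline registers shared by up to n_stages operations.

    Returns (results, cycles), where cycles counts simulated clock ticks.
    """
    assert len(blocks) == len(keys) <= n_stages
    # regs[s] holds the input of stage s: (operation slot, value, rounds done)
    regs = [None] * n_stages
    pending = deque(range(len(blocks)))
    results = [None] * len(blocks)
    cycles = 0
    while pending or any(r is not None for r in regs):
        nxt = [None] * n_stages
        for s, item in enumerate(regs):
            if item is None:
                continue
            slot, value, done = item
            value = stage(value, keys[slot], s)
            if s + 1 < n_stages:
                nxt[s + 1] = (slot, value, done)      # advance one level
            elif done + 1 < rounds:
                nxt[0] = (slot, value, done + 1)      # feed back for next round
            else:
                results[slot] = value                 # operation complete
        # Load a new operation whenever the first register is free.
        if pending and nxt[0] is None:
            slot = pending.popleft()
            nxt[0] = (slot, blocks[slot], 0)
        regs = nxt
        cycles += 1
    return results, cycles


if __name__ == "__main__":
    N, ROUNDS = 4, 10  # illustrative values only
    blocks = [0x00112233, 0x44556677, 0x8899AABB, 0xCCDDEEFF]
    keys = [0x0F0F0F0F, 0x12345678, 0xDEADBEEF, 0x01020304]
    out, cycles_n = encrypt_pipelined(blocks, keys, N, ROUNDS)
    _, cycles_1 = encrypt_pipelined(blocks[:1], keys[:1], N, ROUNDS)
    print(f"{N} interleaved operations: {cycles_n} cycles")
    print(f"1 operation alone: {cycles_1} cycles")
```

In this model, the same N stage units serve either one operation or N interleaved operations. Interleaving fills pipeline levels that would otherwise sit idle, so N operations complete in only a few more cycles than one, which is the sense in which the logic resources remain comparable while throughput increases.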