Cryptographic operations such as an Advanced Encryption Standard (AES) operation are typically implemented in software for execution on generic processor hardware. Many processors include datapaths of fixed widths such as 64, 86, or 128 bits. Given limited hardware and instruction support for cryptographic operations, it is difficult to perform such operations efficiently on existing processors.
Further, processor floorplans have a wide X dimension and a critical Y dimension, giving a high aspect ratio. Any increase in the Y dimension adds to the growth of the overall chip. The allocated Y budget is very small, and thus there is a need for a minimal-area solution that still delivers good performance for the round operations. Performance has both latency and throughput considerations: some modes of the AES algorithm are serial in nature, so the latency of each operation is the issue, whereas others are parallelizable, so throughput is more of an issue. Furthermore, splitting key generation across the dual execution pipes requires many bits of information to cross back and forth between the pipes, which implies large buses that add to the critical height of the chip.
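
The distinction between serial and parallelizable modes can be illustrated with a minimal Python sketch. It assumes cipher block chaining (CBC) as an example of a serial mode and counter (CTR) mode as an example of a parallelizable one; the text above does not name specific modes. The function toy_block_cipher is a hypothetical stand-in built on SHA-256, not real AES, and all function names are illustrative only.

```python
import hashlib

BLOCK = 16  # AES block size in bytes


def toy_block_cipher(key: bytes, block: bytes) -> bytes:
    """Hypothetical stand-in for one AES block encryption (NOT real AES)."""
    return hashlib.sha256(key + block).digest()[:BLOCK]


def xor(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def cbc_encrypt(key: bytes, iv: bytes, blocks: list) -> list:
    """Serial mode: each block needs the previous ciphertext block,
    so the latency of one block operation bounds overall speed."""
    out = []
    prev = iv
    for p in blocks:
        prev = toy_block_cipher(key, xor(p, prev))
        out.append(prev)
    return out


def ctr_keystream_block(key: bytes, nonce: bytes, i: int) -> bytes:
    """Keystream block i depends only on the key, nonce, and counter."""
    return toy_block_cipher(key, nonce + i.to_bytes(8, "big"))


def ctr_encrypt(key: bytes, nonce: bytes, blocks: list) -> list:
    """Parallelizable mode: no block depends on another, so the loop
    iterations could be pipelined and throughput is the main concern."""
    return [xor(p, ctr_keystream_block(key, nonce, i))
            for i, p in enumerate(blocks)]
```

In the chained loop each iteration must wait for the previous call to finish, whereas the counter-based loop has no dependency between iterations and its block operations could be issued back to back.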