Field
Embodiments of the present invention generally relate to computer security. More particularly, embodiments of the present invention relate to efficient execution of cryptographic operations by selectively using both hardware acceleration modules (cryptographic accelerators) and software running on the host central processing unit (CPU).
Description of the Related Art
Many computing tasks are resource-intensive when run concurrently and must share resources in order to be completed efficiently. Such tasks are often of different kinds, e.g., cryptographic tasks, authentication tasks, network processing tasks, and graphics rendering tasks, among others. Executing them efficiently therefore requires that the available computing resources be shared effectively. Some tasks, e.g., cryptographic operations, are generally lengthy and complex, and running them solely on a CPU does not make optimal use of CPU resources.
Cryptographic operations are essential for data security, protecting data, files, documents, and the like while they are stored on a hard drive or removable media or are in transit through one or more public networks. Cryptographic operations may be performed according to various algorithms, including, but not limited to, Rivest-Shamir-Adleman (RSA), Advanced Encryption Standard (AES), Message Digest 5 (MD5), Secure Hash Algorithm (SHA), Diffie-Hellman (DH), RC5, Blowfish, and International Data Encryption Algorithm (IDEA).
Traditionally, cryptographic operations have been either performed completely in software or completely offloaded to one or more hardware acceleration modules. A typical general-purpose CPU is not adequate to perform both its own tasks and cryptographic tasks concurrently. To improve performance, hardware accelerators are commonly used to offload cryptographic operations from the CPU. Existing solutions enable offloading of tasks from a host CPU to one or more hardware acceleration modules via a system bus; however, such offloading incurs delays due to memory transfers and other communication overhead. Furthermore, dedicated hardware accelerators for cryptographic computation are high-latency devices. To fully utilize their computational resources, all of the hardware acceleration modules in these devices must be kept fully occupied. Additionally, such offloading requires the CPU to regularly poll the hardware acceleration modules, both to determine whether resources are available to perform additional cryptographic operations and to determine whether results are available. As those of ordinary skill in the art will appreciate, when all hardware acceleration modules are busy, the CPU cycles spent on this polling are simply wasted.
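By way of illustration only, the polling overhead described above may be sketched as a small simulation. The sketch is a simplified model, not a description of any particular device: the names Accelerator and run_polling, the fixed per-job duration, and the one-polling-pass-per-tick timing are all assumptions introduced here. It counts the polling passes in which work is still pending but every hardware acceleration module is busy.

```python
from dataclasses import dataclass


@dataclass
class Accelerator:
    """Hypothetical accelerator model: busy for a fixed number of ticks per job."""
    busy_ticks_left: int = 0

    def is_idle(self) -> bool:
        return self.busy_ticks_left == 0

    def submit(self, job_ticks: int) -> None:
        self.busy_ticks_left = job_ticks

    def tick(self) -> None:
        # Advance simulated time by one tick.
        if self.busy_ticks_left > 0:
            self.busy_ticks_left -= 1


def run_polling(num_jobs: int, job_ticks: int, num_accels: int) -> dict:
    """Simulate a CPU that polls every accelerator once per tick (assumed timing)."""
    accels = [Accelerator() for _ in range(num_accels)]
    pending = num_jobs
    wasted_polls = 0
    ticks = 0
    while pending > 0 or any(not a.is_idle() for a in accels):
        submitted = False
        for a in accels:
            if pending > 0 and a.is_idle():
                a.submit(job_ticks)
                pending -= 1
                submitted = True
        # Work is waiting but every module was busy: this polling pass did nothing.
        if pending > 0 and not submitted:
            wasted_polls += 1
        for a in accels:
            a.tick()
        ticks += 1
    return {"ticks": ticks, "wasted_polls": wasted_polls}


if __name__ == "__main__":
    print(run_polling(num_jobs=4, job_ticks=3, num_accels=2))
```

In this model, four jobs of three ticks each on two modules leave the CPU with two polling passes that accomplish nothing, whereas four modules absorb all four jobs in the first pass and no pass is wasted.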