1. Field of the Invention
The present invention generally relates to computing systems, and more specifically, to performing, in a computing system, software operations that are interleaved with hardware-assisted instructions. One embodiment of the invention provides a computing system where externally originated requests are processed by a general-purpose processor with virtual memory (VM) support and local accelerator circuits (such as encryption engines).
2. Background Art
In modern VM environments, with multiple layers of indirection between virtual addresses and physical DRAM chips, the VM operations required both before and after each hardware operation are among the most expensive ones, mainly due to cache management. Hardware, including Direct Memory Access (DMA) devices, addresses the same memory at bus-level addresses. The need to (repeatedly) synchronize VM and DMA references is a known performance-limiting factor.
In a VM environment, addresses perceived by software are mapped indirectly to hardware addresses. If software operations are interleaved with hardware-assisted instructions, cached contents must be repeatedly synchronized with SDRAM contents (inhibiting caching), and addresses must be repeatedly mapped between the hardware and virtual address spaces. This mapping and synchronization stress VM mechanisms, which is a recognized problem. Zero-copy operations, which transform software to facilitate faster migration between virtual and hardware addressing, are one solution; however, they may require significant application-visible adjustment.
An unusual property of embedded cryptographic systems (such as hardware security modules, HSMs) is that requests may usually be separated into fixed (bounded) size headers and variable-length payloads. Headers need to be visible to software. In many cases, however, the payload only passes through a number of hardware-accelerated operations (such as encryption, decryption, or hashing) and is then returned to the external user. In such a system, if headers are sufficiently descriptive, internal software may indirectly steer the required transfer operations (such as requesting that specific parts of the payload be transferred by DMA to an accelerator chip), while the actual payload need not be mapped through VM.
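The header-steered transfer described above may be illustrated by the following minimal sketch. It is illustrative only and is not taken from any embodiment: the header layout (opcode, flags, payload length), the names DmaDescriptor and plan_transfer, and the assumption that the payload immediately follows the header at a known bus address are all hypothetical. The sketch shows software reading only the fixed-size header and deriving a bus-level transfer descriptor for the payload, without ever accessing the payload bytes.

```python
import struct
from dataclasses import dataclass

# Hypothetical fixed-size header layout (little-endian, no padding):
#   opcode (uint16), flags (uint16), payload_len (uint32)
HEADER_FORMAT = "<HHI"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)


@dataclass(frozen=True)
class DmaDescriptor:
    """Hypothetical descriptor handed to a DMA engine or accelerator."""
    opcode: int    # which accelerated operation to apply
    bus_addr: int  # bus-level address where the payload starts
    length: int    # payload length in bytes


def plan_transfer(header: bytes, request_bus_addr: int) -> DmaDescriptor:
    """Build a transfer descriptor from the header alone.

    Only the header is read by software. The payload is referenced by
    its bus-level address and length and is never mapped or copied here.
    Assumes the payload directly follows the header in memory.
    """
    if len(header) != HEADER_SIZE:
        raise ValueError(f"expected {HEADER_SIZE} header bytes, got {len(header)}")
    opcode, _flags, payload_len = struct.unpack(HEADER_FORMAT, header)
    return DmaDescriptor(
        opcode=opcode,
        bus_addr=request_bus_addr + HEADER_SIZE,
        length=payload_len,
    )
```

In this sketch, a request located at a given bus address yields a descriptor pointing just past the header, so the variable-length payload can be handed to an accelerator by address and length alone.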