Today, conventional server architectures are designed around general-purpose processors (GPPs), each of which serves as a single data processing engine that executes a variety of different functions. These functions include data processing functions as well as infrastructure-related functions. For example, infrastructure-related functions executed by a GPP enable the GPP to serve as an I/O controller and data hub, a server flash (cache) controller, a local storage controller, and a shared memory management unit (MMU). While server architectures implemented using GPPs have served the computing industry successfully, the use of GPPs to implement such a wide range of server functionality is problematic in terms of, e.g., efficiency and excess data movement. Indeed, not all processing tasks are executed efficiently (in terms of power, processor cycles, total cost of ownership (TCO), etc.) on a GPP. For example, the non-optimal execution of tasks on a GPP can result in excessive consumption of important resources such as internal buses, fabrics, memory bandwidth, processor cycles, cache, etc. With regard to data movement, a GPP must frequently move data and program code into and out of the GPP's external memory (DRAM) to process workloads for receiving and processing I/O data and to execute the software stacks that support I/O and storage functionality, which can consume an undue number of processor cycles.