Non-volatile memory (e.g., flash) is often organized into partitions (or logical units), which are ranges of logical block addresses (LBAs). Some memory systems identify certain partitions as high priority. When the memory system receives a host command targeted to a high-priority partition, the memory system can give that command preferred execution ordering relative to commands targeting other partitions. For example, after the memory system identifies a command as being targeted to a high-priority partition, the memory system can place the command at the head of its execution pipeline by marking the command as the next command to be served. If the command to the high-priority partition is received while the memory system is executing another command, the memory system can either allow the execution of the prior command to complete or interrupt the execution of the prior command in favor of the high-priority command.
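The head-of-pipeline ordering described above can be illustrated with a minimal sketch. It assumes a simple software queue; the names `Partition`, `Command`, and `CommandQueue`, the LBA ranges, and the partition labels are hypothetical and chosen only for illustration. It does not model interrupting a command that is already executing.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Partition:
    """A range of logical block addresses (hypothetical layout)."""
    name: str
    first_lba: int
    last_lba: int  # inclusive
    high_priority: bool = False

    def contains(self, lba: int) -> bool:
        return self.first_lba <= lba <= self.last_lba


@dataclass(frozen=True)
class Command:
    op: str   # e.g. "read" or "write"
    lba: int  # target logical block address


class CommandQueue:
    """Pending host commands; the leftmost entry is served next."""

    def __init__(self, partitions):
        self.partitions = list(partitions)
        self.pending = deque()

    def is_high_priority(self, cmd: Command) -> bool:
        return any(p.high_priority and p.contains(cmd.lba)
                   for p in self.partitions)

    def submit(self, cmd: Command) -> None:
        if self.is_high_priority(cmd):
            # Mark as the next command to be served. Note: in this sketch,
            # several queued high-priority commands are served newest first.
            self.pending.appendleft(cmd)
        else:
            self.pending.append(cmd)

    def next_command(self):
        return self.pending.popleft() if self.pending else None


# Assumed example layout: one high-priority code partition, one data partition.
partitions = [
    Partition("modem_code", 0, 1023, high_priority=True),
    Partition("user_data", 1024, 1_000_000),
]
queue = CommandQueue(partitions)
queue.submit(Command("write", 5000))
queue.submit(Command("read", 20000))
queue.submit(Command("read", 512))  # targets the high-priority partition

served = [queue.next_command() for _ in range(3)]
```

In this sketch the read at LBA 512 is served first even though it was submitted last, while the two commands to the ordinary partition keep their arrival order.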
A high-priority partition is sometimes used to store code that is executed by a processor. For example, in a system-on-chip (SoC) device, the high-priority partition can be used to store executable code for a peripheral (e.g., a modem) on the chip. Some applications require very low latency in reading the code (e.g., less than 2 ms). In such situations, an application processor in the SoC can load the parts of the code it expects to execute from the high-priority partition into volatile memory (e.g., DRAM), so that those parts can be executed without delay. However, if the application processor needs to execute a part of the code that is not cached in DRAM (e.g., if a “page fault” occurs), the application processor must retrieve that code from the non-volatile memory, which may fail to meet the low-latency read requirement. To avoid this possibility and guarantee low read latency, the entirety of the code can be cached in DRAM. However, DRAM is a much more expensive resource than non-volatile memory, so keeping the entire code in DRAM is costly.
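The trade-off between partial and full caching can be sketched with a toy model. Only the 2 ms budget comes from the text above; the class `CodeCache`, the page count, and both latency figures are placeholder assumptions used to make the example concrete, not measurements of any real device.

```python
from dataclasses import dataclass, field

BUDGET_MS = 2.0  # read-latency requirement taken from the example in the text


@dataclass
class CodeCache:
    """Toy model of code pages cached in DRAM, backed by non-volatile memory."""
    total_pages: int
    dram_hit_ms: float   # assumed latency when the page is already in DRAM
    nvm_read_ms: float   # assumed latency when the page must be read from NVM
    resident: set = field(default_factory=set)

    def preload(self, pages) -> None:
        """Copy the given code pages into DRAM ahead of time."""
        self.resident.update(pages)

    def fetch(self, page: int):
        """Return (latency_ms, page_fault) for one code-page access."""
        if not 0 <= page < self.total_pages:
            raise IndexError(f"page {page} is outside the code image")
        if page in self.resident:
            return self.dram_hit_ms, False
        # Page fault: the page is read from non-volatile memory, then cached.
        self.resident.add(page)
        return self.nvm_read_ms, True


# Partial caching: only half of the code image is resident in DRAM.
partial = CodeCache(total_pages=8, dram_hit_ms=0.01, nvm_read_ms=5.0)
partial.preload(range(4))
hit_latency, hit_fault = partial.fetch(2)    # resident page
miss_latency, miss_fault = partial.fetch(6)  # page fault

# Full caching: every page is resident, at the cost of more DRAM.
full = CodeCache(total_pages=8, dram_hit_ms=0.01, nvm_read_ms=5.0)
full.preload(range(8))
worst_full_latency = max(full.fetch(p)[0] for p in range(8))
```

Under these assumed figures, the partially cached image meets the budget only for resident pages, whereas the fully cached image meets it for every page but keeps all eight pages in DRAM.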