1. Field
The present disclosure relates to computer processors (also commonly referred to as CPUs).
2. State of the Art
A computer processor (and the program that it executes) needs places to put data for later reference. A computer processor design will typically have many such places, each with its own trade-off of capacity, speed of access, and cost. Usually these are arranged in a hierarchical manner referred to as the memory system of the processor, with small, fast, costly places used for short-lived and/or frequently used small data and large, slow, cheap places used for what does not fit in the small, fast, costly places. The memory system typically includes the following components arranged in order of decreasing speed of access:
        register file or other form of fast operand storage;
        one or more levels of cache memory (one or more levels of the cache memory can be integrated with the processor (on-chip cache) or separate from the processor (off-chip cache));
        main memory (typically implemented by DRAM memory and/or NVRAM memory and/or ROM);
        controller card memory; and
        on-line mass storage (typically implemented by one or more hard disk drives).
In many computer processors, the main memory of the memory system can take several hundred cycles to access. The cache memory, which is much smaller and more expensive than the main memory but faster to access, is used to keep copies of data that resides in the main memory. If a reference finds the desired data in the cache (a cache hit), it can access the data in a few cycles; if it does not (a cache miss), the access can take several hundred cycles. Because a program typically cannot do anything while waiting to access data in memory, using a cache and making sure that desired data is copied into the cache can provide significant improvements in performance.
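The effect of the hit rate on performance can be illustrated with a short calculation. The following is a minimal sketch for a single cache level; the function name and the specific figures (3 cycles for a hit, 300 cycles for a miss, a 95% hit rate) are assumptions chosen only to match the "a few cycles" versus "several hundred cycles" ranges described above.

```python
def average_access_cycles(hit_cycles, miss_cycles, hit_rate):
    """Expected cycles per memory reference for one cache level.

    hit_cycles  -- cost of a reference that hits in the cache
    miss_cycles -- cost of a reference that must go to main memory
    hit_rate    -- fraction of references that hit (0.0 to 1.0)
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0.0 and 1.0")
    return hit_rate * hit_cycles + (1.0 - hit_rate) * miss_cycles


# Assumed, illustrative figures only.
with_cache = average_access_cycles(hit_cycles=3, miss_cycles=300, hit_rate=0.95)
without_cache = average_access_cycles(hit_cycles=3, miss_cycles=300, hit_rate=0.0)
print(f"{with_cache:.2f} cycles with cache, {without_cache:.2f} without")
```

Under these assumed figures, even a 5% miss rate dominates the average cost, which is why avoiding unnecessary misses matters.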
A large part of the memory traffic of an executing program stems from memory accesses to the stack frame of the currently executing function, or, via pointer arguments, to a few of the immediately surrounding frames. Of this traffic, many accesses are initializations of frame-local variables to zero.
Because of the high frequency of access, the referenced memory of the current stack frame tends to be resident in the top-level data cache. When the current function exits, its entire stack frame becomes invalid and the contents of the corresponding cache lines become meaningless. Because the invalidated lines have been written to, the cache has marked them as dirty, and in a write-back cache structure they remain subject to write-back even though the values they contain are meaningless.
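The wasted write-back traffic can be illustrated with a toy model. The following is a minimal sketch, not a description of any particular cache: the class WriteBackCache, the helper run_function, the 64-byte line size, and the frame address are all hypothetical, and capacity and associativity are not modeled. The call to evict_all stands in for the lines eventually being displaced from the cache.

```python
LINE_BYTES = 64  # assumed cache line size


class WriteBackCache:
    """Toy write-back cache that tracks only per-line dirty flags."""

    def __init__(self):
        self.dirty = {}      # line number -> True if written since fill
        self.writebacks = 0  # lines written back to memory on eviction

    def write(self, addr):
        self.dirty[addr // LINE_BYTES] = True

    def evict_all(self):
        # Every dirty line costs a write-back, whether or not the
        # program still considers its contents meaningful.
        self.writebacks += sum(1 for d in self.dirty.values() if d)
        self.dirty.clear()


def run_function(cache, frame_base, frame_bytes):
    """Model a function that writes each 8-byte local, then returns."""
    for offset in range(0, frame_bytes, 8):
        cache.write(frame_base + offset)
    # On return the frame is dead, but its lines remain dirty.


cache = WriteBackCache()
run_function(cache, frame_base=0x1000, frame_bytes=256)
cache.evict_all()
print(cache.writebacks)  # one write-back per dirty line of the dead frame
```

In this model the 256-byte frame spans four lines, and all four are written back after the function has exited.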
If a write targets a line that is not in any cache, then in a write-allocate cache design the cache experiences a write miss: the rest of the line is read in from memory and merged with the written data in a write buffer, and the result is copied into the cache. Reads caused by write misses consume power and memory bandwidth, and write buffers are expensive resources.
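The cost of write misses under write-allocate can be sketched in the same toy style. In the following minimal model, the class WriteAllocateCache, the 64-byte line size, and the frame address are hypothetical, and the write buffer and merge step are not modeled; only the line fills triggered by write misses are counted.

```python
LINE_BYTES = 64  # assumed cache line size


class WriteAllocateCache:
    """Toy write-allocate cache that counts fills caused by write misses."""

    def __init__(self):
        self.resident = set()  # line numbers currently in the cache
        self.fill_reads = 0    # lines read from memory due to write misses

    def write(self, addr):
        line = addr // LINE_BYTES
        if line not in self.resident:
            # Write miss: the rest of the line is fetched from memory
            # before the written data can be merged into it.
            self.fill_reads += 1
            self.resident.add(line)


cache = WriteAllocateCache()
frame_base, frame_bytes = 0x4000, 256
# A newly entered function zero-initializes every 8-byte local of its frame.
for offset in range(0, frame_bytes, 8):
    cache.write(frame_base + offset)
print(cache.fill_reads)  # one memory read per line of the new frame
```

Here each of the four lines of the frame is fetched from memory even though every byte of it is subsequently overwritten, so the fetched data is never used.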
While the frame of an exited function is no longer valid in the program, the memory it had occupied still resides in the cache and can be read by an accidental or contrived wild address. This permits browsing in the detritus of called functions, a potential source of insecurity and exploits.
It is not uncommon for a program to contain a bug by which it will read and use a value that has never been initialized. The read of the uninitialized value will often be from stack frame locations of previously exited functions. In this case, the read receives the most recent value that happened to reside at the read address, which may vary from run to run of the program. The resulting failures tend to be difficult to reproduce and debug.
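Both the stale-data exposure and the uninitialized-read behavior can be illustrated by simulating a stack whose frames are not cleared on exit. The following is a minimal sketch; the class SimulatedStack and the helpers store_secret and read_uninitialized_local are hypothetical names, and the stack is modeled as a plain byte array rather than real processor memory.

```python
class SimulatedStack:
    """Toy downward-growing stack; popping a frame does not clear it."""

    def __init__(self, size=64):
        self.memory = bytearray(size)  # starts zero-filled
        self.sp = size                 # stack pointer; stack grows downward

    def push_frame(self, nbytes):
        self.sp -= nbytes
        return self.sp                 # base address of the new frame

    def pop_frame(self, nbytes):
        self.sp += nbytes              # contents are left behind in memory


def store_secret(stack, value):
    """A callee that keeps a value in a local variable, then exits."""
    base = stack.push_frame(8)
    stack.memory[base:base + 8] = value.to_bytes(8, "little")
    stack.pop_frame(8)


def read_uninitialized_local(stack):
    """A buggy callee that reads its local before ever writing it."""
    base = stack.push_frame(8)
    value = int.from_bytes(stack.memory[base:base + 8], "little")
    stack.pop_frame(8)
    return value


stack = SimulatedStack()
first = read_uninitialized_local(stack)   # sees the initial memory contents
store_secret(stack, 0xDEADBEEF)
leaked = read_uninitialized_local(stack)  # sees the exited callee's value
print(hex(first), hex(leaked))
```

The same buggy function returns different results depending on which function happened to run before it, which is the run-to-run variability described above.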