A cache has proven to be an effective way to combine the advantages of fast, but expensive, memory with the affordability of slower memory to achieve the most effective memory system. A cache is architecturally located between the CPU and main memory and consists of a cache memory and an associated cache controller. As requests for memory access most often affect only a relatively small portion of the main memory, the most frequently addressed data can also be held in the cache. The advantage of this is a much reduced access time, which produces a considerable increase in overall system speed. The data and code that are not required as frequently can be stored in the slower main memory without appreciably affecting program execution time.
When the CPU requires data, it sends out a read request. However, the cache controller intercepts the request from the CPU before the request reaches the main memory and determines whether the requested data is instead available in the cache memory. If so, a “cache hit” is said to have occurred, and the data is read from the cache memory into the CPU. If not, a “cache miss” is said to have occurred, and the data request is forwarded to the main memory for fulfillment. (Following a miss, the requested data and its associated address are often then stored in the cache.) The instructions directing such cache operations are provided by the CPU via a bus that connects the cache between the CPU and the main memory.
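The read path described above can be illustrated with a minimal sketch. The class name `CacheController`, the `capacity` parameter, the list standing in for main memory, and the least-recently-used eviction rule are all assumptions made for illustration; the text does not specify how a real controller organizes or replaces its entries.

```python
from collections import OrderedDict


class CacheController:
    """Toy model of a controller that sits between a CPU and main memory."""

    def __init__(self, main_memory, capacity=4):
        self.main_memory = main_memory  # slower backing store (a list here)
        self.capacity = capacity        # hypothetical number of cached entries
        self.lines = OrderedDict()      # address -> data, least recent first
        self.hits = 0
        self.misses = 0

    def read(self, address):
        """Intercept a CPU read and serve it from the cache when possible."""
        if address in self.lines:
            # Cache hit: the data comes straight from the cache memory.
            self.hits += 1
            self.lines.move_to_end(address)  # mark as most recently used
            return self.lines[address]
        # Cache miss: forward the request to main memory.
        self.misses += 1
        data = self.main_memory[address]
        # Store the data and its address in the cache, evicting the least
        # recently used entry if the cache is full (assumed policy).
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)
        self.lines[address] = data
        return data


memory = [addr * 10 for addr in range(16)]
cache = CacheController(memory, capacity=2)
values = [cache.read(a) for a in (3, 3, 5, 3, 7, 5)]
print(values, cache.hits, cache.misses)
```

In this sketch, repeated reads of address 3 are served from the cache, while the final read of address 5 misses again because the two-entry cache has already evicted it.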
Many currently available computer systems come with two levels of cache. A first level cache (“L1 cache”) is architecturally nearer the CPU, and is faster than a second level cache (“L2 cache”). The L2 cache is usually somewhat larger and slower than the L1 cache, although to a diminishing degree with recent technology.
When the CPU needs to execute an instruction, it looks first in its own data registers. If the needed data isn't there, the CPU looks to the L1 cache and then to the L2 cache. If the data isn't in any cache, the CPU accesses the main memory. When the CPU finds data in one of its cache locations, it's called a “hit,” whereas a failure to find the data is a “miss.” Every miss introduces a delay, or latency, as the CPU tries a slower level. For high-end processors, it can take from one to three clock cycles to fetch information from the L1 cache, while the CPU waits. It can take 6–12 cycles to get data from an L2 cache on the CPU chip, and dozens or even hundreds of cycles for off-CPU L2 caches.
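The cost of falling through the hierarchy can be sketched as follows. The function `lookup` and the constants are hypothetical: the L1 and L2 figures are chosen from within the ranges quoted above, the main-memory figure is an assumed value not given in the text, and the model assumes that a miss pays for every level probed before the data is found. The register check is omitted for brevity.

```python
# Illustrative per-level costs in clock cycles. The L1 and L2 values fall
# within the ranges quoted in the text; the main-memory value is assumed.
L1_CYCLES = 3
L2_CYCLES = 12
MAIN_MEMORY_CYCLES = 200


def lookup(address, l1, l2):
    """Return (level, cycles) for a read that tries L1, then L2, then memory.

    Assumes the levels are probed in turn, so each miss adds the cost of
    the level that was tried before moving on to the next, slower one.
    """
    cycles = L1_CYCLES
    if address in l1:
        return "L1", cycles
    cycles += L2_CYCLES
    if address in l2:
        return "L2", cycles
    cycles += MAIN_MEMORY_CYCLES
    return "main memory", cycles


l1_contents = {0x10}        # addresses currently held in the L1 cache
l2_contents = {0x10, 0x20}  # addresses currently held in the L2 cache
results = [lookup(a, l1_contents, l2_contents) for a in (0x10, 0x20, 0x30)]
for level, cycles in results:
    print(level, cycles)
```

Under these assumed figures, an L1 hit costs 3 cycles, an L2 hit costs 15, and a read that misses both caches costs 215, which shows why each additional miss is so much more expensive than the last.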
As mentioned above, the controller for a conventional external cache communicates with the CPU via a processor bus. In this manner, instructions and data are sent by the CPU, then interpreted and appropriately executed by the cache controller. However, the cache and cache controller are designed for a specific CPU and a specific processor bus, such that only that specific CPU may instruct the cache controller, and only via that specific bus and its bus interfaces. Those skilled in the art recognize the disadvantages and inefficiencies of such application- and device-specific caches. For example, because prior art secondary caches are tailored to specific processors, a control bus separate from a standard bus is required to route external cache control instructions from the processor to the cache to control its operation. Moreover, in order for the CPU to read and write data to memory, and for the cache to cache data and fulfill the read and write requests accordingly, the CPU must generate external cache control instructions.
Accordingly, what is needed in the art is an external cache that overcomes the disadvantages of the prior art discussed above.