In general, a cache is a place for temporarily storing a thing so that the thing can be accessed more quickly and/or efficiently than is possible where the thing is normally stored. Within the context of computer systems, a cache is usually memory used for temporarily storing data so that the data can be accessed more quickly and/or efficiently than is possible in the memory in which the data normally resides. Thus, volatile memory is typically used to cache data that resides on non-volatile memory, such as magnetic disks. Similarly, fast volatile memory is typically used to cache data that resides in slower volatile memory. It is common for computer systems to use multiple levels of cache, where each level of cache is typically smaller and faster than the preceding level.
Caches may be implemented within the processor itself. The processor can access such a cache much faster than off-processor cache memory because the address and data buses used to access it are implemented on the processor die itself. This internal cache memory is called the level one (L1) cache. An L1 cache is a relatively small amount of static RAM (SRAM) that is integrated or packaged within the same module as the processor and is clocked at the same speed as the processor. The L1 cache temporarily stores instructions and data, ensuring that the processor has a steady supply of data to process while the slower RAM catches up in delivering new data. Data stored in the L1 cache can be used by the processor at no cost in clock cycles.
Referring to FIG. 1, most processors are currently supplied with two levels of cache. The L1 cache 102, or primary cache, is integrated within the processor core 101 itself and is thus the fastest. The next level of cache 103, called the L2 or secondary cache, is situated outside the core 101 and usually runs at a lower clock speed than the core 101, though it may also run as fast as the core 101 itself. However, even if the L2 cache 103 runs at the same clock speed as the core 101, the L2 cache 103 will be slower than the L1 cache 102 because it is not part of the core 101 itself. The L2 cache 103 is typically much larger than the L1 cache 102 and, if it runs fast enough, its throughput can come close to that of the L1 cache 102. Therefore, the L2 cache 103 plays a very important role in maintaining a high memory throughput to the processor.
Data pre-fetch allows the processor to “look-ahead” and fetch data from slow memory (such as a hard disk) before it is needed by the processor. This results in fewer processor pipeline stalls and higher overall performance in many applications. The L2 cache utilizes data pre-fetch, in which the program data handler fetches data sequentially from the slow memory and places the data in the L2 cache. As the processor executes code from the L1 cache, if the code or data that it needs is not in the L1 cache (a condition called a cache miss), the processor fetches it from the L2 cache. The cache miss consumes some time and slows the processor's operations. The amount of time wasted is greater if the data is not in the L2 cache either and the processor has to access slow memory to fetch it. This occurs when the execution of the code is not sequential, e.g., when multi-layer function calls or long jumps occur.
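The effect of sequential pre-fetch on hits and misses can be illustrated with a small simulation. The following is a minimal sketch, not a model of any particular processor: the class `TinyCache`, the function `run`, the cache sizes, the pre-fetch depth, and the cycle costs of 1, 10, and 100 are all assumptions chosen only for illustration.

```python
from collections import OrderedDict

# Assumed, illustrative access costs in cycles (not taken from any real part).
L1_COST, L2_COST, SLOW_COST = 1, 10, 100


class TinyCache:
    """Hypothetical cache holding at most `capacity` block addresses (LRU)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()

    def lookup(self, addr):
        """Return True on a hit and mark the block as recently used."""
        if addr in self.blocks:
            self.blocks.move_to_end(addr)
            return True
        return False

    def fill(self, addr):
        """Insert a block, evicting the least recently used one if full."""
        self.blocks[addr] = True
        self.blocks.move_to_end(addr)
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)


def run(trace, l1_size=4, l2_size=16, prefetch=4):
    """Return the total assumed cycle cost of accessing `trace` in order."""
    l1, l2 = TinyCache(l1_size), TinyCache(l2_size)
    cycles = 0
    for addr in trace:
        if l1.lookup(addr):
            cycles += L1_COST
        elif l2.lookup(addr):
            # L1 miss, L2 hit: pay the L2 cost and copy the block into L1.
            cycles += L2_COST
            l1.fill(addr)
        else:
            # Miss in both levels: pay the slow-memory cost, then pre-fetch
            # the requested block and the next `prefetch` blocks into L2.
            cycles += SLOW_COST
            for a in range(addr, addr + prefetch + 1):
                l2.fill(a)
            l1.fill(addr)
    return cycles


sequential = list(range(32))                   # straight-line access
jumpy = [(i * 37) % 1024 for i in range(32)]   # long jumps between accesses

print(run(sequential), run(jumpy))
```

Under these assumptions the sequential trace finds most blocks already pre-fetched into the L2 cache, while every access in the jumping trace falls through to slow memory, which mirrors the non-sequential case described above.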
Some application programs, such as database management applications, attempt to solve the cache miss problem by allocating L1 cache memory for certain operations and attempting to retain the allocated memory for the duration of the application's operation. This approach works well if there is an unlimited amount of L1 memory, which is impractical. If the L1 cache becomes fully allocated and the processor requires L1 memory space, the processor will take back some of the allocated memory. This approach also results in many cache misses after full allocation of the L1 cache, thereby slowing the application's operations.
Application programs have the ability to manipulate the data stored in the L1 cache. If an application can keep a large portion of its working data in the L1 cache throughout the execution of the application, then it can improve its overall performance. A good example of an application program that would benefit from having data remain in L1 cache is a database management system.
Referring to FIG. 2, database queries and data manipulation language (DML) commands are used to access and manipulate data in a database application. A user creates queries and DML commands 201, which are compiled 203 into an object called a cursor 202. A cursor contains a query plan 204, which describes how a query is executed. A user creates a cursor 202 to provide a prepackaged query or set of queries of the database that can be executed by other users.
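The relationship between a statement, its cursor, and the query plan can be sketched as a pair of small data structures. This is only an illustrative sketch: the names `QueryPlan`, `Cursor`, and `compile_statement`, the shared dictionary, and the placeholder plan steps are hypothetical and do not describe any particular database's compiler.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryPlan:
    """Hypothetical query plan: an ordered list of execution steps."""
    steps: tuple


@dataclass(frozen=True)
class Cursor:
    """Hypothetical cursor: the compiled form of one query or DML command."""
    statement: str
    plan: QueryPlan


# Assumed shared store so that a cursor compiled by one user can be
# executed by other users without recompiling the statement.
_shared_cursors = {}


def compile_statement(statement):
    """Compile a statement into a cursor, reusing an existing one if present."""
    normalized = " ".join(statement.split())  # collapse extra whitespace
    cursor = _shared_cursors.get(normalized)
    if cursor is None:
        # Placeholder steps; a real compiler would derive these from the SQL.
        plan = QueryPlan(steps=("parse", "choose access path", "execute"))
        cursor = Cursor(statement=normalized, plan=plan)
        _shared_cursors[normalized] = cursor
    return cursor


first = compile_statement("SELECT name FROM emp WHERE id = :1")
second = compile_statement("SELECT  name FROM emp   WHERE id = :1")
print(first is second)
```

In this sketch, two users issuing the same statement text share one cursor object, which is one plausible way to realize the "prepackaged query" idea.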
FIG. 3 shows a processor 301 allocating L1 cache memory 302 to running applications via a cache memory manager 306. The cache memory manager 306 allocates L1 cache memory 302 to processes from the scratch memory space 305 (free memory space) in the L1 cache memory 302. As processes execute, the scratch memory 305 expands and contracts as the cache memory manager 306 allocates and frees memory.
As users access a database using queries, portions of L1 cache are allocated for storing the cursors constructed for those queries. Specifically, the cache memory manager 306 allocates L1 memory 302 to the “current set” 303 of cursors. The current set of cursors includes the cursors that are currently loaded into L1 memory. As the queries associated with the current set 303 are executed, they require a memory space to perform operations on the database. This memory space is called a working set 304. The cache memory manager 306 allocates the working set 304 from the L1 memory 302 for the current set 303. As a user uses the database application, cursors in the current set change. Consequently, the amount of L1 cache memory that is allocated to the database server fluctuates.
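The bookkeeping performed by the cache memory manager can be sketched as follows. This is a simplified model under stated assumptions: the class name `CacheMemoryManager`, its methods, and the byte sizes are hypothetical, and a real manager would work with hardware or operating-system facilities rather than Python dictionaries.

```python
class CacheMemoryManager:
    """Toy model of a cache memory manager; all sizes are in bytes."""

    def __init__(self, total_bytes):
        self.total_bytes = total_bytes
        self.current_set = {}    # cursor id -> bytes held by the cursor
        self.working_sets = {}   # cursor id -> bytes held by its working set

    @property
    def scratch_bytes(self):
        """Free (scratch) space left after all current allocations."""
        used = sum(self.current_set.values()) + sum(self.working_sets.values())
        return self.total_bytes - used

    def load_cursor(self, cursor_id, cursor_bytes, working_set_bytes):
        """Allocate space for a cursor and its working set from scratch space.

        Returns False if the cursor is already loaded or does not fit.
        """
        needed = cursor_bytes + working_set_bytes
        if cursor_id in self.current_set or needed > self.scratch_bytes:
            return False
        self.current_set[cursor_id] = cursor_bytes
        self.working_sets[cursor_id] = working_set_bytes
        return True

    def unload_cursor(self, cursor_id):
        """Free a cursor and its working set; return the bytes released."""
        freed = self.current_set.pop(cursor_id, 0)
        freed += self.working_sets.pop(cursor_id, 0)
        return freed


manager = CacheMemoryManager(total_bytes=32 * 1024)
manager.load_cursor("q1", cursor_bytes=4096, working_set_bytes=8192)
manager.load_cursor("q2", cursor_bytes=2048, working_set_bytes=16384)
print(manager.scratch_bytes)   # scratch space shrinks as cursors load
manager.unload_cursor("q1")
print(manager.scratch_bytes)   # and grows again when one is unloaded
```

The scratch space in this sketch shrinks and grows as cursors enter and leave the current set, which is the fluctuation described above.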
Referring to FIG. 4, a working set 401 contains a frame buffer 402 and a bind buffer 403. The frame buffer 402 and the bind buffer 403 are the two main components of the runtime memory used for the execution of a query. The frame buffer 402 is used to store the results of the query. The bind buffer 403 is used to store in-binds, which are the user input variables, and out-binds, which are the variable values returned by the database. The database reads the in-binds from the bind buffer 403. Output variable values are stored in the bind buffer 403 by the database as out-binds.
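The layout of a working set can be sketched as a pair of small structures. The names `BindBuffer`, `WorkingSet`, and `execute`, the bind variable names `min_salary` and `row_count`, and the sample rows are hypothetical and serve only to show the direction in which data flows through the two buffers.

```python
from dataclasses import dataclass, field


@dataclass
class BindBuffer:
    """Hypothetical bind buffer: inputs from the user, outputs from the database."""
    in_binds: dict = field(default_factory=dict)
    out_binds: dict = field(default_factory=dict)


@dataclass
class WorkingSet:
    """Hypothetical working set: runtime memory for executing one query."""
    frame_buffer: list = field(default_factory=list)
    bind_buffer: BindBuffer = field(default_factory=BindBuffer)


def execute(working_set, rows):
    """Toy executor standing in for the database.

    Reads the in-bind 'min_salary', stores matching rows in the frame
    buffer, and writes the out-bind 'row_count'.
    """
    threshold = working_set.bind_buffer.in_binds["min_salary"]
    working_set.frame_buffer = [r for r in rows if r["salary"] >= threshold]
    working_set.bind_buffer.out_binds["row_count"] = len(working_set.frame_buffer)


rows = [
    {"name": "ann", "salary": 70},
    {"name": "bob", "salary": 40},
    {"name": "cat", "salary": 55},
]
ws = WorkingSet()
ws.bind_buffer.in_binds["min_salary"] = 50   # user input variable (in-bind)
execute(ws, rows)
print(ws.frame_buffer, ws.bind_buffer.out_binds)
```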
The working sets for the entire set of queries used by an application will typically not fit into the L1 cache. This means that cache misses will frequently occur as users execute queries on the database.
Some database applications try to reduce cache misses by not freeing the L1 cache memory that a current set obtains during runtime. This approach would work well only in a system with an infinite amount of L1 cache memory, which is not feasible. When the L1 cache is filled and space is needed for the working set of a new query, the system writes over one of the working sets in the L1 cache. This leads to many cache misses, because L1 memory space is taken by queries that are infrequently executed and/or by queries that are stale. This results in slower response times because the system has to retrieve the working sets of the queries that missed from slower memory.
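The cost of this retain-until-overwritten behavior can be illustrated with a small count of misses. The sketch below makes assumptions that the text does not state: the function name `count_misses` is hypothetical, capacity is measured in whole working sets, and the working set chosen for overwriting is the one that was loaded earliest, regardless of how often its query runs.

```python
from collections import deque


def count_misses(trace, capacity):
    """Count misses when working sets are never freed, only overwritten.

    `trace` is a sequence of query ids in execution order; `capacity` is
    the number of working sets that fit in the cache.
    """
    resident = deque()   # query ids whose working sets are cached, oldest first
    misses = 0
    for query in trace:
        if query in resident:
            continue     # working set is already cached: a hit
        misses += 1
        if len(resident) == capacity:
            # Assumption: overwrite the earliest-loaded working set,
            # even if it belongs to a frequently executed query.
            resident.popleft()
        resident.append(query)
    return misses


# "hot" runs often; "a" through "e" each run once and then go stale.
trace = ["hot", "a", "b", "hot", "c", "hot", "d", "e", "hot"]
print(count_misses(trace, capacity=3))
print(count_misses(trace, capacity=6))
```

With room for only three working sets, the one-off queries repeatedly displace the working set of the frequently executed query, so the miss count exceeds what the larger cache incurs.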
Based on the foregoing, there is a clear need for a system that provides for the management of L1 cache memory allocation that allows an application program to run in an efficient manner by reducing cache misses. Additionally, the system would intelligently provide an application program with an adequate supply of L1 cache memory without needlessly allocating L1 cache memory because of unusual application program behavior.