1. Technical Field
The present application relates generally to data prefetching. More particularly, the present application relates to a method, system, and computer program product for prefetching data using programmable object identification.
2. Description of the Related Art
A central processing unit (CPU) cache is a cache used by the CPU of a computer to reduce the average time to access memory. The cache is a smaller, faster memory which stores copies of the data from the most frequently used main memory locations. As long as most memory accesses are to cached memory locations, the average latency of memory accesses will be closer to the cache latency than to the latency of main memory.
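The relationship between hit rate and average latency can be illustrated with a small calculation. The following is a minimal sketch under a simplified two-level model; the function name `average_access_latency` and the latency figures (1 ns for the cache, 100 ns for main memory) are illustrative assumptions, not values taken from this application.

```python
def average_access_latency(hit_rate, cache_latency_ns, memory_latency_ns):
    """Average latency when a fraction `hit_rate` of accesses hit the cache.

    Simplified model: a hit costs the cache latency, a miss costs the
    main memory latency. Any extra bookkeeping overhead is ignored.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    miss_rate = 1.0 - hit_rate
    return hit_rate * cache_latency_ns + miss_rate * memory_latency_ns


# Hypothetical latencies: 1 ns for the cache, 100 ns for main memory.
for rate in (0.50, 0.90, 0.99):
    latency = average_access_latency(rate, 1.0, 100.0)
    print(f"hit rate {rate:.2f}: {latency:.2f} ns")
```

Under these assumed figures, the average falls from roughly half of the main memory latency at a 50% hit rate to about twice the cache latency at a 99% hit rate.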
When a processor needs to read or write a location in main memory, the processor first checks whether that memory location is in the cache. This is accomplished by comparing the address of the memory location to all tags in the cache that might contain that address. If the processor finds the memory location in the cache, a cache hit has occurred; if it does not, a cache miss has occurred. In the case of a cache hit, the processor immediately reads or writes the data in the cache line. The proportion of accesses that result in a cache hit is known as the hit rate and is a measure of the effectiveness of the cache.
In the case of a cache miss, most caches allocate a new entry, which comprises the tag just missed and a copy of the data from memory. The reference can then be applied to the new entry just as in the case of a cache hit. Misses are comparatively slow because they require the data to be transferred from main memory. This transfer incurs a delay since main memory is much slower than cache memory, and also incurs the overhead for recording the new data in the cache before it is delivered to the processor.
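The tag comparison, the allocation of a new entry on a miss, and the hit rate can be sketched together. This example assumes a direct-mapped organization, in which each address can reside in exactly one cache line, so only one tag needs to be compared; the class name `DirectMappedCache`, the line size, and the line count are hypothetical choices made for illustration.

```python
LINE_SIZE = 64   # bytes per cache line (assumed)
NUM_LINES = 4    # deliberately tiny so that conflicts are easy to see (assumed)


class DirectMappedCache:
    def __init__(self):
        self.tags = [None] * NUM_LINES  # one stored tag per cache line
        self.hits = 0
        self.misses = 0

    def access(self, address):
        """Return True on a cache hit, False on a cache miss."""
        block = address // LINE_SIZE   # which memory block holds the address
        index = block % NUM_LINES      # the only line that may hold that block
        tag = block // NUM_LINES       # distinguishes blocks sharing a line
        if self.tags[index] == tag:
            self.hits += 1
            return True
        # Miss: allocate a new entry by recording the tag that just missed.
        self.misses += 1
        self.tags[index] = tag
        return False

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


cache = DirectMappedCache()
trace = [0, 8, 64, 0, 256, 0]   # byte addresses
results = [cache.access(address) for address in trace]
print(results, cache.hit_rate())
```

In this trace, address 256 maps to the same line as address 0 and displaces it, so the final access to address 0 misses even though that address was used shortly before.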
Once a cache is full, data must be removed from the cache in order to make room for newer data. The most common method for choosing which data to remove from the cache is to track when a particular block of data was last used and remove the least recently used block from the cache. Therefore, if data has not been used recently, it is unlikely to be in the cache, and will have to be loaded from main memory before it can be accessed by the CPU. This can lead to loss of performance while the CPU waits for the data to be retrieved from memory.
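The least recently used policy described above can be sketched as follows. This is a minimal illustration rather than a description of any particular hardware; the class name `LRUCache` and the `load_from_memory` callback are hypothetical, and real caches typically approximate this policy in hardware.

```python
from collections import OrderedDict


class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # least recently used block comes first

    def access(self, block, load_from_memory):
        """Return (data, hit) for `block`, evicting the LRU block if full."""
        if block in self.blocks:
            self.blocks.move_to_end(block)   # mark as most recently used
            return self.blocks[block], True
        data = load_from_memory(block)       # slow path: fetch from memory
        if len(self.blocks) >= self.capacity:
            self.blocks.popitem(last=False)  # remove least recently used
        self.blocks[block] = data
        return data, False


def load(block):
    # Stand-in for a main memory read (hypothetical).
    return f"data-{block}"


lru = LRUCache(capacity=2)
hits = [lru.access(block, load)[1] for block in ["A", "B", "A", "C", "B"]]
print(hits, list(lru.blocks))
```

Block B is evicted when C is loaded because A was used more recently, so the final access to B must go back to main memory.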
One solution to this problem is to anticipate what data will be needed in the near future and prefetch that data into cache. There are two commonly used methods to determine which data to prefetch: sequential read and touch instructions.
With a sequential read, the data located immediately before or after the most recently accessed data is read. Sequential reading is effective at prefetching data with good spatial locality, but is of no use when accessing data at random locations.
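The difference between the two access patterns can be seen in a small simulation. This sketch assumes a prefetcher that fetches only the block immediately after the one just accessed, and an unbounded cache so that eviction does not affect the count; the function name `count_misses` and both traces are hypothetical.

```python
def count_misses(trace, prefetch_next):
    """Count cache misses for a trace of block numbers.

    When `prefetch_next` is True, every access also brings the next
    sequential block into the cache.
    """
    cached = set()   # unbounded, to isolate the effect of prefetching
    misses = 0
    for block in trace:
        if block not in cached:
            misses += 1
            cached.add(block)
        if prefetch_next:
            cached.add(block + 1)
    return misses


sequential = list(range(100, 110))
scattered = [7, 530, 42, 9001, 318, 77, 2500, 64, 1203, 900]

print(count_misses(sequential, prefetch_next=False))
print(count_misses(sequential, prefetch_next=True))
print(count_misses(scattered, prefetch_next=False))
print(count_misses(scattered, prefetch_next=True))
```

For the sequential trace, only the first access misses once prefetching is enabled. For the scattered trace, no block is adjacent to an earlier one, so the miss count is unchanged.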
For each execution of a touch instruction, a random number in the range [0, size-1] is added to the address operand of the touch instruction to generate a virtual address. The page portion of the address makes up the next element in the page-reference string for the process. When a touch instruction generates a page reference for an invalid page, the interpreter must allocate a frame to the process. Furthermore, touch instructions are generally advisory. Advisory instructions are optional, which means that they may not be executed if the central processing unit is busy. Therefore, touch instructions may have no effect when the hardware is busy. Additionally, when an Object A refers to an Object B, and Object B in turn refers to an Object C, each of these objects is usually in a different layer of the software stack. As a result, instructions in the layer of the software stack working on Object A are not aware of Object C, or of any objects hierarchically below Object C, because Object C is immediately below Object B rather than Object A. Consequently, a given layer of the software stack can at most use touch instructions to prefetch the objects that are immediately below it in the hierarchy. That is, the layer working on Object A can prefetch only Object B, since Object B is the only object immediately below Object A.
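The two limitations described above, the advisory nature of touch instructions and the one-level visibility of each software layer, can be modeled in a short sketch. Everything here is a hypothetical illustration: `StoredObject`, `touch`, `prefetch_from_layer`, and the `cpu_busy` flag are invented names, and the layer boundary is modeled simply by having the function look only one reference down.

```python
class StoredObject:
    def __init__(self, name, refers_to=None):
        self.name = name
        self.refers_to = refers_to   # the object immediately below, if any


def touch(obj, cache, cpu_busy=False):
    """Advisory prefetch hint: silently dropped when the CPU is busy."""
    if cpu_busy or obj is None:
        return False
    cache.add(obj.name)
    return True


def prefetch_from_layer(obj, cache, cpu_busy=False):
    """Model a layer working on `obj`: it sees only the object directly below."""
    return touch(obj.refers_to, cache, cpu_busy)


object_c = StoredObject("C")
object_b = StoredObject("B", refers_to=object_c)
object_a = StoredObject("A", refers_to=object_b)

cache = set()
prefetch_from_layer(object_a, cache)
print(cache)   # Object B is prefetched; Object C is not

busy_cache = set()
accepted = prefetch_from_layer(object_a, busy_cache, cpu_busy=True)
print(accepted, busy_cache)   # the hint is dropped and nothing is prefetched
```

In this model, Object C reaches the cache only after the layer working on Object B issues its own touch, by which time the access to Object B has already taken place.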