Processor performance has long improved faster than memory performance. Because of this growing gap, most processors today spend much of their time waiting for data. Modern processors often have several levels of on-chip, and possibly off-chip, caches. These caches reduce data access time by keeping frequently accessed cache lines in the faster levels closer to the processor. Data prefetching is the practice of moving data from a slower level of the cache/memory hierarchy to a faster level before software needs it. Prefetching can be done in software or in hardware, and each approach has its own performance limitations.
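
As a rough illustration of software prefetching, the sketch below sums an array while hinting that elements further ahead will be read soon. It assumes a GCC- or Clang-compatible compiler, since `__builtin_prefetch` is a compiler builtin rather than standard C. The function name `sum_with_prefetch` and the `PREFETCH_DISTANCE` value are hypothetical choices made for this example, not values taken from the text.

```c
#include <stddef.h>

/* Hypothetical tuning knob: how many elements ahead to prefetch.
   A suitable value depends on the machine and the loop body. */
#define PREFETCH_DISTANCE 16

/* Sum an array, issuing a prefetch hint for an element that will be
   read PREFETCH_DISTANCE iterations from now. */
long sum_with_prefetch(const long *data, size_t n)
{
    long total = 0;
    for (size_t i = 0; i < n; i++) {
        if (i + PREFETCH_DISTANCE < n) {
            /* Arguments: address, 0 = prefetch for read,
               3 = high temporal locality (keep in all cache levels). */
            __builtin_prefetch(&data[i + PREFETCH_DISTANCE], 0, 3);
        }
        total += data[i];
    }
    return total;
}
```

The prefetch is only a hint: the result is the same with or without it, and the processor may ignore it. For a simple sequential scan like this one, a hardware prefetcher would likely detect the pattern on its own, so an explicit hint is more plausibly useful for access patterns that hardware cannot easily predict.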