In computer science, caches are used to reduce the number of accesses to main memory and to reduce the latency associated with data retrieval. Essentially, a cache is a smaller, faster memory than main memory and is used to store copies of frequently accessed data.
An address tag is associated with each cache line. Each address tag usually includes a portion (e.g., some number of the most significant bits) of a main memory address. A request for data identifies the main memory address where the data can be found. An address tag is derived from the main memory address and compared to the address tags in the cache. If there is a match (i.e., a cache hit), then the data can be fetched from the cache. If there is not a match (i.e., a cache miss), then the data is fetched from main memory and written to one of the cache lines.
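The tag-check flow above can be sketched as a small direct-mapped cache model. The line size, index width, and the dictionary standing in for main memory are all illustrative assumptions, not part of the original description.

```python
# Sketch of a direct-mapped cache lookup. LINE_BITS, INDEX_BITS, and the
# dict used as "main memory" are assumptions made for illustration.
LINE_BITS = 4            # 16-byte cache lines
INDEX_BITS = 8           # 256 cache lines
NUM_LINES = 1 << INDEX_BITS

class DirectMappedCache:
    def __init__(self, memory):
        self.memory = memory               # stands in for main memory
        self.tags = [None] * NUM_LINES     # address tag per cache line
        self.lines = [None] * NUM_LINES    # cached data per cache line

    def read(self, address):
        index = (address >> LINE_BITS) & (NUM_LINES - 1)
        tag = address >> (LINE_BITS + INDEX_BITS)  # most significant bits
        if self.tags[index] == tag:        # tag match: cache hit
            return self.lines[index], True
        # Tag mismatch: cache miss. Fetch from main memory and fill the line.
        data = self.memory[address]
        self.tags[index] = tag
        self.lines[index] = data
        return data, False
```

The first read of an address misses and fills a line; a second read of the same address finds a matching tag and hits.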
A built-in delay (e.g., a first-in first-out buffer, or FIFO) is usually included between the stage in which the address tag is derived and the cache. In the event of a cache miss, it takes some amount of time to retrieve data from main memory and load it into one of the cache lines. Therefore, the request for data is delayed from reaching the cache until the fetch from main memory can occur, so that the data requested will be available when the request arrives at the cache.
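One way to picture the built-in delay is a cycle-by-cycle simulation in which requests traverse a FIFO whose depth matches the main-memory fetch latency, so a fill started at tag-check time completes before the request reaches the cache. The latency value, class names, and the dictionaries standing in for the cache and main memory are all illustrative assumptions.

```python
from collections import deque

MISS_LATENCY = 5  # assumed main-memory fetch latency, in cycles

class DelayedRequestPipe:
    """Requests pass through a FIFO as deep as the miss latency, so a
    fill begun when the tag is checked is in the cache by the time the
    request arrives there."""
    def __init__(self):
        self.fifo = deque([None] * MISS_LATENCY)  # the built-in delay
        self.pending_fills = []                   # (ready_cycle, address, data)
        self.cache = {}                           # stands in for the cache lines
        self.cycle = 0

    def issue(self, address, memory):
        # Tag-check stage: on a miss, start the main-memory fetch now.
        if address is not None and address not in self.cache:
            self.pending_fills.append(
                (self.cycle + MISS_LATENCY, address, memory[address]))
        self.fifo.append(address)
        return self._step()

    def _step(self):
        # Commit any fills whose fetch latency has elapsed.
        remaining = []
        for ready, address, data in self.pending_fills:
            if ready <= self.cycle:
                self.cache[address] = data        # line fill completes
            else:
                remaining.append((ready, address, data))
        self.pending_fills = remaining
        request = self.fifo.popleft()             # request exits the delay
        self.cycle += 1
        if request is None:
            return None
        return self.cache[request]                # data is present on arrival
```

Issuing a missing address and then draining the FIFO with idle cycles shows the request emerging only after the fill has landed.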
However, care should be taken not to overwrite the current data in a cache line with new data until the last reference to the current data has been dispatched (otherwise a write-before-read hazard occurs). One mechanism for avoiding the write-before-read hazard is to incorporate a second FIFO between main memory and the cache. When a request for data results in a cache miss, the second FIFO delays the writing of new data from main memory to the cache. This allows time for preceding requests to clear through the cache before any new data is introduced into the cache.
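The second FIFO can be sketched as a write-delay buffer between main memory and the cache lines: returned fill data is held for a fixed number of cycles before it is allowed to overwrite a line. The delay depth and the data structures are assumptions chosen for illustration.

```python
from collections import deque

WRITE_DELAY = 3  # assumed depth; long enough for in-flight reads to drain

class FillDelayFifo:
    """Second FIFO between main memory and the cache: fill data returned
    by a miss is held here so that earlier requests still referencing
    the line's current contents clear the cache before the overwrite."""
    def __init__(self, cache_lines):
        self.cache_lines = cache_lines   # index -> (tag, data)
        self.fifo = deque()

    def enqueue_fill(self, line_index, tag, data):
        # Data arriving from main memory does not go straight to the cache.
        self.fifo.append((WRITE_DELAY, line_index, tag, data))

    def tick(self):
        # Age each pending fill by one cycle; commit fills whose delay has
        # expired, overwriting the line only after older reads have drained.
        self.fifo = deque((d - 1, i, t, v) for d, i, t, v in self.fifo)
        while self.fifo and self.fifo[0][0] <= 0:
            _, index, tag, data = self.fifo.popleft()
            self.cache_lines[index] = (tag, data)
```

Until the delay expires, the old line contents remain readable by any preceding requests still in flight; only afterward is the line replaced.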