In a data processing system, a processor may be associated with one or more cache storage devices. These cache storage devices, together with system memory, are usually organized into hierarchies by size and/or speed to hold copies of more frequently used or more immediately required data. Such copies, when written to or modified by one or more processors, may differ from corresponding copies at other layers in the hierarchy. Therefore, it is usually necessary to maintain coherence among the various copies.
Cache storage devices are typically organized internally into, for example, lines of sequential bytes of data. When a new cache line is allocated, a line-fill from memory or from another cache storage device in the hierarchy is typically requested. In many common cases, such organization may facilitate efficient prefetching of instructions and/or data during execution of a program or process.
In a multiprocessor cache coherent system, it may generally be assumed necessary for an agent to gain exclusive ownership of a cache line before writing to and modifying that line. For example, another agent may already have a modified copy of the line in its local cache, and the portion of the line to be modified may range from a single byte up to the entire cache line. Therefore, an up-to-date copy of the line would be requested so that any partial line modifications can be merged with the most recent copy of the line. Other agents would also be notified of the change in status for the line.
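The merge of a partial line modification with the most recent copy of the line can be illustrated with a minimal sketch. The 64-byte line size and the function name `merge_partial_write` are assumptions chosen for illustration; the text above does not specify either.

```python
LINE_SIZE = 64  # assumed line size in bytes; the text does not fix one


def merge_partial_write(latest_line: bytes, offset: int, new_bytes: bytes) -> bytes:
    """Overlay new_bytes onto the most recent copy of a line (hypothetical helper)."""
    if len(latest_line) != LINE_SIZE:
        raise ValueError("latest_line must be exactly one cache line")
    if offset < 0 or offset + len(new_bytes) > LINE_SIZE:
        raise ValueError("write must fall within a single cache line")
    merged = bytearray(latest_line)
    # Only the written bytes change; every other byte comes from the latest copy.
    merged[offset:offset + len(new_bytes)] = new_bytes
    return bytes(merged)
```

When the write covers only part of the line, the untouched bytes of the result come from the up-to-date copy, which is why that copy would be requested before the modification.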
Exclusive ownership may be achieved, for example, by generating an invalidating read request for the data. Such a request has two effects. It obtains the latest copy of the line from the other caching agents or memory. It also serves to invalidate all other copies of the line, so that the line can be exclusively owned and ready for modification by the requesting agent.
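The two effects of an invalidating read request can be sketched as a small software model. This is an illustrative simplification, not a description of any particular bus protocol: the dictionary layout, the MESI-style state letters, and the function name `invalidating_read` are all assumptions, and the question of whether a modified copy is also written back to memory is left out.

```python
def invalidating_read(requester, caches, memory, addr):
    """Model an invalidating read (hypothetical helper).

    caches maps an agent name to {address: (state, data)}, where state is one
    of the assumed letters "M", "E", "S", "I". memory maps address to data.
    """
    latest = memory[addr]  # default source if no agent holds a modified copy
    for name, cache in caches.items():
        if name == requester or addr not in cache:
            continue
        state, data = cache[addr]
        if state == "M":
            latest = data  # effect 1: a modified copy supersedes memory
        cache[addr] = ("I", data)  # effect 2: every other copy is invalidated
    # The requester now holds the only valid copy and may modify it.
    caches[requester][addr] = ("E", latest)
    return latest
```

For instance, if one other agent holds the line in a modified state, the requester receives that agent's data rather than the stale copy in memory, and the other agent's copy ends up invalid.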
In certain specific applications, the portion of a cache line to be modified may most typically be an entire line. A graphics or video application that writes to a display frame buffer may be an example of such an application. When a cache line corresponding to a frame buffer memory location is allocated, data that is loaded from the frame buffer memory location to fill the cache line may be completely overwritten with new data. Similarly, when a previously modified copy of the cache line is loaded from another cache in the hierarchy, it too may be completely overwritten. In such cases, system bandwidth and power are wasted transferring unnecessary data. System performance may, therefore, suffer.
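A rough sense of the scale of this unnecessary traffic can be had from a short calculation. The figures used here (a 1920 by 1080 frame, 4 bytes per pixel, 64-byte lines) are hypothetical values chosen only for illustration, and the helper name `fill_traffic` is likewise an assumption.

```python
def fill_traffic(frame_bytes, line_size=64):
    """Return (lines, bytes) fetched by line-fills when a whole buffer is overwritten.

    Assumes every line of the buffer is allocated with a line-fill and then
    completely overwritten, so every fetched byte is discarded.
    """
    lines = -(-frame_bytes // line_size)  # ceiling division
    return lines, lines * line_size


# Hypothetical frame: 1920 x 1080 pixels at 4 bytes per pixel.
lines, wasted = fill_traffic(1920 * 1080 * 4)
```

Under these assumptions, each frame written incurs 129,600 line-fills, moving 8,294,400 bytes that are never used.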
A typical multiprocessor cache coherent system may employ a cache coherence protocol, such as MESI or MOESI, and/or snoop response signals, such as HIT to indicate whether or not an agent has a copy of the data and HITM to indicate whether or not that copy is modified. These two snoop response signals alone do not provide enough information to identify whether or not a data transfer is really warranted.
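The limitation of the two snoop response signals can be seen in a minimal sketch that derives them from MESI-style states. The mapping below is a simplified assumption rather than the signaling of any specific bus, and the label "MISS" (neither signal asserted) and both function names are introduced here only for illustration.

```python
def snoop_response(state):
    """Map one agent's assumed MESI state for a line to a snoop response."""
    if state == "M":
        return "HITM"  # the agent has a modified copy
    if state in ("E", "S"):
        return "HIT"   # the agent has an unmodified copy
    return "MISS"      # assumed label: neither HIT nor HITM asserted


def combined_response(states):
    """Combine the responses of all snooped agents; HITM takes priority."""
    responses = [snoop_response(s) for s in states]
    if "HITM" in responses:
        return "HITM"
    if "HIT" in responses:
        return "HIT"
    return "MISS"
```

Note that the combined response depends only on the states held by the snooped agents. Nothing in it reflects whether the requesting agent intends to overwrite a single byte or the entire line, so the response is identical in both cases.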