1. Technical Field
The technical field relates to data processing and in particular to the field of maintaining coherency when sharing a data store between devices.
2. Background
It is known to provide systems with multiple processors sharing at least a part of the memory space. A potential problem with such a system is associated with the need to maintain a consistent view of the shared memory space between the multiple processors, such that each processor can be confident that any data it has stored at any address will be accessible later and will not be overwritten by another processor. This potential problem can be further exacerbated by systems that have a hierarchical memory system with several memory levels. Within a multi-processor apparatus, such a system might have a main memory, a level 2 cache shared between several processors, and individual level 1 caches associated with individual processors. The level 1 caches provide fast access to data but are generally small, while the level 2 cache, although slower than the level 1 caches, provides faster access to data than the main memory does and is larger than the level 1 caches.
A drawback of these multiple memory levels is that there can be multiple versions of data items stored in the different levels of the memory system and it is important that coherency is maintained between them so that the system knows which is the most recent or valid version of the data item.
Such systems are often referred to as cache coherent systems, or SMP (Symmetric Multi-Processing) systems. There are multiple ways of arranging cache coherent systems to ensure coherency. One such arrangement is a Write Invalidate system, in which a write request from one processor invalidates all copies of the data concerned that are located in the private (or L1) cache of any other processor.
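By way of illustration only, the Write Invalidate behaviour described above can be sketched as follows. The class `PrivateCache` and the function `write_invalidate` are hypothetical names introduced for this sketch, and each cache is modelled simply as a mapping from an address to a value rather than as real cache lines.

```python
class PrivateCache:
    """Toy private (L1) cache: maps an address to the value held for it."""

    def __init__(self):
        self.lines = {}


def write_invalidate(caches, writer, addr, value):
    """Write `value` at `addr` in caches[writer] and invalidate every other copy."""
    for index, cache in enumerate(caches):
        if index != writer:
            # Drop the copy held by any other processor's private cache.
            cache.lines.pop(addr, None)
    caches[writer].lines[addr] = value


# Two processors hold the same address; processor 0 then writes to it.
caches = [PrivateCache() for _ in range(3)]
caches[0].lines[0x40] = 1
caches[1].lines[0x40] = 1
write_invalidate(caches, writer=0, addr=0x40, value=2)
```

After the call, only the writing processor's cache still holds the address, so no stale copy can be read from another private cache.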
It is also known for such caches to operate in Write Allocate mode. In such a mode, a write request targeting data not stored within one cache will force this cache to fetch a cache line including this address from the next lower memory level, and to merge the data read with the data from the write request.
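A minimal sketch of Write Allocate behaviour is given below, assuming a hypothetical line size of 8 bytes and a write that does not cross a line boundary. The names `LINE_SIZE` and `write_allocate` are illustrative assumptions, and the cache and the next memory level are modelled as dictionaries keyed by line base address.

```python
LINE_SIZE = 8  # bytes per cache line (hypothetical value for this sketch)


def write_allocate(cache, next_level, addr, data):
    """Write `data` at `addr`, fetching the enclosing line first on a miss."""
    base = addr - addr % LINE_SIZE
    offset = addr - base
    if offset + len(data) > LINE_SIZE:
        raise ValueError("sketch assumes the write stays within one line")
    if base not in cache:
        # Miss: fetch the whole line from the next lower memory level.
        cache[base] = bytearray(next_level[base])
    # Merge the written bytes into the fetched (or already present) line.
    cache[base][offset:offset + len(data)] = data
    return cache[base]


next_level = {0: bytes(range(8))}
cache = {}
line = write_allocate(cache, next_level, 2, b"\xff\xff")
```

In this sketch the line fetch is what makes the merge possible: the bytes not covered by the write keep the values read from the next level.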
In order to make the best use of the cache storage facilities, it is also known to operate a level 2 cache and one or more level 1 caches in exclusive mode. In this mode, cache control logic operates to ensure that each level 1 cache shares as little data as possible with the level 2 cache. The aim of this is to increase the amount of data stored in the caches by reducing duplication. Such a mode is common in multi-processor systems, where the amount of data stored by the multiple level 1 caches is considerable when compared to the size of the level 2 cache.
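One possible way of obtaining exclusive behaviour is sketched below. It is an assumption made for illustration rather than a policy stated above: a line moved into a level 1 cache is removed from the level 2 cache, and a line evicted from a level 1 cache is placed in the level 2 cache. The function names `fill_l1` and `evict_l1` are hypothetical.

```python
def fill_l1(l1, l2, memory, addr):
    """Bring `addr` into L1; if L2 held it, move it rather than copy it."""
    if addr in l2:
        l1[addr] = l2.pop(addr)   # the line leaves L2, so it is not duplicated
    else:
        l1[addr] = memory[addr]   # fetched from memory without allocating in L2
    return l1[addr]


def evict_l1(l1, l2, addr):
    """Evict `addr` from L1 into L2, which acts as a store for evicted lines."""
    l2[addr] = l1.pop(addr)


memory = {0x100: 7}
l1, l2 = {}, {0x100: 7}
fill_l1(l1, l2, memory, 0x100)
held_only_in_l1 = 0x100 in l1 and 0x100 not in l2
evict_l1(l1, l2, 0x100)
held_only_in_l2 = 0x100 in l2 and 0x100 not in l1
```

Under this assumed policy a line is held in at most one of the two levels at any time, which is the reduction in duplication the exclusive mode aims for.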
It is often desirable that some other devices of the system, although they themselves have no caches, are able to access the shared storage in a coherent manner. Such other devices include, for example, DMA controllers, hardware co-processors such as crypto-processors, Digital Signal Processors and video engines. Although these devices do not cache any data, it is important that a read from these devices to the shared storage returns the latest value, wherever the data is currently stored. This may be in the memory, in the L2 cache or in one of the private (or L1) caches of one of the processors. It is also important that a write from such devices to the shared memory does not affect the coherency of the system. This implies that such devices must be able to make coherent requests, although they do not receive any. Such devices are often connected to the system by what is referred to as a Coherent I/O. It should be noted that while coherent I/Os are discussed here in the field of multiple processor systems, this feature can also be used in a single processor system where the cache of the processor is to be kept coherent with the main memory with regard to accesses made by these other devices.
On receipt of a read request from one of these devices, the data processing apparatus conventionally performs a lookup in all of the private caches where the data can be stored, to see if it is stored there. If the data is found in one of the caches, it will be transmitted to the device requesting the read, as it is believed to be the most recent version of the data. If it is not stored within any private caches, the read request will be transmitted to progressively lower levels of storage until it is found and the value will then be returned to the device.
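The conventional read handling described above can be sketched as follows. The function `coherent_read` is a hypothetical name, and each storage level is modelled as a dictionary from address to value; the lower levels are passed in order, for example the level 2 cache followed by the memory.

```python
def coherent_read(addr, private_caches, lower_levels):
    """Return the value for `addr`, checking the private caches first."""
    for cache in private_caches:
        if addr in cache:
            # A private copy is taken to be the most recent version.
            return cache[addr]
    for level in lower_levels:
        # Not held privately: try progressively lower levels of storage.
        if addr in level:
            return level[addr]
    raise KeyError(addr)


l1_a, l1_b = {0x10: 5}, {}
l2 = {0x10: 4, 0x20: 8}
memory = {0x10: 3, 0x20: 8, 0x30: 1}

from_private = coherent_read(0x10, [l1_a, l1_b], [l2, memory])
from_l2 = coherent_read(0x20, [l1_a, l1_b], [l2, memory])
from_memory = coherent_read(0x30, [l1_a, l1_b], [l2, memory])
```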
On a write request from such a device, multiple cases must be considered. In the case that data located at the address of the write request is not stored in any of the private caches, the write request is simply forwarded to the next level of storage. In the case that data located at the address targeted by the write request is present in at least one of the private caches, two cases can occur:
    The data is marked in the cache as being dirty, in other words the copy of the data held in a lower level of storage is not current. In this case the cache line is evicted to the subsequent level of storage prior to performing the write request. It is understood that depending on the size of the write request, multiple cache lines, in different private caches, may be evicted.
    The data is marked as being clean, which indicates that the value located at this address is consistent with the view of the lower storage level. The action taken in such a case is to invalidate the entry in any private caches containing this address, prior to performing the write to the shared level 2 cache. It must be noted that depending on the sizes of both the write request and the cache lines, multiple cache lines in different private caches may get invalidated.
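The conventional write handling set out in the cases above can be sketched as follows. This is a simplified illustration: the names `Line` and `coherent_write` are hypothetical, each line holds a single value, and the write is assumed to cover exactly one line, so no merging of partial lines is modelled.

```python
from dataclasses import dataclass


@dataclass
class Line:
    value: int
    dirty: bool = False


def coherent_write(addr, value, private_caches, l2):
    """Apply a write from a cacheless device; return the actions taken."""
    actions = []
    for cache in private_caches:
        line = cache.pop(addr, None)
        if line is None:
            continue                  # this private cache does not hold the address
        if line.dirty:
            l2[addr] = line.value     # evict the dirty line to the next level first
            actions.append("evict")
        else:
            actions.append("invalidate")  # clean copy: simply drop it
    l2[addr] = value                  # then perform the write at the next level
    return actions


l1_a = {0x10: Line(5, dirty=True)}
l1_b = {0x20: Line(9)}
l2 = {}
dirty_case = coherent_write(0x10, 6, [l1_a, l1_b], l2)
clean_case = coherent_write(0x20, 1, [l1_a, l1_b], l2)
miss_case = coherent_write(0x30, 3, [l1_a, l1_b], l2)
```

An empty action list corresponds to the first case, in which the request is simply forwarded to the next level of storage.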
A drawback of the latter case is that, particularly if the system is operating in write allocate mode, the data, having been present in a level 1 cache, is unlikely also to be present in the level 2 cache where the caches are operated in exclusive mode. This means that the line must be fetched from memory before the write can be performed, which consumes time and power. The technology described in this application addresses these problems.
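The cost described above can be made concrete with a small sketch, assuming a hypothetical four-word line, a level 2 cache operating in write allocate mode, and exclusive operation so that the line held clean in the level 1 cache is absent from the level 2 cache. The function `l2_write_allocate` and the `stats` counter are illustrative names only.

```python
def l2_write_allocate(l2, memory, addr, word, value, stats):
    """Write one word of a line into L2, fetching the line on a miss."""
    if addr not in l2:
        stats["memory_fetches"] += 1      # the access that costs time and power
        l2[addr] = list(memory[addr])
    l2[addr][word] = value


memory = {0x80: [1, 2, 3, 4]}
l1 = {0x80: [1, 2, 3, 4]}   # clean copy held in a private cache
l2 = {}                     # exclusive operation: the line is not in L2
stats = {"memory_fetches": 0}

l1.pop(0x80)                # conventional handling: invalidate the clean copy
l2_write_allocate(l2, memory, 0x80, word=1, value=9, stats=stats)
```

In this sketch the data discarded from the level 1 cache is identical to what is then read back from memory, which is the redundant work referred to above.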