This application claims priority to GB Application No. 1104958.2 filed Mar. 24, 2011, the entire contents of which are incorporated herein by reference.
1. Field of the Invention
The present invention relates to the field of data processing and in particular to the scheduling of tasks to be executed by non-coherent devices.
2. Description of the Prior Art
Systems in which a plurality of devices interact with each other and with a memory system are known. Where the different devices have their own local storage for data, problems in maintaining data coherency between the different local storage locations and memory may be encountered. Generally, devices address this problem by performing consistency operations, such as cache maintenance operations, the use of barriers and the flushing of local storage, at certain points in execution where it is important that data is coherent with at least a portion of the rest of the system. Particular problems can be encountered with devices such as graphics processing units (GPUs) that have long pipelines. Making the GPU wait to execute a task while consistency operations are performed introduces bubbles into the pipeline, which is a significant disadvantage in deep pipelines.
Devices of the prior art have taken one of two approaches. Some have performed the consistency operations immediately, which has low latency but generally results in multiple updates and therefore has a bandwidth cost associated with it. Others have waited and performed the consistency operations together in a batch process, which avoids unnecessary multiple updates but increases latency.
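The trade-off between the two prior art approaches can be illustrated with a minimal sketch. It is a toy model for illustration only and does not represent any particular device. The function names, the representation of local storage and memory as dictionaries, and the use of a step count as a stand-in for latency are all assumptions introduced here.

```python
# Toy model (assumption): a "write" is an (address, value) pair produced by a
# device; "memory" is the shared copy that other devices can observe.

def immediate(writes):
    """Perform a consistency operation after every write."""
    memory = {}
    writebacks = 0          # updates sent to memory (bandwidth cost)
    first_visible = None    # step at which memory first reflects a write
    for step, (addr, value) in enumerate(writes, start=1):
        memory[addr] = value
        writebacks += 1
        if first_visible is None:
            first_visible = step
    return memory, writebacks, first_visible


def batched(writes):
    """Hold writes in local storage and flush once at the end."""
    local = {}
    for addr, value in writes:
        local[addr] = value             # repeated writes overwrite locally
    memory = dict(local)                # single flush of each dirty address
    first_visible = len(writes) if writes else None
    return memory, len(local), first_visible


# Hypothetical write sequence in which address 0x10 is updated three times.
WRITES = [(0x10, 1), (0x10, 2), (0x20, 3), (0x10, 4)]

if __name__ == "__main__":
    print("immediate:", immediate(WRITES)[1:])
    print("batched:  ", batched(WRITES)[1:])
```

In this sketch both approaches leave memory in the same final state. The immediate approach issues four updates, the first of which is visible after step 1, whereas the batched approach issues only two updates, neither of which is visible until step 4.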
It would be desirable to reduce the latency associated with such consistency operations without unduly affecting the bandwidth.