In certain applications, software executing on one processing component may be required to access data that is updated by software executing on another processing component. FIG. 1 illustrates a simplified block diagram of a conventional processing system 100 whereby software executing on a first processing component consisting of a central processing unit (CPU) 110 is required to access data within external memory 120 that is updated or otherwise modified by software executing on a second processing component consisting of an integrated processing core of a communications hardware module 130.
A problem occurs when multiple threads executing on the CPU 110 are required to concurrently access the data within the external memory 120, or if a single thread executing on the CPU 110 is required to perform, for example, a read-modify-write operation on the data. Because such scenarios require multiple, temporally dislocated accesses of the data by the CPU 110, there is a possibility that the data may be updated or otherwise modified by the second processing component 130 between the accesses of the data by the CPU 110.
For example, the CPU 110 issues a read-modify-write transaction to sample and clear a counter value 125 in external memory 120. Such a counter value 125 might represent a number of packets received, which correlates directly with performance or bandwidth used. The CPU 110 is thus arranged to read the counter value 125 at the end of a user's service or at the end of a measurement over a specific period of time. After the counter value 125 has been read, it is reset so that it is ready for use when the next user's service starts or when the next measurement period begins.
In response to the read-modify-write transaction being issued by the CPU 110, the counter value 125 is read into a register 115 of the CPU 110. Meanwhile, the integrated core 130 on the communications hardware reads the counter value 125, following which a write transaction from the CPU 110 clears (resets) the counter value 125. The integrated core 130, unaware of the counter value 125 being cleared by the CPU 110, increments the original counter value 125 that it previously read and writes it back to external memory 120. As a result, the counter value 125 within external memory 120 is no longer valid and has become indeterminate.
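The interleaving described above may be replayed deterministically as a short sketch. This is an illustration only and not part of the system 100: the function name replay_lost_update, the dictionary standing in for external memory 120, and the starting value of 41 are all assumptions chosen for the example.

```python
# Illustrative replay of the interleaving described above.
# All names and values here are hypothetical stand-ins, not part of system 100.

def replay_lost_update(initial_count):
    # Stands in for the counter value 125 held in external memory 120.
    external_memory = {"counter": initial_count}

    # Step 1: the CPU 110 reads the counter into its register 115
    # (the "read" half of the read-modify-write transaction).
    cpu_register = external_memory["counter"]

    # Step 2: the integrated core 130 reads the same counter before
    # the CPU's write has reached external memory.
    core_local_copy = external_memory["counter"]

    # Step 3: the CPU's write clears (resets) the counter.
    external_memory["counter"] = 0

    # Step 4: the integrated core, unaware of the clear, increments its
    # stale copy and writes it back, overwriting the reset.
    external_memory["counter"] = core_local_copy + 1

    return cpu_register, external_memory["counter"]


sampled, stored = replay_lost_update(41)
print(sampled, stored)
```

In this replay the CPU samples the starting value, but the value left in memory is the starting value plus one rather than one, so the packets already reported would be counted again in the next measurement period. A different ordering of the same four accesses would leave a different value, which is why the counter cannot be relied upon without synchronization.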
In order to avoid such situations occurring, synchronization is required between the two processing components 110, 130 to ensure the data within the external memory remains coherent, deterministic and uncorrupted.
Conventional systems rely on cache algorithms and features to guarantee coherency of data accessible by multiple processing components. Such systems use schemes that each processing component must be aware of and adhere to, and as a consequence such systems are generally limited to being homogeneous in nature, with all processing components containing the same hardware for coherency and using the same bus and associated signals, snooping mechanisms, etc.
However, conventional approaches that rely on cache algorithms and features would be prohibitively expensive and complex to implement in heterogeneous processor systems, such as the system 100 illustrated in FIG. 1.