1. Field
This disclosure relates generally to data processing systems, and more specifically, to processing decorated instructions with cache bypass.
2. Related Art
In a multiple processor or multiple core data processing system that implements a network, multiple counters are used to maintain statistics, which requires a variety of functions such as increment, decrement and read-modify-write operations. Because multiple cores may attempt to update the same counter at the same time, network delays are created and a significant amount of resources is consumed. A single communication link can generate a need for up to a couple hundred million counter updates per second, where each update modifies a prior data value. A mechanism for performing atomic updates, i.e., un-interruptible successive updates, is typically required. Conventional atomic update mechanisms, such as a software semaphore or a software lock, can cause system delays. To reduce system delays, a statistics accelerator may be used. However, enough information typically cannot be sent to a statistics accelerator in a single transaction to describe an atomic operation.
Because counter bit sizes can be larger than the size of the registers within a processor, a lock variable has also been used to limit access to a counter while multiple storage accesses update sub-sections of the counter. When a core needs to gain ownership of a counter for an atomic update, a significant number of data processing cycles may be spent on each lock variable. A processor must use processing cycles to obtain the lock variable, wait for the lock variable to be released if it is already taken by another processor, perform the counter update, and release the lock variable. Thus, system speed and performance are degraded.
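The conventional lock-variable approach described above can be illustrated with a minimal software sketch. It is not taken from this disclosure. The 32-bit register width, the 64-bit counter split into two words, and the names LockedWideCounter and run_workers are all assumptions chosen for illustration, and Python threads stand in for processor cores.

```python
import threading

WORD_BITS = 32                      # assumed register width (illustrative)
WORD_MASK = (1 << WORD_BITS) - 1


class LockedWideCounter:
    """Hypothetical 64-bit counter kept as two 32-bit words behind a lock."""

    def __init__(self):
        self.lock = threading.Lock()    # plays the role of the lock variable
        self.low = 0                    # lower sub-section of the counter
        self.high = 0                   # upper sub-section of the counter

    def add(self, amount):
        """Add a non-negative amount; the counter wraps at 64 bits."""
        # Obtain the lock variable; this blocks while another thread holds it.
        with self.lock:
            total = self.low + amount
            self.low = total & WORD_MASK                   # first storage access
            carry = total >> WORD_BITS
            self.high = (self.high + carry) & WORD_MASK    # second storage access
        # Leaving the with-block releases the lock variable.

    def read(self):
        """Return the full counter value, read under the lock."""
        with self.lock:
            return (self.high << WORD_BITS) | self.low


def run_workers(counter, workers, updates_per_worker, amount):
    """Update one shared counter from several threads at the same time."""
    def work():
        for _ in range(updates_per_worker):
            counter.add(amount)

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
```

Every call to add pays for obtaining the lock, possibly waiting on another thread, two separate word updates, and releasing the lock. This is the per-update overhead that the passage above identifies as degrading system speed and performance.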