Modern semiconductor devices have a large programmable space that is used to enable features and provide debugging. This programmable space is programmed by software running on a combination of on-chip and off-chip processors. The programmable space is divided into subsections based on functionality. The subsections are further partitioned into logical register definitions that correspond to the feature set of each subsection. The lowest level of the programmable space is dictated by logical features and can have variable data widths. The width of the processor interface to the programmable registers, in contrast, is dictated by the specifications of the processor and of the interconnect technology used between the processor and the programmable space.
Due to recent trends such as multi-threading and virtualization, there can be multiple concurrent accesses to the programmable space. These accesses may or may not come from the same physical processor, which is problematic when a wider end register is accessed simultaneously by multiple sources in a piecemeal manner. For example, if two threads are each trying to access a 200-bit end register through multiple 32-bit transactions, some mechanism and check are needed to ensure that all 200 bits are updated from the same original source.
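The hazard described above can be illustrated with a minimal Python sketch. It models the 200-bit end register as seven 32-bit words and replays a fixed interleaving of two sources. The names split_words, join_words, and run_schedule, the two written values, and the particular schedule are assumptions chosen for illustration only; they are not taken from any actual device.

```python
WORD_BITS = 32                           # width of one processor transaction
REG_BITS = 200                           # width of the end register
NUM_WORDS = -(-REG_BITS // WORD_BITS)    # ceiling division: 7 transactions


def split_words(value):
    """Split a REG_BITS-wide value into 32-bit words, least significant first."""
    mask = (1 << WORD_BITS) - 1
    return [(value >> (i * WORD_BITS)) & mask for i in range(NUM_WORDS)]


def join_words(words):
    """Reassemble the end register value from its 32-bit words."""
    value = 0
    for i, word in enumerate(words):
        value |= word << (i * WORD_BITS)
    return value & ((1 << REG_BITS) - 1)


def run_schedule(value_a, value_b, schedule):
    """Replay word writes from sources 'A' and 'B' in the given order."""
    register = [0] * NUM_WORDS
    pending = {
        "A": list(enumerate(split_words(value_a))),
        "B": list(enumerate(split_words(value_b))),
    }
    for source in schedule:
        index, word = pending[source].pop(0)
        register[index] = word           # the last writer of each word wins
    return join_words(register)


value_a = (1 << REG_BITS) - 1            # source A intends to write all ones
value_b = 0                              # source B intends to write all zeros
# A writes words 0-3, B then writes all seven words, A finishes words 4-6.
torn = run_schedule(value_a, value_b, "AAAA" + "BBBBBBB" + "AAA")
print(hex(torn))
```

With this schedule the low four words end up holding the value from source B and the high three words the value from source A, so the register contains a value that neither source intended to write.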
Semaphores are used between the sources to ensure that only one source is accessing the end registers at any time. Current implementations use one of two techniques to ensure atomicity. The atomicity can be implemented purely in software in some shared memory, or it can be hardware assisted by use of a register that ensures atomicity across a read-modify-write (RMW) access.
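A hardware-assisted semaphore of this kind might behave as in the following sketch. The SemaphoreRegister class and its test_and_set and release methods are hypothetical names, and a threading.Lock merely stands in for the atomicity that real hardware would provide across the RMW access; an actual device would expose its own register interface.

```python
import threading
import time


class SemaphoreRegister:
    """Hypothetical semaphore register with an atomic read-modify-write."""

    def __init__(self):
        self._bit = 0
        # Stand-in for hardware atomicity; not part of any real device API.
        self._hw_atomic = threading.Lock()

    def test_and_set(self):
        """Return the old bit and set it to 1 as one indivisible step."""
        with self._hw_atomic:
            old = self._bit
            self._bit = 1
            return old

    def release(self):
        """Clear the bit so that another source can acquire the semaphore."""
        with self._hw_atomic:
            self._bit = 0


def write_wide_register(semaphore, register, words):
    """Write every word of a wide end register while holding the semaphore."""
    while semaphore.test_and_set() == 1:
        time.sleep(0)                    # semaphore is held elsewhere; retry
    try:
        for index, word in enumerate(words):
            register[index] = word
    finally:
        semaphore.release()


semaphore = SemaphoreRegister()
end_register = [0] * 7                   # seven 32-bit words of a 200-bit register
words_a = [0xAAAAAAAA] * 6 + [0xAA]      # arbitrary pattern from one source
words_b = [0x55555555] * 6 + [0x55]      # arbitrary pattern from another source

threads = [
    threading.Thread(target=write_wide_register,
                     args=(semaphore, end_register, words))
    for words in (words_a, words_b)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
```

Because each source holds the semaphore for its whole sequence of word writes, end_register finishes holding either words_a or words_b in full, never a mixture of the two.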
One limitation of the semaphore mechanism is that it limits performance by restricting how many sources can simultaneously access the end registers. Acquiring the semaphore adds overhead to each register access, and this overhead is incurred even when two sources are not accessing the same end register. For example, assume two sources, namely SOURCE 0 and SOURCE 1, are trying to access two 64-bit registers, namely X and Y. FIG. 1 illustrates an exemplary sequence of operations 100 using the semaphore mechanism, which demonstrates the unnecessary overhead and the resulting loss of performance.
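The cost can be made concrete with a rough transaction count. The sketch below is not a reproduction of FIG. 1. It assumes a 32-bit processor interface, a single semaphore guarding all end registers, one transaction each to acquire and release that semaphore, and that SOURCE 0 writes X while SOURCE 1 writes Y. The function name semaphore_sequence is hypothetical.

```python
WORD_BITS = 32  # assumed width of the processor interface


def semaphore_sequence(accesses):
    """List the serialized transactions when one semaphore guards all registers.

    Each access is a (source, register, width_bits) tuple.
    """
    operations = []
    for source, register, width_bits in accesses:
        operations.append((source, "acquire semaphore"))
        num_words = -(-width_bits // WORD_BITS)   # ceiling division
        for index in range(num_words):
            operations.append((source, f"write {register}[{index}]"))
        operations.append((source, "release semaphore"))
    return operations


# Assumed scenario: the two sources touch different 64-bit registers.
accesses = [("SOURCE 0", "X", 64), ("SOURCE 1", "Y", 64)]
operations = semaphore_sequence(accesses)
overhead = [op for op in operations if "semaphore" in op[1]]

for source, operation in operations:
    print(f"{source}: {operation}")
print(f"{len(overhead)} of {len(operations)} transactions are semaphore overhead")
```

Under these assumptions, four of the eight transactions exist only to manage the semaphore, and the writes from SOURCE 1 wait behind those from SOURCE 0 even though the two sources never touch the same register.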
In some applications, the two sources may not share the same software code base, which makes implementing the same semaphore logic on both even harder. One example of this scenario is when customer software is running on an external processor while debugging software is running on an embedded processor.