1. Field of the Invention
The invention relates generally to the field of computer systems. More particularly, the invention relates to computer systems containing a multiplicity of cooperating CPUs with resources shared between the CPUs, wherein more than one CPU may attempt to access the shared resources simultaneously, thereby leading to possible contention between CPUs.
2. Discussion of the Related Art
Computer systems containing a plurality of CPUs have, of necessity, included methods for synchronizing accesses to resources shared between two or more of the CPUs. These synchronizing methods are used to ensure that any such shared resources are always left in a coherent state after any CPU is finished utilizing them. In many cases, there may be multiple levels of synchronizing methods; e.g., a simple method such as a spinlock can ensure that only one CPU inside an SMP is accessing a more elaborate method, such as a mutex, at any one time. These synchronizing methods can be very costly in terms of CPU cycles, thus potentially reducing overall system performance.
As an example, consider the case of a spinlock. When using a spinlock, a thread of execution must first make itself non-interruptible; this is a non-trivial operation on many modern operating systems. Once the thread is non-interruptible, it can attempt to acquire the spinlock, which will take an indeterminate amount of time depending on how many other CPUs are simultaneously requesting the spinlock. During this acquisition period, no other processing can occur on any of the CPUs waiting for the spinlock to become available. Once a CPU has acquired the spinlock, it can access the resource the spinlock is protecting, then release the spinlock and exit the non-interruptible code segment. The other CPUs continue to wait on the spinlock until it becomes available. Each, in turn, repeats the resource-consumptive cycle of wait, acquire, access, and release.
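The wait, acquire, access, and release cycle described above can be illustrated with a minimal sketch. It assumes Python threads as stand-ins for CPUs and uses a non-blocking `threading.Lock.acquire` call in place of an atomic test-and-set instruction; the names `SpinLock` and `run_spinlock_demo` are hypothetical, and the step of making the thread non-interruptible has no user-level equivalent and is not modeled.

```python
import threading

class SpinLock:
    """Busy-wait lock. A non-blocking Lock.acquire stands in for an
    atomic test-and-set instruction (an assumption of this sketch)."""

    def __init__(self):
        self._flag = threading.Lock()

    def acquire(self):
        # Wait: retry until the flag is obtained; the waiter does no useful work.
        while not self._flag.acquire(blocking=False):
            pass

    def release(self):
        self._flag.release()

def run_spinlock_demo(n_threads=4, iterations=200):
    """Each thread repeats the wait/acquire/access/release cycle."""
    lock = SpinLock()
    shared = {"counter": 0}  # the resource the spinlock protects

    def worker():
        for _ in range(iterations):
            lock.acquire()            # wait + acquire
            shared["counter"] += 1    # access
            lock.release()            # release

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared["counter"]
```

Every waiting thread burns cycles in the `while` loop, which is the cost the passage attributes to spinlocks.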
A potentially more efficient use of spinlocks in a multiple CPU environment is obvious to one skilled in the art. The requesting thread can determine, after a certain number of attempts, that the spinlock is in use and is likely to be in use for an indeterminate period. The thread can then use a local operating system synchronizing method, such as a timer-driven semaphore, to wake and periodically retry the acquisition of the spinlock. This allows other threads to run on the CPU during the wait period. The disadvantage this introduces is that the likelihood of a timely acquisition of the spinlock is reduced, and the latency until acquisition is greatly increased, depending on the granularity of the wait mechanism.
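The spin-then-sleep variant can be sketched as follows. This is an illustration only: `time.sleep` stands in for the timer-driven semaphore, and the function names and the parameters `max_spins` and `sleep_s` are hypothetical choices rather than values taken from any particular system.

```python
import threading
import time

def acquire_with_backoff(flag, max_spins=100, sleep_s=0.001):
    """Spin a bounded number of times, then sleep between retries.

    Returns the number of timed sleeps that were needed, which indicates
    how much latency the wait mechanism added.
    """
    sleeps = 0
    while True:
        for _ in range(max_spins):
            if flag.acquire(blocking=False):  # one acquisition attempt
                return sleeps
        # The lock appears busy: yield the CPU so other threads can run.
        time.sleep(sleep_s)
        sleeps += 1

def demo_backoff():
    """Hold the lock for about 20 ms elsewhere, then measure the waiter."""
    flag = threading.Lock()
    flag.acquire()
    timer = threading.Timer(0.02, flag.release)  # releases from another thread
    timer.start()
    sleeps = acquire_with_backoff(flag, max_spins=10, sleep_s=0.005)
    timer.join()
    return sleeps
```

A larger `sleep_s` frees more CPU time for other threads but lengthens the delay between the release and the next retry, which is the latency trade-off described above.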
Some computer systems containing a multiplicity of cooperating CPUs have dedicated hardware designed purposely to aid in the cooperation of the CPUs. This dedicated hardware takes many forms, such as RAM accessible from multiple CPUs, unique hardware to perform common functions applicable to multiple CPUs, and other unique hardware to aid in synchronizing multiple CPUs. Any or all of these hardware functions could appear in a given design, and the implementation could be performed in any combination of fixed function devices such as ASICs, programmable devices such as FPGAs, or even one or more additional CPUs. Hardware such as this, which is accessible from more than one CPU, will be referred to as a shared resource.
Each CPU will have one or more methods for accessing these shared resources. In many systems, even though the CPUs may have instructions designed to help manage shared resources, access to the shared resource is ultimately achieved when the CPU executes a memory or I/O access instruction. This memory or I/O access instruction will typically be directed at a device or set of devices designed to connect to, and accept memory or I/O instructions from, a multiplicity of CPUs. This device, or set of devices, may connect to the CPUs through one or more communication channels. These communication channels consist of the data and/or control signals and supporting devices that eventually connect a given CPU to the shared resources. Each of these communication channels may support a plurality of CPUs.
A problem with this technology has been contention between CPUs for access to a shared resource. This problem can be alleviated through methods that may be obvious to one skilled in the art; however, a simple solution to the problem of contention between CPUs gives rise to a second problem of increased latency.
In a computing system containing a multiplicity of CPUs, there is a requirement to provide a mutual exclusion mechanism for regulating access to shared resources spanning the CPUs. Additionally, each CPU has its own operating system thread synchronization methods and requirements for management of system-local resources. What is needed is a solution that combines these two requirements in a low-latency design with minimum overhead expense.
There is a need for the following embodiments. Of course, the invention is not limited to these embodiments.
According to a first aspect of the invention, a method comprises: restricting access to a protected shared resource by use of a lock; issuing the lock to a requesting software to permit access to the protected shared resource; indicating the issuance of the lock to the requesting software by writing a first value to a lock register; freeing the lock, thereby making the lock available for use by another requesting software, after the requesting software completes accessing the protected shared resource; and indicating that the lock is free by writing a second value to the lock register.
According to another aspect of the invention, a method comprises: receiving a request from a requesting software to access a protected shared resource, the protected shared resource currently being accessed by another requesting software; designating a proxy value to represent the requesting software; adding the proxy value to a queue of proxy values contained in a lock register; suspending execution of the requesting software; determining when the protected shared resource is no longer being accessed by the another requesting software; if the proxy value representing the requesting software is first in the queue of proxy values, resuming execution of the requesting software using the proxy value; if the proxy value representing the requesting software is first in the queue of proxy values, allowing the requesting software to access the protected shared resource using the proxy value; if the proxy value representing the requesting software is first in the queue of proxy values, upon completion of access to the protected shared resource by the requesting software, removing the proxy value representing the requesting software from the queue of proxy values; and if there is no next proxy value in the queue of proxy values, writing a release value to the lock register.
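The queue-of-proxy-values behavior recited above can be modeled in a few lines. This is a software sketch under stated assumptions, not the disclosed hardware: the class `LockRegister`, its method names, and the choice of `0` as the release value are all hypothetical, since the text does not specify how the first value, second value, or proxy values are encoded.

```python
from collections import deque

FREE = 0  # hypothetical encoding of the "release value" / "second value"

class LockRegister:
    """Toy model of a lock register holding a queue of proxy values.
    The proxy at the head of the queue represents the current holder."""

    def __init__(self):
        self._queue = deque()

    def read(self):
        # Holder's proxy (the "first value") while locked, FREE otherwise.
        return self._queue[0] if self._queue else FREE

    def request(self, proxy):
        """Enqueue a proxy. True means the lock was issued immediately;
        False means the requester should suspend until resumed."""
        if proxy == FREE:
            raise ValueError("proxy must differ from the release value")
        self._queue.append(proxy)
        return self._queue[0] == proxy

    def release(self, proxy):
        """Remove the holder's proxy. Returns the next proxy to resume,
        or FREE when no next proxy value is queued."""
        if not self._queue or self._queue[0] != proxy:
            raise ValueError("only the current holder may release")
        self._queue.popleft()
        return self.read()
```

For example, after `request(1)` returns `True` and `request(2)` returns `False`, a call to `release(1)` returns `2`, indicating that the software represented by proxy 2 should be resumed; `release(2)` then returns `FREE`.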
According to another aspect of the invention, a method comprises: receiving a request from a requesting software to access a protected shared resource, the protected shared resource not being used by another requesting software; acquiring a lock from a lock register to permit the requesting software to access the protected shared resource; accessing the protected shared resource; and upon completion of access to the protected shared resource by the requesting software, writing a release value to the lock register.
According to another aspect of the invention, an apparatus comprises: a shared resource logic control including a lock and a lock register containing a queue of proxy values; a first requesting central processing unit coupled to the shared resource logic control; a second requesting central processing unit coupled to the shared resource logic control; and a protected shared resource, coupled to the shared resource logic control.
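The arrangement of two requesting CPUs coupled to a shared resource logic control can be approximated in software as a rough illustration. In this sketch, threads stand in for the CPUs, a `threading.Event` stands in for whatever mechanism suspends and resumes the requesting software, and an internal guard lock models the atomicity of a hardware register access. All names (`SharedResourceLogic`, `run_two_cpus`, `FREE`) are hypothetical.

```python
import threading
from collections import deque

FREE = 0  # hypothetical release value

class SharedResourceLogic:
    """Software stand-in for a shared resource logic control."""

    def __init__(self):
        self._guard = threading.Lock()  # models atomic register access
        self._queue = deque()           # (proxy, wake_event); head holds the lock
        self.register = FREE

    def acquire(self, proxy):
        wake = threading.Event()
        with self._guard:
            self._queue.append((proxy, wake))
            if len(self._queue) == 1:
                self.register = proxy   # uncontended: lock issued at once
                return
        wake.wait()                     # contended: suspended until resumed

    def release(self, proxy):
        with self._guard:
            if not self._queue or self._queue[0][0] != proxy:
                raise RuntimeError("only the current holder may release")
            self._queue.popleft()
            if self._queue:
                next_proxy, wake = self._queue[0]
                self.register = next_proxy
                wake.set()              # resume the next requester
            else:
                self.register = FREE    # no next proxy: write release value

def run_two_cpus(iterations=200):
    """Two 'CPUs' (threads) increment one protected shared resource."""
    logic = SharedResourceLogic()
    resource = {"value": 0}

    def cpu(proxy):
        for _ in range(iterations):
            logic.acquire(proxy)
            resource["value"] += 1      # access the protected shared resource
            logic.release(proxy)

    threads = [threading.Thread(target=cpu, args=(p,)) for p in (1, 2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return resource["value"], logic.register
```

In the uncontended case `acquire` returns immediately and `release` writes the release value; in the contended case the waiter sleeps rather than spins and is resumed directly by the releasing side.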