The present invention relates to a computer bus structure for performing synchronization in a multiprocessor system with shared resources. More particularly, the present invention relates to a logic interface that supports atomic read-modify-write operations as well as normal read/write access to the shared memory of a multiprocessor environment using the redundant addressing of memory so as to select the desired function.
Process synchronization refers to the coordination of the concurrent activities of two or more processors of an embedded processor system. The processors involved in synchronization function with logic indicating the state of the process and cooperatively coordinate their actions by waiting on a condition that is capable of being set by any one of the competing processors. In the present invention, the condition is a variable that signifies the availability of a shared resource.
A specific problem of synchronization is mutual exclusion, which the present invention solves by preventing two or more concurrent activities from simultaneously accessing a shared resource. The shared resource may be memory that is shared among a set of processors, where the instructions that access the shared memory form a critical region. The critical region refers to a series of processor operations that must be executed in their entirety without interruption by another processor. The concern is that two or more processors with equal and unrestricted access to a memory, or a shared resource more generally, could result in one processor overwriting, modifying, or otherwise interfering with the data stored by another processor. The same problem arises on a single processor, where multiple processes may also compete for a resource.
The present solution to the mutual exclusion problem guarantees that once a processor of a multi-processor system has asserted control over the resource, any processor which subsequently requests access is denied simultaneous access if such use is inconsistent with the first processor's use of the resource. For example, where a first processor is modifying data in shared memory, every other processor must be denied concurrent access. In the alternative, limited concurrent access may be permitted, for example, where multiple processors require the resource for purposes of reading a memory location, such use being a consistent use.
A common approach to performing mutual exclusion is to use special instructions provided by the processor hardware called hardware primitives. Hardware primitives are the basic building blocks used to form a variety of user-level synchronization operations. The fundamental requirement of a primitive is that it must provide a processor with the capability to atomically read and modify a memory location. An instruction is atomic if its constituent operations are indivisible, such that no other instruction can execute between them. Indivisible operations prevent multiple processors from concurrently gaining access to a shared resource intended for exclusive use by only one processor at a time. Thus, atomic instructions by a first processor prevent another processor from changing the memory in the interim between the steps of a read and modify operation in which the first processor takes possession of the resource.
There are a number of read-modify-write primitives that can be used to implement synchronization. One such instruction is "test-and-set", which uses a mutual exclusion variable, commonly called a mutex or lock, to signify the availability of the shared resource to the processor. The mutex variable acts as a flag and coordinates multiple processors that may attempt to concurrently access the resource. A mutex variable is created for each resource where access must be restricted.
The test-and-set instruction consists of a read operation immediately followed by a write operation in which a lock variable is modified contingent upon the value initially read. The "test" of the mutex value is initiated by means of a read operation generated by a processor. The mutex is then compared against a predetermined value which, by convention, indicates the status of the resource. If the test of the lock indicates that the resource is free, a "set" operation is executed by means of a write instruction by which the lock variable is modified to indicate the updated status of the resource. The operation, for example, could test for "0" and then set the mutex to "1" in order to lock out other processors from accessing the resource. In this manner a processor asserts possession of a resource. A second processor attempting to gain access to the resource would therefore have logical notice that the resource is unavailable, thereby locking out the second processor until the processor in possession releases the resource by resetting the mutex value to its original value. It is critical that the read and write operations be performed in an atomic manner in order to prevent another processor from initiating a test-and-set instruction prior to completion of the primitive.
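The semantics of the test-and-set primitive described above can be illustrated with a minimal C sketch. The function name test_and_set and the constants MUTEX_FREE and MUTEX_LOCKED are hypothetical names chosen for illustration; the 0/1 convention follows the example in the text. The sketch models only the logical behavior: as ordinary C it is not atomic, whereas a real primitive must perform the read and the conditional write as one indivisible step.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Convention from the example in the text: 0 = free, 1 = locked. */
enum { MUTEX_FREE = 0, MUTEX_LOCKED = 1 };

/*
 * Logical model of test-and-set (hypothetical helper, not atomic):
 * read the mutex, and write the locked value only if it was free.
 * The value initially read is returned so the caller can tell
 * whether it obtained the resource.
 */
uint32_t test_and_set(uint32_t *mutex)
{
    assert(mutex != NULL);

    uint32_t old = *mutex;          /* "test": read the lock variable */
    if (old == MUTEX_FREE) {
        *mutex = MUTEX_LOCKED;      /* "set": conditional write */
    }
    return old;
}
```

A caller that receives MUTEX_FREE has taken possession of the resource; a caller that receives MUTEX_LOCKED has been locked out and the mutex is left unchanged.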
Mutual exclusion locks are well known in the prior art, and various other methods for implementing synchronization include software and operating system support. Software-based synchronization solutions use global variables to control access to critical regions, but are not extendable to multiprocessor systems. Another prior art solution uses semaphores implemented in the operating system with lower level primitives such as the test-and-set instruction. A significant disadvantage of lower level primitives is that they typically cannot be implemented on a system with incompatible processor types.
U.S. Pat. No. 5,287,503 to Narad implements an aliasing scheme for performing bit storage and bit manipulation of data using a physical atomic access register structure comprised of an array of flip-flops. A single physical atomic access register is accessible by means of three unique addresses which, after decoding, allow a processor to atomically access the shared register in order to read, set, or clear a bit by executing an access request to the corresponding address. While individual bits of data in the Narad access register can be atomically modified, the system is incapable of atomically performing a read and conditional write of a bit, thereby making it ineffective as a mutual exclusion apparatus.
U.S. Pat. No. 5,276,886 to Dror claims an interface structure for storage of hardware semaphores in which flag variables are stored in multiple 1-bit registers to which multiple processors have access. The disadvantage of the semaphore register is that it is a dedicated memory device capable of coordinating only as many shared resources as there are register structures. Since the Dror interface fails to provide conventional read and write access, the semaphore cannot be interleaved with the data it regulates where the shared resource is memory. As such, the Dror interface requires additional complexity in design while reducing the versatility of the system.
It is the object of this invention to provide read-modify-write functionality for the processors of a multiprocessor system. This functionality is particularly useful for processors which do not support indivisible memory operations necessary for performing mutual exclusion or processors which practice incompatible forms of mutual exclusion functionality in a multiprocessor system.
Another object of the present invention is to provide an apparatus for and method of performing atomic read-modify-write functionality on lock variables stored within the shared memory of any existing form of memory device without compromising conventional (i.e., general purpose storage) read and write access.
The present invention relates to a method and apparatus for providing synchronization support using a memory interface and extended address space, permitting both conventional read and write access as well as atomic read-modify-write operations on lock variables stored in memory to which each processor of the multiprocessor system has access.
The present invention provides enhanced performance over standard memory interfaces by permitting the implementation of read-modify-write functionality, addressable through a multiplicity of unique system bus addresses that map into the physical memory. The interface distinguishes a read-modify-write command from a conventional read instruction to the physical memory by decoding the bus address. A read-modify-write operation causes the mutex variable to be read and subsequently modified by the logic circuitry depending on the present value of the mutex variable. The value of the lock variable in turn is dependent upon the number of processors accessing the resource to which the lock variable corresponds, as well as the nature of the use of the resource.
In the first of two embodiments, the read-modify-write operation takes the form of a test-and-set operation performed in response to a processor-generated read of a mutex variable. The test-and-set operation consists of: (a) a processor-generated read command to an upper memory address that aliases the physical address typically accessible only through conventional read/write operations and (b) a conditional write command automatically executed by the logic circuitry of the interface. This upper memory address permits the physical memory to emulate a non-existent memory device that would otherwise be dedicated to the purpose of storing lock variables.
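The address-decoded behavior of this first embodiment can be sketched as a small software model in C. Everything specific here is an assumption made for illustration: the memory size PHYS_WORDS, the use of a single extra address bit (ALIAS_BIT) to form the upper alias range, the 0 = free / 1 = locked convention, and the function names bus_read and bus_write. The model is single-threaded and stands in for logic circuitry that would perform the read and conditional write indivisibly.

```c
#include <assert.h>
#include <stdint.h>

#define PHYS_WORDS 16u          /* assumed size of physical memory, in words */
#define ALIAS_BIT  PHYS_WORDS   /* assumed: one extra address bit selects test-and-set */

static uint32_t phys_mem[PHYS_WORDS];

/* Conventional write; in this sketch only the lower address range accepts writes. */
void bus_write(uint32_t addr, uint32_t value)
{
    assert(addr < PHYS_WORDS);
    phys_mem[addr] = value;
}

/*
 * Read through the interface. A lower address is a conventional read.
 * An upper (alias) address maps to the same physical word, and the
 * interface follows the read with a conditional write that locks the
 * word if it was free.
 */
uint32_t bus_read(uint32_t addr)
{
    assert(addr < 2u * PHYS_WORDS);

    uint32_t index = addr & (PHYS_WORDS - 1u);  /* strip the alias bit */
    uint32_t value = phys_mem[index];

    if ((addr & ALIAS_BIT) != 0u && value == 0u) {
        phys_mem[index] = 1u;                   /* conditional "set" */
    }
    return value;
}
```

In this model, reading address 3 never changes memory, while reading address ALIAS_BIT | 3 returns the same word and locks it if it was free, so one physical location serves both as ordinary storage and as a lock variable.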
A processor obtains exclusive use of the resource by locking the mutex variable. To acquire a resource, a processor executes a test-and-set operation on a mutex variable to determine the availability of the associated resource and to perform a lock of the resource if the resource is free. Until the processor in possession of the resource resets the mutex variable, a second processor is prohibited from accessing the resource in a manner that might interfere with the first processor's use of the resource. Only upon the release of the lock variable will the second processor be able to set the mutex variable to signify its exclusive use of the resource.
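The acquire and release sequence can be sketched in C. Note the substitution: this sketch uses the C11 atomic_flag operations as a stand-in for the atomic test-and-set that the interface would provide in hardware; it is not the invention's mechanism. The type resource_lock and the functions try_acquire, acquire, and release are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lock type: a clear flag means free, a set flag means locked. */
typedef struct {
    atomic_flag locked;
} resource_lock;

/* One test-and-set attempt; returns true if the caller now holds the resource. */
bool try_acquire(resource_lock *lock)
{
    assert(lock != NULL);
    /* atomic_flag_test_and_set returns the previous state of the flag. */
    return !atomic_flag_test_and_set(&lock->locked);
}

/* Repeat the test-and-set until the holder releases the resource. */
void acquire(resource_lock *lock)
{
    while (!try_acquire(lock)) {
        /* busy-wait: another processor holds the resource */
    }
}

/* Reset the mutex so that a waiting processor can take the resource. */
void release(resource_lock *lock)
{
    assert(lock != NULL);
    atomic_flag_clear(&lock->locked);
}
```

A second caller's try_acquire fails for as long as the first caller holds the lock, and succeeds only after release has reset the flag.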
The shared resource interface frees the software and operating system from having to run a lock program, thereby eliminating the overhead and consumption of resources associated with the execution of such software. The shared resource interface eliminates the need for a separate memory device dedicated to storing mutex variables, and permits the lock variable to be placed in memory contiguous with the resource when the resource is the memory itself. Furthermore, the shared resource interface can be implemented on a diverse range of processors and systems including those that do not support atomic synchronization, without extending the system bus with additional control lines.
A second embodiment is described in which the read-modify-write operation takes the form of a fetch-and-increment instruction. In this embodiment, multiple processors are granted concurrent use of a shared resource on the condition that concurrent use is compatible with and avoids interference between all processors within the multiprocessor system.
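The fetch-and-increment primitive named in this second embodiment can be sketched in C as follows. This passage does not specify how the interface interprets the count, so the sketch shows only the primitive itself, using the C11 atomic_fetch_add operation as a stand-in for the hardware; the function name fetch_and_increment is hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/*
 * Atomically add one to the lock variable and return the value it held
 * before the increment (hypothetical helper built on C11 atomics).
 */
unsigned fetch_and_increment(atomic_uint *counter)
{
    assert(counter != NULL);
    return atomic_fetch_add(counter, 1u);
}
```

Because the value returned is the count before the increment, each caller learns how many earlier requests were already recorded against the resource, which is the kind of information a policy permitting compatible concurrent use would need.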