1. Field of the Invention
The invention relates generally to the field of digital data processing systems, and more specifically to mechanisms for controlling access to code and data which may be shared in a digital data processing system including multiple processors.
2. Description of the Prior Art
A digital data processing system includes three basic elements, namely, a processor element, a memory element and an input/output element. The memory element stores information in addressable storage locations. This information includes data and instructions for processing the data. The processor element fetches information from the memory element, interprets the information as either an instruction or data, processes the data in accordance with the instructions, and returns the processed data to the memory element for storage therein. The input/output element, under control of the processor element, also communicates with the memory element to transfer information, including instructions and data to be processed, to the memory, and to obtain processed data from the memory.
Typically, an input/output element includes a number of diverse types of units, including video display terminals, printers, interfaces to the public telecommunications network, and secondary storage subsystems, including disk and tape storage devices. A video display terminal permits a user to run programs and input data and view processed data. A printer permits a user to obtain processed data on paper. An interface to the public telecommunications network permits transfer of information over the public telecommunications network.
To increase processing speed, digital data processing systems have been developed which include multiple processors. Such multi-processing systems are generally organized along two paradigms for controlling operations within a system. In one paradigm, called "master-slave", one processor operates as a master processor, essentially assigning jobs to the other processors, which operate as slave processors. The master processor may also perform jobs similar to those of a slave processor while it is not performing its assignment functions. Control is simplified in systems designed along the master-slave paradigm since a single processor, namely, the master processor, is responsible for assigning the jobs. However, in such systems, if the master processor malfunctions, the entire system may be inoperative. In addition, under heavy processing loads, the master processor may become overloaded, which will slow down assignments of jobs to the slave processors.
Problems with systems designed along the master-slave paradigm do not arise in systems designed along the second paradigm, in which assignment of work is handled in a more homogeneous manner. In this paradigm, jobs are identified in a list stored in memory which may be accessed by any processor in the system. When a processor becomes available, it may retrieve an item from the job list for processing. Loading items onto the job list is itself a job which can be performed by any of the processors; thus, control of the job list is also decentralized among all of the processors. Since all of the processors can perform these control functions, if any of them malfunctions the system can remain operative, although at a reduced processing speed.
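By way of illustration only, the decentralized job list described above may be sketched as follows. The sketch is not taken from any particular system: threads stand in for processors, and the function name `run_job_list` and the job kinds `"load"` and `"square"` are hypothetical names chosen for the example.

```python
import queue
import threading

def run_job_list(initial_jobs, num_processors=3):
    """Process jobs with peer workers that all draw on one shared job list."""
    job_list = queue.Queue()            # shared list, accessible by any worker
    for job in initial_jobs:
        job_list.put(job)
    results = []
    results_lock = threading.Lock()     # guards the shared results list

    def processor():
        while True:
            try:
                kind, payload = job_list.get_nowait()   # retrieve an item
            except queue.Empty:
                return                  # nothing left for this worker to do
            if kind == "load":
                # Loading items onto the job list is itself a job.
                for n in payload:
                    job_list.put(("square", n))
            else:
                with results_lock:
                    results.append(payload * payload)

    threads = [threading.Thread(target=processor) for _ in range(num_processors)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(results)
```

Calling `run_job_list` with `num_processors=1` yields the same results as with several workers, which mirrors the observation that the system can remain operative, at reduced speed, with fewer functioning processors.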
While decentralization of the control functions in a multiple processing system provides some advantages over systems employing master-slave control, decentralized systems can also have problems if the operating system, that is, the program which controls the processors and job scheduling, does not provide adequate coordination and communication among the processors. It is necessary, in a decentralized system, to ensure that, for example, two processors do not attempt to execute the same critical section or region at the same time. A critical region is a portion of a program in which memory shared among the processors is accessed [see, for example, A. Tanenbaum, Operating Systems: Design and Implementation, (Prentice-Hall, 1987), at page 53]. If two processors attempt to execute a critical region of a program at the same time, they may access data in the same storage location in an overlapping, rather than sequential, manner, which can produce an erroneous result. This problem can occur if the system does not synchronize execution of critical regions among the processors.
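The effect of overlapping access can be illustrated with a small deterministic model. In this sketch, which is an assumption-laden simplification rather than a description of any real machine, each processor increments a shared location in two steps, a read and a write, and `final_count` (a hypothetical name) replays a given ordering of those steps.

```python
def final_count(schedule):
    """Replay (processor, step) pairs against one shared storage location."""
    memory = {"count": 0}       # the shared storage location
    registers = {}              # each processor's private copy
    for proc, step in schedule:
        if step == "read":
            registers[proc] = memory["count"]
        elif step == "write":
            memory["count"] = registers[proc] + 1
    return memory["count"]

# Sequential access: B reads only after A has written.
sequential = [("A", "read"), ("A", "write"), ("B", "read"), ("B", "write")]
# Overlapping access: both read the old value before either writes.
overlapping = [("A", "read"), ("B", "read"), ("A", "write"), ("B", "write")]
```

Under the sequential ordering both increments take effect, whereas under the overlapping ordering one increment is lost because both processors write back a value computed from the same stale read.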
Typically, flags are used to provide synchronization of access to critical regions of programs and of shared data structures processed thereby. The flags, which comprise storage locations in memory which are shared among processors in the system, can be used to indicate the status of a critical region, that is, whether or not the critical region is being executed. When a processor wishes to execute a critical region, it can set the flag associated with the critical region to inform other processors that the critical region is being executed. If another processor wishes to execute the same critical region, it determines the condition of the flag, and, if the flag does not indicate that the critical region is being executed by another processor, may itself execute the critical region, first conditioning the flag to indicate that the critical region is being executed. On the other hand, if the flag does indicate that the critical region is being executed by another processor, the processor wishing to execute the critical region delays, continuing to test the flag until it is changed to indicate that the critical region is not being executed by another processor.
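The flag protocol described above may be sketched as follows, with threads standing in for processors. The sketch assumes that testing and conditioning the flag occur as one indivisible step, which is modeled here by a non-blocking `acquire` on a `threading.Lock`; the text above does not state that the flag has this property. The name `count_with_flag` is hypothetical.

```python
import threading
import time

def count_with_flag(num_processors=4, increments=200):
    """Increment a shared counter, guarding the update with a flag."""
    flag = threading.Lock()     # models the flag; locked means "region in use"
    shared = {"count": 0}

    def processor():
        for _ in range(increments):
            # Test and set the flag in one step; keep testing while it is set.
            while not flag.acquire(blocking=False):
                time.sleep(0)   # yield, then test the flag again
            try:
                value = shared["count"]         # critical region begins
                shared["count"] = value + 1     # critical region ends
            finally:
                flag.release()  # clear the flag for other processors

    threads = [threading.Thread(target=processor) for _ in range(num_processors)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared["count"]
```

Because only one thread at a time can hold the flag, every increment takes effect and the final count equals `num_processors * increments`.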
The use of flags to control access to a shared critical region does create several problems. One problem, termed a "race" condition, may occur if two processors request the same critical region at the same time. If neither is able to condition the flag to indicate that the critical region is being executed before the other tests the condition of the flag, both may execute the critical region. Another problem, termed "deadlock", occurs when each of two processors needs to execute the critical region currently being executed by the other. Since neither can release the critical region it is executing, neither can begin executing the other critical region. As a result, both processors are deadlocked.
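Both problems can be reproduced in a simplified, single-threaded model. In the sketch below, `processors_in_region` replays a given ordering of separate "test" and "set" steps on a flag, and `deadlock_state` uses two locks to stand for two critical regions already being executed. Both function names are hypothetical, and the model is an illustration under these assumptions, not a description of any particular system.

```python
import threading

def processors_in_region(schedule):
    """Return the processors that enter the region for a given step order."""
    flag = False                # clear: the region is not being executed
    saw_clear = {}              # what each processor observed when testing
    inside = []
    for proc, step in schedule:
        if step == "test":
            saw_clear[proc] = not flag
        elif step == "set" and saw_clear.get(proc):
            flag = True         # condition the flag, then enter the region
            inside.append(proc)
    return inside

def deadlock_state():
    """Model two processors that each need the region the other holds."""
    region_1 = threading.Lock()
    region_2 = threading.Lock()
    region_1.acquire()          # processor A is executing region 1
    region_2.acquire()          # processor B is executing region 2
    a_can_enter_2 = region_2.acquire(blocking=False)   # A requests region 2
    b_can_enter_1 = region_1.acquire(blocking=False)   # B requests region 1
    return a_can_enter_2, b_can_enter_1

# Race: both processors test the flag before either one sets it.
race = [("A", "test"), ("B", "test"), ("A", "set"), ("B", "set")]
# No race: A tests and sets before B tests.
ordered = [("A", "test"), ("A", "set"), ("B", "test"), ("B", "set")]
```

With the `race` ordering both processors enter the region, while with the `ordered` ordering only the first does. In `deadlock_state` neither request succeeds, and neither would succeed later, since each holder would release its region only after obtaining the other.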
To alleviate race and deadlock problems, more sophisticated control mechanisms, known as semaphores, have been developed. A semaphore manages control of the synchronization flags and gives permission to only one processor if several request access to the same critical region at the same time. When a processor finishes execution of a critical region, it informs the semaphore, which is responsible for conditioning the flags. A problem arises, however, since a processor that is denied access to a critical region may continually repeat its request until the semaphore grants it permission to execute the critical region. If this occurs with sufficient numbers of the processors in the multiple processor system, the communications system in the digital data processing system may be so overloaded with requests that no other communication can take place. At this point, the system is effectively unable to perform processing work.
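The request traffic described above can be estimated with a toy, tick-based model. The sketch assumes that a semaphore-like arbiter grants the critical region to one requester at a time, that each holder keeps the region for a fixed number of ticks, and that every waiting processor repeats its request on every tick. The name `polling_traffic` and its parameters are hypothetical.

```python
def polling_traffic(num_processors, hold_ticks):
    """Return (grant order, number of denied requests) for polling waiters."""
    waiting = list(range(num_processors))
    owner = None                # processor currently executing the region
    remaining = 0               # ticks until the owner finishes
    denied = 0
    granted_order = []
    while waiting or owner is not None:
        # The owner, if any, runs for one tick and may finish.
        if owner is not None:
            remaining -= 1
            if remaining == 0:
                owner = None    # the owner informs the arbiter it is done
        # Every waiting processor sends a request on this tick.
        for proc in list(waiting):
            if owner is None:
                owner = proc            # permission goes to one requester
                remaining = hold_ticks
                waiting.remove(proc)
                granted_order.append(proc)
            else:
                denied += 1             # request consumed bandwidth in vain
    return granted_order, denied
```

In this model the number of denied requests is `hold_ticks * n * (n - 1) / 2` for `n` processors, so the wasted traffic grows roughly with the square of the number of contending processors. This is consistent with the overload described above, although real systems will differ in detail.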