This is the first application filed for the present invention.
Not applicable.
This invention relates in general to shared memory systems for use in parallel processing environments and, in particular, to methods and systems for process rollback in a shared memory parallel processor environment.
The rapid growth in the Public Switched Telephone Network (PSTN), especially the rapid expansion of service features, has strained the processing capacity of incumbent switching equipment. This is particularly the case in wireless telephony environments where messaging loads between mobile switching centres are intense. As is well known, most incumbent switching systems in the PSTN have processing architectures that are based on a single central control component that is responsible for all top level processing in the system. Such single central control component architectures provide the advantage to application programmers of some simplification with respect to resource control, flow control and inter-process communication. However, single central control component architectures are subject to serious bottlenecks due principally to the fact that each process is dependent on the capacity of the single core processor. There has therefore been an acute interest in developing parallel processor control for incumbent switching systems to improve performance and permit the addition of new processor-intensive service features.
Parallel processor architectures are well known. However, the software written for such architectures is specifically designed to avoid processor conflicts while accessing shared resources such as shared memory. This is accomplished by providing exclusive access to the memories using software semaphores or methods for locking memory access buses, and the like. However, incumbent switching systems in the PSTN were typically written for a central control component, and in many cases it is not economically feasible to rewrite the application code for a parallel processor architecture. Aside from the complexity of such a rewrite, the time and cost incurred to complete such a task is generally considered to be prohibitive.
It is known in the art that when a shared memory parallel processor computing environment is used to execute code written for a single central control component, two processes can compete for a memory space in the shared memory. This competition is called blocking. Because rights to a memory space cannot be granted to more than one process at a time, one process must be "rolled back" while the other process is permitted to continue execution.
A shared memory control algorithm for mutual exclusion and rollback is described in U.S. Pat. No. 5,918,248, which issued on Jun. 29, 1999 to the Assignee. The patent describes a mechanism for permitting a shared memory single central control component parallel processing architecture to be used in place of a conventional system, without requiring code written for the conventional system to be rewritten. Exclusive Access and Shared Read Access implementations are disclosed. A rollback mechanism is provided which permits all the actions of a task in progress to be undone. The memory locations of that parallel processor architecture include standard locations and shared read locations. Any task is granted read access to a shared read location, but only a single task is granted write access to a shared read location at any given time.
A prior art rollback mechanism designed by the Assignee uses three priority levels (0, 1 and 2). When two processes compete for the same memory space, the process with the higher priority is permitted to continue execution and the process with the lower priority is rolled back. Initially, each process is assigned a default priority value of zero. When two processes having zero priority compete for a same memory space, the processes are executed on a first-in-first-out basis. The process that is rolled back then has its priority set at 1. If the same process is rolled back a second time, due to competition with another priority 1 process, the priority of the process is set at 2, which is the highest priority permitted. The scheduler ensures that only one priority 2 process is allowed to execute on the system at any one time. After the process has reached a commit point, the priority associated with the process is reset to zero.
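The three-level scheme described above can be illustrated with a minimal Python sketch. The names `Process`, `resolve` and `commit` are hypothetical and do not come from the prior art system. The sketch assumes that the first-in-first-out rule means the process already holding the memory space continues whenever the two priorities are equal, and it does not model the scheduler rule that only one priority 2 process may execute at a time.

```python
MAX_PRIORITY = 2  # highest priority permitted by the prior art scheme


class Process:
    """Hypothetical stand-in for a process; priority starts at the default of zero."""

    def __init__(self, name):
        self.name = name
        self.priority = 0


def resolve(owner, challenger):
    """Decide a collision; return (winner, loser).

    `owner` held the memory space first and `challenger` arrived later.
    Assumption: on equal priority the earlier process (the owner) continues.
    """
    if challenger.priority > owner.priority:
        winner, loser = challenger, owner
    else:
        winner, loser = owner, challenger
    # The rolled-back process moves up one level, capped at MAX_PRIORITY.
    loser.priority = min(loser.priority + 1, MAX_PRIORITY)
    return winner, loser


def commit(process):
    """Reset the priority to zero once the process reaches a commit point."""
    process.priority = 0
```

With only three levels, a process that belongs to a heavily loaded class gains priority no faster than any other process, which is consistent with the performance limitation noted below.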
While this algorithm represents a significant advance in the art, the rollback mechanism has not proven to support optimal performance. Performance is compromised for the principal reason that processes belonging to large classes are rolled back too often to meet their CPU time requirement.
It is therefore highly desirable to provide a method and system for rolling back processes in a shared memory, parallel processor computing environment that enhances performance by ensuring that access to computing resources is optimized.
It is therefore an object of the invention to provide methods and systems for process rollback in a shared memory, parallel processor computing environment that enhance performance by ensuring that processes are rolled back in proportion to their allotted processing time.
In accordance with a first embodiment of the invention, there is provided a method for process rollback in a shared-memory parallel-processor computing environment in which the parallel processors are operated concurrently and each processor sequentially runs processes. In accordance with the method, when two processes compete for a memory space in the shared memory, one of the processes is rolled back. The process that is rolled back is the process that has a lower priority value, or if the two processes have the same priority value, the process that collided with an owner of the memory space is rolled back. A process collides with the owner of the memory space if it attempts to access the memory space when the owner has possession of the memory space.
When a process is rolled back, a new priority value is computed for the rolled-back process by incrementing the process's priority value by a predetermined amount. If the two processes are members of different classes, the predetermined amount is preferably a priority value assigned to a class of which the rolled-back process is a member. If the two processes are members of the same class, the predetermined amount is preferably less than the priority value assigned to the class. When a process reaches a commit point, the priority value of the process is reset to a priority value assigned to the class of which the process is a member.
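The selection and update rules of the first embodiment may be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: the class names and values in `CLASS_PRIORITY` are hypothetical, the same-class increment is assumed to be half the class priority (the text requires only that it be less than the class priority), and a new process is assumed to start at its class priority, by analogy with the reset at the commit point.

```python
from dataclasses import dataclass

# Hypothetical class priority values; not taken from the specification.
CLASS_PRIORITY = {"call_processing": 6, "maintenance": 2}


@dataclass
class ProcessControlBlock:
    name: str
    process_class: str
    priority: int


def new_process(name, process_class):
    # Assumption: a process starts at the priority assigned to its class.
    return ProcessControlBlock(name, process_class, CLASS_PRIORITY[process_class])


def resolve_collision(owner, collider):
    """Roll back one process, update its priority and return it.

    `owner` has possession of the memory space; `collider` tried to access it.
    """
    if owner.priority < collider.priority:
        loser, winner = owner, collider
    else:
        # Collider has the lower priority, or the values are equal.
        loser, winner = collider, owner
    base = CLASS_PRIORITY[loser.process_class]
    if loser.process_class == winner.process_class:
        increment = base // 2  # assumption: any amount smaller than `base`
    else:
        increment = base
    loser.priority += increment
    return loser


def commit(process):
    """Reset the priority to the class value at a commit point."""
    process.priority = CLASS_PRIORITY[process.process_class]
```

In this sketch a maintenance process that owns a memory space is rolled back twice by a competing call processing process (2, then 4, then 6) before the tie at 6 causes the collider to be rolled back instead.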
The priority value assigned to the class is preferably related to a proportion of processor time allocated to the class. The priority value may be directly proportional to the processor time allocated to the class, for example. In accordance with the first embodiment of the invention, the priority value of each process is stored in a process control block associated with the process.
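One way to obtain class priority values that are directly proportional to allocated processor time is sketched below. The function name `class_priorities`, the class names and the percentages are assumptions made for illustration. The sketch reduces the percentage shares by their greatest common divisor, which preserves the proportions exactly while keeping the values small.

```python
from functools import reduce
from math import gcd


def class_priorities(share_percent):
    """Map each class to a priority proportional to its processor time share.

    `share_percent` maps a class name to an integer percentage of processor
    time. The percentages are expected to total 100.
    """
    if sum(share_percent.values()) != 100:
        raise ValueError("processor time shares must total 100 percent")
    divisor = reduce(gcd, share_percent.values())
    return {name: pct // divisor for name, pct in share_percent.items()}


# Hypothetical allocation of processor time among three classes.
priorities = class_priorities(
    {"call_processing": 60, "billing": 30, "maintenance": 10}
)
```

With these hypothetical shares, `priorities` holds 6 for call processing, 3 for billing and 1 for maintenance.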
The invention also provides a shared memory parallel processor system for executing processes concurrently, comprising means for storing a priority value associated with each process; means for determining which one of two processes is to be rolled back using the priority values associated with each of the two processes when the two processes compete for a memory space; and, means for computing a new priority value for the process that is rolled back.
The means for determining which process is to be rolled back comprises means for selecting the process that has a lower priority value, when the two processes have different priority values; and, means for selecting the process that collided with an owner of the memory space, when the two processes have the same priority value. When a collision occurs because two processes compete for a memory space, the system determines a class of which each process is a member. The system further comprises means for computing a new priority value for the rolled-back process, by incrementing the priority value by a predetermined amount. The predetermined amount is a first amount if the processes belong to different classes, and a second amount if the processes belong to the same class. The first amount is preferably a priority value associated with a process class of which the process is a member.
In accordance with a second embodiment of the invention, there is provided a method for process rollback in a shared memory parallel processor computing environment in which the processors run processes concurrently, and each process is a member of one of a plurality of process classes. The method comprises steps of maintaining a pair of variables for each pair of process classes, the variables storing a current priority value for each process class in each process class pair. When two processes that are members of different process classes compete for a same memory space, a collision count stored in a respective process control block of each process is examined to determine whether either collision count exceeds a first engineered collision threshold. If either collision count exceeds the first collision threshold, the process with the lowest collision count is rolled back and the other process is permitted to continue execution. If neither collision count exceeds the threshold, current priority values of the respective class pair are used to determine which process is rolled back. The current priority values are compared and the process that is rolled back is one of: a) the process that is a member of the class that has a lower current priority value; and, b) if the two classes have the same current priority value, the process that collided with an owner of the memory space.
When a process is rolled back, a new priority value is stored in the variable that holds the priority of the class of which the rolled-back process is a member. The new priority value is computed by incrementing the variable by an amount equal to a base priority value stored in a process class parameter file.
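The decision made in the second embodiment when processes of different classes collide can be sketched as follows. The threshold value, the class names, the base priority values and the dictionary layout are all hypothetical. The sketch also assumes that each pair variable is initialised to the base priority of its class, and that when both collision counts are equal the class priority comparison is used, since the text does not address either case.

```python
from dataclasses import dataclass

FIRST_COLLISION_THRESHOLD = 3  # hypothetical engineered value

# Hypothetical contents of the process class parameter file.
BASE_PRIORITY = {"call_processing": 6, "maintenance": 1}


@dataclass
class Process:
    name: str
    process_class: str
    collision_count: int = 0


def initial_pair_table():
    """One pair of variables per pair of classes (assumed to start at base)."""
    key = frozenset(("call_processing", "maintenance"))
    return {key: dict(BASE_PRIORITY)}


def choose_rollback(pairs, owner, collider):
    """Return the process to roll back; the two classes must differ."""
    if owner.process_class == collider.process_class:
        raise ValueError("this rule applies to different classes only")
    a, b = owner.collision_count, collider.collision_count
    if max(a, b) > FIRST_COLLISION_THRESHOLD and a != b:
        return owner if a < b else collider  # lowest collision count loses
    pair = pairs[frozenset((owner.process_class, collider.process_class))]
    if pair[owner.process_class] < pair[collider.process_class]:
        return owner
    return collider  # lower class priority, or equal priorities


def roll_back(pairs, owner, collider):
    """Choose the loser and raise its class variable by the base priority."""
    loser = choose_rollback(pairs, owner, collider)
    pair = pairs[frozenset((owner.process_class, collider.process_class))]
    pair[loser.process_class] += BASE_PRIORITY[loser.process_class]
    return loser
```

Starting from the hypothetical values 6 and 1, a maintenance process that owns the memory space loses five consecutive collisions before the class values tie at 6 and the call processing collider is rolled back, so rollbacks are distributed roughly in proportion to the base priorities.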
The collision count for each process is stored in a process control block associated with each of the respective processes. The collision count associated with the rolled-back process is incremented each time the process is rolled back. When a collision count associated with a process exceeds an engineered second collision threshold, the process is run without competition until it commits, and the collision count is reset to zero.
When two processes that are members of the same class compete for the same memory space, the process that collided with an owner of the memory space is rolled back and a collision count associated with the rolled-back process is incremented. Each time a process is scheduled to run, a value of the collision count is compared with the second collision threshold, and if the collision count exceeds the second collision threshold, the process is permitted to run without competition. The collision count is reset to zero after the process reaches a commit point.
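The handling of the collision count for processes of the same class may be sketched as follows. The threshold value and the names `same_class_collision`, `run_mode` and `commit` are hypothetical, and the strings returned by `run_mode` merely label the two scheduling outcomes described above.

```python
from dataclasses import dataclass

SECOND_COLLISION_THRESHOLD = 5  # hypothetical engineered value


@dataclass
class Process:
    name: str
    process_class: str
    collision_count: int = 0


def same_class_collision(owner, collider):
    """Roll back the process that collided with the owner and count it."""
    if owner.process_class != collider.process_class:
        raise ValueError("this rule applies to processes of the same class")
    collider.collision_count += 1
    return collider


def run_mode(process):
    """Checked each time the process is scheduled to run."""
    if process.collision_count > SECOND_COLLISION_THRESHOLD:
        return "without_competition"
    return "concurrent"


def commit(process):
    """Reset the collision count once the process reaches a commit point."""
    process.collision_count = 0
```

With the hypothetical threshold of 5, a process is scheduled to run without competition on the first scheduling attempt after its sixth rollback.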
The invention further provides a shared-memory parallel-processor computing apparatus in which the processors run processes concurrently, and each process is a member of one of a plurality of process classes. The apparatus comprises means for storing a pair of variables for each pair of process classes, the variables storing a variable priority value for each process class in each process class pair. The apparatus further comprises means for determining, using the respective priority values, which process is rolled back when two processes that are members of different process classes compete for a memory space. The system also comprises means for computing and storing a collision count associated with each of the processes. The means for computing and storing the collision count preferably stores the collision count in a process control block associated with each of the respective processes. The means for computing and storing the collision count increments the collision count associated with the rolled-back process when a process is rolled back.
The means for determining which process is rolled back selects one of: a) the process with the lowest collision count, if the collision count associated with either process exceeds a first collision count threshold; and, b) if neither collision count exceeds the first threshold, the process that is a member of the class that has a lower priority value or, if the two classes have the same priority value, the process that collided with an owner of the memory space.
The system further comprises means for storing a new priority value in the variable for the class of which the rolled-back process is a member. The means for storing a new priority value in the variable adds, to the value of the variable, an amount equal to a base priority value stored in a process class parameter file.
The invention therefore provides a parallel processor/shared memory computing system that ensures that critical processes are guaranteed adequate processor time, while also ensuring that less critical processes are not completely starved. Process execution is dynamically adjusted to ensure equitability of access to computing resources. A system controlled by the methods in accordance with the invention is therefore ensured of more stable operation, and functionality is improved.