The present invention relates generally to multicomputer systems and, more particularly, to multicomputer systems employing a microkernel-based serverized distributed operating system and to associated methods, as well as to such systems having a distributed process directory.
Description of the Related Art
Microkernel-based operating system architectures have been employed to distribute operating system services among loosely-coupled processors in a multicomputer system. In an earlier system, a set of modular computer software-based system servers sits on top of a minimal microkernel which provides the system servers with fundamental services such as processor scheduling and memory management. The microkernel may also provide an inter-process communication facility that allows the system servers to call each other and to exchange data regardless of where the servers are located in the system. The system servers manage the other physical and logical resources of the system, such as devices, files and high level communication resources. Often, it is desirable for a microkernel to be interoperable with a number of different conventional operating systems. In order to achieve this interoperability, computer software-based system servers may be employed to provide an application programming interface to a conventional operating system.
The block diagram drawing of FIG. 1 shows an illustrative multicomputer system. The term "multicomputer" as used herein shall refer to a distributed non-shared memory multiprocessor machine comprising multiple sites. A site is a single processor and its supporting environment or a set of tightly coupled processors and their supporting environment. The sites in a multicomputer may be connected to each other via an internal network (e.g., Intel MESH interconnect), and the multicomputer may be connected to other machines via an external network (e.g., Ethernet for workstations). Each site is independent in that it has its own private memory, interrupt control, etc. Sites use messages to communicate with each other. A microkernel-based "serverized" operating system is well suited to provide operating system services among the multiple independent non-shared memory sites in a multicomputer system.
An important objective in certain multicomputer systems is to achieve a single-system image (SSI) across all sites of the system. From the point of view of the user, the application developer, and for the most part, the system administrator, the multicomputer system appears to be a single computer even though it really comprises multiple independent computer sites running in parallel and communicating with each other over a high speed interconnect. Some of the advantages of an SSI include simplified installation and administration, ease-of-use, open system solutions (i.e., fewer compatibility issues), exploitation of multisite architecture while preserving conventional APIs, and ease of scalability.
There are several possible component features that may play a part in an SSI, such as a global naming process, global file access, distributed boot facilities and global STREAMS facilities. In one earlier system, an SSI is provided which employs a process directory (or name space) which is distributed across multiple sites. Each site maintains a fragment of the process directory. The distribution of the process directory across multiple sites ensures that no single site is unduly burdened by the volume of message traffic accessing the directory. There are challenges in implementing a distributed process directory. For example, "global atomic operations" must be applied to multiple target processes and may have to traverse process directory fragments on multiple sites in the system. This traversal of directory fragments on different sites in search of processes targeted by an operation can be complicated by the migration of processes between sites in the course of the operation. In other words, a global atomic operation and process migration may progress simultaneously. Thus, there may be a particular challenge involved in ensuring that a global atomic operation is applied at least once, but only once, to each target process.
The problem of a global atomic operation potentially missing a migrating process will be further explained through an example involving the global getdents (get directory entries) operation. The getdents operation is a global atomic operation. The timing diagram of FIG. 2 illustrates the example. At time=t, process manager server "A" (PM A) on site A initiates a migration of a process from PM A on site A to the process manager server "B" (PM B) on site B (dashed lines). Meanwhile, an object manager server (OM) has broadcast a getdents request to both PM A and PM B. At time=t1, PM B receives and processes the getdents request and returns the response to the OM. This response by PM B does not include a process identification (PID) for the migrating process, which has not yet arrived at PM B. At time=t2, PM B receives the migration request from PM A. PM B adds the PID for the migrating process to the directory fragment on site B and returns to PM A a response indicating the completion of the process migration. PM A removes the PID for the migrating process from the site A directory fragment. At time=t3, PM A receives and processes the getdents request and returns the response to the OM. This response by PM A does not include the PID for the migrating process since that process has already migrated to PM B on site B. Thus, the global getdents operation missed the migrating process, which was not yet represented by a PID in the site B directory fragment when PM B processed the getdents operation, and which had already had its PID removed from the site A directory fragment by the time PM A processed the getdents operation.
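The sequence of FIG. 2 may be illustrated by the following minimal sketch, which replays the four events in order against two directory fragments modeled as sets of PIDs. The PID values, the function name run_race and the set-based representation of a directory fragment are hypothetical and are chosen only for illustration; they are not taken from the earlier system.

```python
# Hypothetical replay of the FIG. 2 timing; all names and PID values are illustrative.
MIGRATING_PID = 42


def run_race():
    # Each site holds one fragment of the process directory (modeled as a set of PIDs).
    fragment_a = {7, MIGRATING_PID}   # site A, where the migrating process starts
    fragment_b = {9}                  # site B, the migration destination
    seen = set()                      # PIDs reported back to the OM by getdents

    # time=t1: PM B answers getdents before the migrating process has arrived.
    seen |= fragment_b

    # time=t2: the migration completes; PM B adds the PID, then PM A removes it.
    fragment_b.add(MIGRATING_PID)
    fragment_a.discard(MIGRATING_PID)

    # time=t3: PM A answers getdents after the PID has left its fragment.
    seen |= fragment_a

    return seen, fragment_a | fragment_b


seen, actual = run_race()
missed = actual - seen   # processes that exist but were never reported
print(missed)
```

In this replay the migrating process is present in the system throughout, yet neither response to the OM contains its PID.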
A prior solution to the problem of simultaneous occurrence of process migrations and global atomic operations involved the use of a "global ticket" (a token) to serialize global operations at the system level and migrations at the site level. More specifically, a computer software-based global operation server issues a global ticket to a site which requests a global operation. A number associated with the global ticket monotonically increases every time a new ticket is issued so that different global operations in the system are uniquely identified and can proceed one after the other.
Global tickets are used to serialize all global atomic operations so that they do not conflict among themselves. However, a problem remains between global operations and process migrations. In a prior solution, each global operation results in a multicast message carrying the global ticket to the process managers on each site. Each process manager then acquires the lock to the process directory fragment of its own site and iterates over all entries. The global operation is performed on the process corresponding to an entry only if the global ticket number marked on the entry is lower than the global ticket number of the current iteration. A global ticket number marked on a process directory fragment entry is carried over from the site the process migrates from (origin site) to the site the process migrates to (destination site). It represents the last global operation ticket that the process has seen before the migration.
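The ticket issue and per-site iteration described above may be sketched as follows. This is a simplified, single-machine model under stated assumptions: the class names GlobalTicketServer and ProcessManager, the method names, and the dictionary mapping each PID to its last-seen ticket number are all hypothetical, and the step of re-marking an entry with the current ticket after the operation is applied is an assumption inferred from the statement that the mark represents the last ticket the process has seen.

```python
import itertools
import threading


class GlobalTicketServer:
    """Hypothetical issuer of monotonically increasing global ticket numbers."""

    def __init__(self):
        self._counter = itertools.count(1)
        self._lock = threading.Lock()

    def issue(self):
        with self._lock:
            return next(self._counter)


class ProcessManager:
    """Hypothetical per-site manager holding one process directory fragment."""

    def __init__(self, entries):
        # Maps each PID to the last global ticket number that process has seen.
        self.fragment = dict(entries)
        self.lock = threading.Lock()

    def apply_global_op(self, ticket, op):
        applied = []
        with self.lock:  # lock the local directory fragment for the whole pass
            for pid, last_seen in self.fragment.items():
                # Skip entries already marked with this ticket or a later one.
                if last_seen < ticket:
                    op(pid)
                    self.fragment[pid] = ticket  # assumed re-marking step
                    applied.append(pid)
        return applied


server = GlobalTicketServer()
ticket = server.issue()
# PID 11 arrived by migration already marked with ticket 1, so it is skipped.
pm = ProcessManager({10: 0, 11: 1})
applied = pm.apply_global_op(ticket, lambda pid: None)
print(applied)
```

Under these assumptions, an entry that carries the current ticket number from its origin site is passed over on the destination site, which is what prevents a second application of the same global operation.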
The migration of a process is somewhat more complex. The process being migrated first acquires the process directory fragment lock on its origin site. It then marks the corresponding process directory entry as being in the process of migration. The migration procedure stamps the process directory entry of the process with the present global operation ticket number, locks the process directory on the destination site and transmits the process directory entry contents to the destination site. The global operation ticket number on the destination site is then copied back in the reply message to the origin site. The migration procedure on the origin site is responsible for comparing the global ticket number returned from the destination site with its own. If the global ticket number of the origin site is greater than the number from the destination site, then the global operation already has been performed on the migrating process, although the operation has not yet reached the destination site. The migration is permitted to proceed, but the process directory fragment slot for the migrating process on the destination site is marked with the higher global ticket number. As a result, the global operation will skip the migrated process on the destination site and will not be applied twice to that process. If the global ticket number of the origin site is less than the number from the destination site, then a global operation has been performed on the destination site, has yet to be performed on the origin site, and would miss the process currently being migrated. The migration is therefore denied and retried later.
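The comparison made by the migration procedure may be sketched as follows. The Site class, its fields and the function try_migrate are hypothetical names introduced for illustration. Locking and message passing are omitted for brevity, and the case of equal ticket numbers, which the description above does not address explicitly, is assumed here to permit the migration.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """Hypothetical site state: the latest ticket seen and the local fragment."""
    current_ticket: int = 0
    fragment: dict = field(default_factory=dict)  # PID -> last ticket seen


def try_migrate(pid, origin, destination):
    """Return True if the migration proceeds, False if it must be retried later."""
    # Stamp the entry with the origin site's present global ticket number.
    origin_ticket = origin.current_ticket
    # The destination's ticket number comes back in the reply message.
    destination_ticket = destination.current_ticket

    if origin_ticket < destination_ticket:
        # The operation has run on the destination but not yet on the origin;
        # migrating now would let it miss this process, so deny the migration.
        return False

    # Otherwise the process has already seen the operation (or, by assumption,
    # the tickets are equal). Mark the destination slot with the origin ticket
    # so the operation skips the process when it reaches the destination.
    destination.fragment[pid] = origin_ticket
    del origin.fragment[pid]
    return True


site_a = Site(current_ticket=5, fragment={1: 5})
site_b = Site(current_ticket=4)
allowed = try_migrate(1, site_a, site_b)   # origin ahead: migration proceeds

site_c = Site(current_ticket=3, fragment={2: 3})
site_d = Site(current_ticket=4)
denied = not try_migrate(2, site_c, site_d)  # origin behind: migration denied
print(allowed, denied)
```

A caller in this sketch would be expected to retry a denied migration after the pending global operation has reached the origin site.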
Unfortunately, there have been problems with the use of global tickets (tokens) to coordinate global operations and process migrations. For example, the global ticket scheme serializes global operations since only one global operation can own the global ticket at a time. The serialization of global operations, however, can slow down overall system performance. While one global operation has the global ticket, other global operations typically block and await their turns to acquire the global ticket before completing their operations.
Thus, there has been a need for improvement in the application of global atomic operations to processes that migrate between sites in a multicomputer system which employs a microkernel-based serverized operating system to distribute operating system services among loosely-coupled processors in the system. The present invention meets this need.