A computer in operation includes hardware, software, and data. The hardware typically includes a processor, memory, storage, and I/O (input/output) devices coupled together by a bus. The software typically includes an operating system and applications. The applications perform useful work on the data for a user or users. The operating system provides an interface between the applications and the hardware. The operating system performs two primary functions. First, it allocates resources to the applications. The resources include hardware resources, such as processor time, memory space, and I/O devices, and software resources, some of which enable the hardware resources to perform tasks. Second, it controls execution of the applications to ensure proper operation of the computer.
Often, the software is divided conceptually into a user level, where the applications reside and which the users access, and a kernel level, where the operating system resides and which is accessed by system calls. A system call to the kernel level performs specific tasks for an application or a user while ensuring that the application or the user does not perform a kernel level operation that is detrimental to the computer or to processes operating within the computer. Within an operating computer, a unit of work is referred to as a process. A process is computer code and data in execution. A process may be executing, ready to execute, or waiting for an event to occur. A process in a user mode operates at the user level. A process in a kernel mode operates at the kernel level. Some processes operate in the user mode and some processes operate in the kernel mode. When a process operating in the user mode makes a system call, the process operates in the kernel mode for the duration of the system call. Upon completion of the system call, the process returns to the user mode.
A wrapper function is computer code that is combined with other computer code to determine how the other code is executed. A wrapper function combined with a system call modifies execution of the system call and extends an operating system's capabilities to a level which would otherwise require modification of the operating system.
Execution of the wrapper function and its wrapped code begins with execution of the wrapper function and continues with execution of the wrapped code. In some situations, the wrapper function inserts additional code which executes before or after the wrapped code. In some other situations, the additional code executes partly before the wrapped code executes and partly after the wrapped code executes.
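The ordering described above can be illustrated with a minimal sketch in Python. The sketch is an illustration only and is not a kernel level implementation. The names wrap_call and read_block and the before and after hooks are hypothetical; read_block merely stands in for a system call.

```python
import functools


def wrap_call(before=None, after=None):
    """Return a decorator that runs optional hooks around the wrapped code."""
    def decorator(wrapped):
        @functools.wraps(wrapped)
        def wrapper(*args, **kwargs):
            # Additional code that executes before the wrapped code.
            if before is not None:
                before(wrapped.__name__, args)
            result = wrapped(*args, **kwargs)
            # Additional code that executes after the wrapped code.
            if after is not None:
                after(wrapped.__name__, result)
            return result
        return wrapper
    return decorator


log = []  # records the order in which the pieces execute


@wrap_call(
    before=lambda name, args: log.append(("before", name, args)),
    after=lambda name, result: log.append(("after", name, result)),
)
def read_block(block_number):
    # Hypothetical stand-in for a system call; not a real kernel interface.
    log.append(("body", "read_block", block_number))
    return block_number * 512


offset = read_block(3)
```

Supplying only the before hook or only the after hook corresponds to the situations in which the additional code executes entirely before or entirely after the wrapped code.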
A wrapper function can be combined with a system call at the user level or at the kernel level. If the wrapper function is added at the user level, a malicious user level process could subvert the wrapper function. In contrast, a user level process cannot affect a wrapper function added at the kernel level. For UNIX and LINUX operating systems, a method of adding a wrapper function to a system call at the kernel level employs a loadable kernel module. A loadable kernel module is attachable to a standard operating system kernel without a need to modify the standard operating system kernel. The loadable kernel module can be added anytime up to run time.
The operating system capabilities obtainable by combining a wrapper function with a system call include security monitoring as well as checkpointing, restart, and migration techniques. Security monitoring is a technique for detecting unauthorized access to a computer.
Checkpointing is a technique employed on some computers where processes take significant time to execute. By occasionally performing a checkpoint of processes and resources assigned to processes, the processes can be restarted at an intermediate computational state in the event of a system failure. Migration is a technique in which running processes are checkpointed and then restarted on another computer. Migration allows some processes on a heavily used computer to be moved to a lightly used computer. Checkpointing, restart, and migration have been implemented in a number of ways.
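The idea of restarting at an intermediate computational state can be illustrated with a minimal sketch in Python. The sketch checkpoints at the application level only, so it is not the kernel level technique discussed elsewhere in this background. The function run, the state file format, and the fail_at parameter used to simulate a system failure are all hypothetical.

```python
import json
import os
import tempfile


def run(total_steps, checkpoint_path, checkpoint_every, fail_at=None):
    """Sum the integers 0..total_steps-1, saving state every few steps."""
    # Restart: resume from the most recent checkpoint if one exists.
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            state = json.load(f)
    else:
        state = {"next_step": 0, "total": 0}

    for step in range(state["next_step"], total_steps):
        if fail_at is not None and step == fail_at:
            raise RuntimeError("simulated system failure")
        state["total"] += step
        state["next_step"] = step + 1
        # Checkpoint: occasionally record the intermediate state.
        if state["next_step"] % checkpoint_every == 0:
            with open(checkpoint_path, "w") as f:
                json.dump(state, f)
    return state["total"]


checkpoint_dir = tempfile.mkdtemp()
path = os.path.join(checkpoint_dir, "state.json")

# First run fails part way through; work since the last checkpoint is lost.
try:
    run(10, path, checkpoint_every=4, fail_at=6)
except RuntimeError:
    pass

# Second run restarts from the checkpoint instead of from step 0.
result = run(10, path, checkpoint_every=4)
```

Copying the state file to another computer and calling run there would correspond, in this simplified picture, to migration.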
Operating system checkpoint, restart, and migration have been implemented as an integral part of several research operating systems. However, such research operating systems are undesirable because they lack an installed base and, consequently, few applications exist for them. Application level checkpoint, restart, and migration in conjunction with standard operating systems have also been implemented. But these techniques require that processes not use some common operating system services because the checkpointing only takes place at the application level.
Object based checkpoint, restart, and migration have also been implemented. Such object based approaches use particular programming languages or middleware toolkits. The object based approaches require that the applications be written in one of the particular programming languages or that the applications make explicit use of one of the middleware toolkits. A virtual machine monitor approach can be used to implement checkpoint, restart, and migration. But such an approach requires checkpointing and restarting all processes within the virtual machine monitor. This approach also exhibits poor performance due to isolation of the virtual machine monitor from an underlying operating system.
In “The Design and Implementation of Zap: A System for Migrating Computing Environments,” Proc. OSDI 2002, Osman et al. teach a technique of adding a loadable kernel module to a standard operating system to provide checkpoint, restart, and migration of processes implemented by existing applications. The loadable kernel module divides the application level into process domains and provides virtualization of resources within each process domain. Such virtualization of resources includes virtual process identifiers and virtualized network addresses. Processes within one process domain are prevented from interacting with processes in another process domain using inter-process communication techniques. Instead, processes within different process domains interact using network communications and shared files set up for communication between different computers. The loadable kernel module adds a wrapper function to each system call in order to translate between virtual resources in a process domain (the user level) and corresponding resources at the kernel level.
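The translation between virtual process identifiers and kernel level process identifiers can be illustrated with a minimal user level sketch in Python. The sketch is not the loadable kernel module of Osman et al. The class ProcessDomain, the helpers wrap_pid_argument and wrap_pid_result, and the stand-in calls kernel_kill and kernel_getpid are all hypothetical names introduced for illustration.

```python
class ProcessDomain:
    """Hypothetical table mapping virtual pids to kernel level pids."""

    def __init__(self):
        self._virt_to_real = {}
        self._real_to_virt = {}
        self._next_vpid = 1

    def register(self, real_pid):
        # Assign the next unused virtual pid to a kernel level pid.
        vpid = self._next_vpid
        self._next_vpid += 1
        self._virt_to_real[vpid] = real_pid
        self._real_to_virt[real_pid] = vpid
        return vpid

    def to_real(self, vpid):
        return self._virt_to_real[vpid]

    def to_virtual(self, real_pid):
        return self._real_to_virt[real_pid]


def wrap_pid_argument(domain, call):
    """Wrap a call whose first argument is a pid: translate virtual to real."""
    def wrapper(vpid, *args):
        try:
            real_pid = domain.to_real(vpid)
        except KeyError:
            # Pids outside the domain are invisible to its processes.
            raise ProcessLookupError(f"no process {vpid} in this domain")
        return call(real_pid, *args)
    return wrapper


def wrap_pid_result(domain, call):
    """Wrap a call that returns a pid: translate real to virtual."""
    def wrapper(*args):
        return domain.to_virtual(call(*args))
    return wrapper


# Hypothetical stand-ins for kernel level calls; no real signals are sent.
signals_sent = []


def kernel_kill(real_pid, signal_number):
    signals_sent.append((real_pid, signal_number))
    return 0


def kernel_getpid():
    return 4711


domain = ProcessDomain()
vpid = domain.register(4711)
domain_kill = wrap_pid_argument(domain, kernel_kill)
domain_getpid = wrap_pid_result(domain, kernel_getpid)
domain_kill(vpid, 15)  # the stand-in kernel call receives pid 4711
```

In this sketch a process inside the domain only ever sees the virtual identifier, while the wrapped call always receives the kernel level identifier.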
Checkpointing in the technique taught by Osman et al. records the processes in a process domain as well as the state of the resources used by the processes. Because resources in the process domain are virtualized, restart or migration of a process domain includes restoring resource identifications to a virtualized identity that the resources had at the most recent checkpoint.
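Restoration of virtualized identities at restart can be illustrated with a minimal sketch in Python. It assumes, for illustration only, that a checkpoint image is keyed by virtual process identifier and that a hypothetical spawn function recreates a process and returns whatever new kernel level identifier the operating system assigns. The functions checkpoint and restart and the sample identifiers are not taken from Osman et al.

```python
import itertools


def checkpoint(virt_to_real, process_state):
    """Record state per virtual pid; kernel level pids are not saved."""
    return {vpid: process_state[real] for vpid, real in virt_to_real.items()}


def restart(image, spawn):
    """Recreate each process and rebind its virtual pid to the new real pid."""
    virt_to_real = {}
    for vpid, state in image.items():
        virt_to_real[vpid] = spawn(state)
    return virt_to_real


# State at the most recent checkpoint (all values are made up).
before = {1: 4711, 2: 4712}
state = {4711: "parent", 4712: "child"}
image = checkpoint(before, state)

# On restart or migration the kernel hands out different pids.
new_pids = itertools.count(9001)
restored_state = {}


def spawn(saved_state):
    # Hypothetical stand-in for recreating a process from saved state.
    pid = next(new_pids)
    restored_state[pid] = saved_state
    return pid


after = restart(image, spawn)
```

The virtual identifiers 1 and 2 survive the restart unchanged even though the kernel level identifiers differ, which is why processes in the domain can continue to refer to one another.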
The loadable kernel module taught by Osman et al. creates process domains, each of which has its own virtual process identifiers. In some situations, the loadable kernel module could fail to assign a virtual process identifier to a process, leading to a failure of the process domain.
What is needed is a method of assigning a virtual process identifier to a process within a process domain in which each process receives a virtual process identifier.