During the past few years, the family of computer operating systems known as UNIX has come into wide use in commercial installations. UNIX systems in operation are said to number in the tens of thousands, and the number is growing. As with other successful computer operating systems, UNIX has been developed in a series of releases, and the evolution continues.
The UNIX operating system was designed for use in a multi-tasking, software development operation. It was developed to execute on minicomputers, and variations of it have been used on other types of machines, most notably on microcomputers such as the IBM Personal Computer. This has been possible because the system is written in the machine independent language "C", which facilitates the conversion of UNIX from one machine to another.
The UNIX operating system has basically two states of operation: the operating system, or kernel, state and the user state. Lower level file management, driver activity and process management occur in the kernel state. Processes normally run in the user state. In the UNIX system a user executes programs in an environment called a user process. When a system function is required, the user process calls the system as a subroutine. At some point in this call, there is a distinct switch of environments. After this, the process is said to be a system process. In the normal definition of processes, the user and system processes are different phases of the same process (they never execute simultaneously). Each time a kernel operating system request is called by a process, a state transition changes the user state to the kernel state. When the request has been satisfied, the state is changed (returned) to the user state.
The UNIX system has been ported to larger data systems such as the IBM System/370 computers and other computers performing equivalent functions. In such applications, the UNIX system does not run directly on IBM hardware, but includes a two-level system in a guest-host relationship. The upper level consists of user application code, UNIX system code in the kernel and resident supervisor code. Taken together, the system code in the kernel and the resident supervisor code comprise a system nucleus. The lower level consists of a host operating system already executing on IBM System/370; e.g., the IBM Virtual Machine/System system calls as well as the file system structure. The UNIX resident supervisor allocates system resources and handles all machine-dependent I/O operations, memory management (including paging), and process scheduling in a virtual environment. The interface between the two layers consists of system interrupts, also known as supervisor calls (SVCs), from the UNIX system to the resident supervisor, and pseudo-interrupts from the resident supervisor to the UNIX system.
The major advantage of this approach is that the UNIX system does not have to concern itself with hardware architecture. One disadvantage is that a performance penalty is paid in communication between the two layers. Also, system algorithms employed in the resident supervisor are not necessarily optimal for the UNIX operating system.
One design of the UNIX operating system for IBM's System/370 computers has been described in the paper by Felton et al. entitled "A UNIX System Implementation for System/370", AT&T Bell Laboratories Technical Journal, Vol. 63, No. 8, Oct. 1984. This paper describes a UNIX operating system that uses the TSS/370 operating system as the interface between UNIX and IBM System/370.
Another version of UNIX which operates on the IBM System/370 is called the IBM Interactive Executive for System/370 (IX/370). This version of UNIX was made commercially available by the IBM Corporation in 1985. The IBM operating system VM/SP is used as an interface between the IX/370 version of UNIX and the IBM System/370 hardware to provide various functions such as input/output (I/O), paging, error recording and recovery, which cannot be performed by UNIX or IX/370. The VM/SP operating system is a well-structured system which provides an elegant interface for UNIX system processes. In such a configuration IX/370 is referred to as the guest operating system (GOS) and VM/SP as the host operating system (HOS).
Like the UNIX operating system running as a GOS, IX/370 includes three types of programs running in two different software levels. One type comprises user programs such as user-written programs and system-provided programs, including the UNIX shell. The second type is the system supervisor code which incorporates much of the function and C-language code of the standard UNIX system kernel. The resident supervisor, the third type, supports the multi-programming of system processes, provides low level system calls and manages the physical system configuration. Each IX/370 system process executes within its own virtual memory. The resident supervisor controls the resources allocated, including process scheduling, dispatching and real storage management.
User programs and the system supervisor of IX/370 share the same process space. The system supervisor is located in the upper eight megabytes of this space, and the user programs are located in the lower eight megabytes. "Page 0" (the Interrupt Storage Area), the lowest 4096 bytes of the process space, is reserved for Program Status Words (interrupt vectors) and other information associated with the process virtual machine. Page 0 is the communications page between IX/370 and the control program (CP) of VM/SP.
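The layout just described can be illustrated with a minimal sketch in C. The 4096-byte Page 0 and the two eight-megabyte halves are taken from the text; the total process space of sixteen megabytes is an assumption inferred from those two halves, and the names `classify_address`, `REGION_PAGE0`, `REGION_USER` and `REGION_SUPERVISOR` are hypothetical and do not appear in IX/370.

```c
#include <assert.h>
#include <stdint.h>

/* Sizes from the text: Page 0 is the lowest 4096 bytes; user programs
 * occupy the lower eight megabytes and the system supervisor the upper
 * eight megabytes. The 16 MB total is an assumption derived from them. */
#define PAGE0_SIZE  4096u
#define HALF_SIZE   (8u * 1024u * 1024u)
#define SPACE_SIZE  (2u * HALF_SIZE)

/* Hypothetical region names, for illustration only. */
enum region { REGION_PAGE0, REGION_USER, REGION_SUPERVISOR };

/* Map a virtual address in the process space to the region holding it. */
enum region classify_address(uint32_t addr)
{
    assert(addr < SPACE_SIZE);   /* address must lie inside the space */
    if (addr < PAGE0_SIZE)
        return REGION_PAGE0;     /* Interrupt Storage Area */
    if (addr < HALF_SIZE)
        return REGION_USER;      /* remainder of the lower eight megabytes */
    return REGION_SUPERVISOR;    /* upper eight megabytes */
}
```

Under these assumptions, address 0x7FFFFF is the last byte available to user programs and address 0x800000 is the first byte of the system supervisor.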
A program at one level communicates with the next lower level through system calls which are of two kinds: UNIX system calls and resident supervisor system calls. The latter are used by the UNIX kernel to request certain lower-level functions of the resident supervisor.
In the initial implementation of the IX/370 version of UNIX, its execution as a guest operating system (GOS) under VM/SP as a host operating system (HOS) causes a significant performance problem, even though the inherent characteristics of UNIX and its algorithms allow it to be implemented quite satisfactorily on large machines. In particular, two problem areas relate to the UNIX user-to-kernel linkage and the UNIX kernel-to-user return linkage, which are executed in the HOS. The problem occurs in any implementation of a version of UNIX running as a GOS with IBM System/370 VM/SP or a functionally equivalent HOS. In fact, the problem is generic in that it would occur in any GOS running in a HOS where operating system services are readily available from the GOS for use by user processes, e.g., application programs. In such systems, state transitions are performed frequently.
Within the prior art user-to-kernel IX/370 linkage path, eleven privileged instructions are executed by the resident supervisor every time a state transition occurs between user state and kernel state. A privileged instruction can be executed only if the system is running in supervisor state, which is indicated by a bit in the Program Status Word (PSW) of the S/370 hardware. (Instructions are classified as privileged for system integrity purposes. Improper usage of such instructions by application programs, if they were not so classified, could cause profound system damage.)
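The role of the PSW state bit can be sketched as follows. This is an illustration only: the problem-state bit is assumed here to be PSW bit 15, counting the most significant bit as bit 0, and the referenced Principles of Operation manual remains the authority on the PSW format. The names `psw_t`, `PSW_PROBLEM_STATE` and `executes_directly` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t psw_t;   /* the PSW is treated as a 64-bit value */

/* Assumption: bit 15 (most significant bit numbered 0) is the
 * problem-state bit; 1 = problem state, 0 = supervisor state. */
#define PSW_PROBLEM_STATE ((psw_t)1 << (63 - 15))

/* Returns 1 when an instruction would execute directly under the given
 * PSW, and 0 when it would be intercepted instead, as happens to a
 * privileged instruction issued in problem state. */
int executes_directly(psw_t psw, int opcode_is_privileged)
{
    assert(opcode_is_privileged == 0 || opcode_is_privileged == 1);
    if (!opcode_is_privileged)
        return 1;                             /* ordinary instruction */
    return (psw & PSW_PROBLEM_STATE) == 0;    /* supervisor state only */
}
```

In this sketch a guest running in problem state never executes a privileged instruction directly, which is the condition that forces the host to intervene.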
Reference is made to the books entitled "IBM System/370 Principles of Operation", Manual No. GA22-7000-9 and "Virtual Machine/System Product System Programmer's Guide", Manual No. SC24-5272-1, for descriptions of the terms used herein and their operation. These Manuals are available through an IBM Branch Office.
Examples of the S/370 privileged instructions in the linkage path are LCTL, STCTL, LRA, and STPT. When these instructions are executed in a GOS environment, the performance and scheduling impacts can be substantial. Because the GOS is executing in problem state, each privileged instruction executed by the GOS creates a hardware interrupt to the VM HOS, which must simulate the privileged instruction for the GOS. Between the simulation itself and the scheduling that follows the interrupt, this can cause many thousands of instructions to be executed. In the case of the prior art IX/370, these eleven privileged instructions are located in a critical section of system code that is executed quite often.
Addressing these prior art problem areas more specifically with respect to IX/370, the IX/370 user programs request services from the kernel through IX/370 system calls which cause a state change from the user state to the kernel state. This is done by executing a supervisor call (SVC), specifically SVC 10. The frequency of these requests is typically quite high. IX/370, like other UNIX systems, accounts for the time that a process is executing in the kernel state and in the user state. Because this time accounting is required by the basic design of the UNIX operating system itself, it is not possible to reflect the user-to-kernel SVC from the user directly to the IX/370 kernel. The control program (CP) routine in the HOS which processes the SVCs issued by the user program, termed the SVC handler, must do the necessary timer calculations and store the different times so that the timer and accounting functions of IX/370 are maintained. The path length between the SVC issued by the process for kernel services and the actual routine to perform these services is approximately 150 instructions. Moreover, the prior art return, i.e., from the kernel to the IX/370 user, is done via SVC 76 and is also accomplished in approximately 150 instructions.
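The per-state time accounting that the SVC handler must perform can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the IX/370 code: the structure `proc_times`, the functions `proc_times_init` and `switch_state`, and the use of a plain integer clock reading are all hypothetical.

```c
#include <assert.h>

enum proc_state { USER_STATE, KERNEL_STATE };

/* Hypothetical per-process accounting record. */
struct proc_times {
    enum proc_state state;      /* state the process is currently in */
    unsigned long last_switch;  /* clock reading at the last transition */
    unsigned long user_time;    /* accumulated time in the user state */
    unsigned long kernel_time;  /* accumulated time in the kernel state */
};

/* Start accounting for a process that begins in the user state. */
void proc_times_init(struct proc_times *p, unsigned long now)
{
    p->state = USER_STATE;
    p->last_switch = now;
    p->user_time = 0;
    p->kernel_time = 0;
}

/* Charge the time elapsed since the last transition to the state being
 * left, then enter the next state. Called on every state transition. */
void switch_state(struct proc_times *p, enum proc_state next,
                  unsigned long now)
{
    unsigned long elapsed;

    assert(now >= p->last_switch);   /* the clock must not run backwards */
    elapsed = now - p->last_switch;
    if (p->state == USER_STATE)
        p->user_time += elapsed;
    else
        p->kernel_time += elapsed;
    p->state = next;
    p->last_switch = now;
}
```

In terms of the text, `switch_state(&p, KERNEL_STATE, now)` corresponds to the work done on SVC 10 and `switch_state(&p, USER_STATE, now)` to the work done on SVC 76. Because the bookkeeping must run on every transition, the SVC cannot simply be reflected to the kernel without it.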
The performance problem when running under VM/SP comes from the additional path length overhead introduced by the privileged operations (PRIV OP) in these linkage paths. FIG. 1 illustrates the prior art user-to-kernel linkage. An SVC 10 call from the GOS (IX/370) user requesting kernel services causes an interruption condition, which results in a change in the program status word (PSW) in System/370. The current PSW is stored at the SVC old-PSW location, and a new PSW is fetched from the SVC new-PSW location, becoming the current PSW. Control is then passed via a Load Program Status Word (LPSW) instruction to the GOS kernel. In this process, the GOS resident supervisor executes state transition code to perform timer calculations and the state change, involving around 150 instructions as discussed previously. Among these instructions are privileged instructions (illustrated in FIG. 1 as PRV OP), and each such privileged instruction is intercepted and simulated on behalf of the GOS by the HOS. Each privileged instruction simulation requires many instructions to be executed and causes the process to be redispatched by the HOS to the GOS resident supervisor. Each such redispatching step itself takes many instructions, both in performing the privileged operation code simulation and in the GOS scheduler contained in the HOS control program code.
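The PSW exchange described above can be modelled with a short sketch in C. It treats the PSW as an opaque 64-bit value and omits everything else the hardware does on an interruption, such as recording the SVC number. The structure `cpu` and the functions `svc_interrupt` and `lpsw` are hypothetical names for illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t psw_t;   /* PSW treated as an opaque 64-bit value */

/* Hypothetical model of the three PSW slots involved in an SVC. */
struct cpu {
    psw_t current_psw;   /* PSW controlling execution now */
    psw_t svc_old_psw;   /* SVC old-PSW location */
    psw_t svc_new_psw;   /* SVC new-PSW location */
};

/* SVC interruption: save the current PSW at the old-PSW location and
 * make the PSW at the new-PSW location the current one. */
void svc_interrupt(struct cpu *c)
{
    assert(c);
    c->svc_old_psw = c->current_psw;
    c->current_psw = c->svc_new_psw;
}

/* Load Program Status Word: replace the current PSW, which passes
 * control to whatever the loaded PSW designates. */
void lpsw(struct cpu *c, psw_t psw)
{
    assert(c);
    c->current_psw = psw;
}
```

In this model the resident supervisor gains control through `svc_interrupt` and later hands control to the kernel through `lpsw`. The saved old PSW remains available for the eventual return to the user.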
The return from the GOS kernel to the GOS user, via SVC 76, involves the same process, as illustrated in FIG. 2. (SVC 76 is also used in the prior art for kernel-to-kernel state changes as well as kernel-to-user state changes.)
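The effect of simulation on path length can be estimated with a rough model, sketched below in C. The figures of approximately 150 instructions per linkage path and eleven privileged instructions in the user-to-kernel path come from the text. The cost of simulating one privileged instruction is left as a parameter, because the text gives no exact figure, and the number of privileged instructions in the return path is likewise a parameter because the text does not state it. The function names are hypothetical.

```c
#include <assert.h>

/* Estimated instructions executed for one linkage path when every
 * privileged instruction is simulated by the host.
 *   path_length  - instructions in the path (about 150 in the text)
 *   priv_ops     - privileged instructions among them
 *   sim_cost     - assumed host instructions per simulated instruction
 */
unsigned long transition_cost(unsigned long path_length,
                              unsigned long priv_ops,
                              unsigned long sim_cost)
{
    assert(priv_ops <= path_length);
    /* Non-privileged instructions run natively; each privileged one is
     * replaced by sim_cost host instructions. */
    return (path_length - priv_ops) + priv_ops * sim_cost;
}

/* Cost of a user-to-kernel transition followed by the return. */
unsigned long round_trip_cost(unsigned long path_length,
                              unsigned long priv_ops_in,
                              unsigned long priv_ops_out,
                              unsigned long sim_cost)
{
    return transition_cost(path_length, priv_ops_in, sim_cost)
         + transition_cost(path_length, priv_ops_out, sim_cost);
}
```

With a purely illustrative simulation cost of 1000 host instructions, `transition_cost(150, 11, 1000)` yields 11139 instructions for a path that is nominally 150 instructions long. The overhead is therefore dominated by the privileged instructions rather than by the length of the path itself.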
The resident supervisor in IX/370 illustrated in FIGS. 1 and 2 is the IBM TSS/370 resident supervisor described in the above-referenced article by Felton et al. Further description of the operation and function of TSS/370 is found in the books entitled "Time Sharing System, System Logic Summary" Manual No. GY28-2009 and "IBM Time Sharing System - System Programmer's Guide", Manual No. GC28-2008. These Manuals are available through an IBM Branch Office.