This invention relates to operating system software and, more particularly, to a method and apparatus for increasing the modularity of the operating system without substantially decreasing the efficiency or reliability of the data processing system.
This application is related to an application entitled "Apparatus and Method for Efficient Transfer of Data and Events Between Processes and Between Processes and Drivers in a Parallel, Fault Tolerant, Message Based Operating System," of Fishler and Clark (U.S. application Ser. No. 08/377,303), filed concurrently with this application, and which is herein incorporated by reference.
This application is filed with three Appendices, which are a part of the specification and are herein incorporated by reference. The Appendices are:
Appendix A: Descriptions of QIO library routines for a shared memory queueing system.
Appendix B: A description of socket calls supported in a preferred embodiment of the invention.
Appendix C: A list of QIO events occurring in a preferred embodiment of the present invention.
Conventional multiprocessor computers and massively parallel processing (MPP) computers include multiple CPUs, executing the same instructions or executing different instructions. In certain situations, data passed between the processors is copied when it is passed from one processor to another. In conventional fault tolerant computers, for example, data is backed up and checkpointed between the CPUs in furtherance of the goals of fault tolerance, linear expandability, and massive parallelism. Thus, in fault tolerant computers, data is duplicated between CPUs and if one CPU fails, processing can be continued on another CPU with minimal (or no) loss of data. Such duplication of data at the processor level is highly desirable when used to ensure the robustness of the system. Duplication of data, however, can also slow system performance.
In some conventional systems, data is transferred between software processes by a message system in which data is physically copied from one process and sent to the other process. This other process can either be executing on the same CPU or on a different CPU. The messaging system physically copies each message and sends each message one at a time to the receiving process.
When the copied data is used for purposes of checkpointing between processors, for example, it is desirable that the data be physically copied. At other times, however, the data is merely passed between processes to enable the processes to communicate with each other. In this case, there is no need to physically copy the data when the processes reside in the same CPU. At such times, it may take more time to copy and transmit the data between processes than it takes for the receiving process to actually process the data. Thus, when data is transferred between processes executing on the same CPU, copying the data is inefficient.
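The distinction drawn above, between physically copying a message and merely handing it over, can be illustrated with a minimal sketch. It is not the QIO library of Appendix A. The class and function names (CopyingChannel, SharedQueueChannel, demo) are hypothetical, and objects inside a single Python interpreter stand in, by assumption, for processes residing in the same CPU.

```python
from collections import deque


class CopyingChannel:
    """Hypothetical model of a message system that duplicates each payload."""

    def __init__(self):
        self._pending = deque()

    def send(self, payload: bytearray) -> None:
        # The payload is physically copied on send, so the sender and the
        # receiver never hold the same buffer.
        self._pending.append(bytearray(payload))

    def receive(self) -> bytearray:
        return self._pending.popleft()


class SharedQueueChannel:
    """Hypothetical model of a queue that hands over a reference only."""

    def __init__(self):
        self._pending = deque()

    def send(self, payload: bytearray) -> None:
        # Only a reference to the buffer is enqueued; no bytes are duplicated.
        self._pending.append(payload)

    def receive(self) -> bytearray:
        return self._pending.popleft()


def demo():
    """Send one message through each channel and compare what arrives."""
    message = bytearray(b"request payload")

    copying = CopyingChannel()
    copying.send(message)
    copied = copying.receive()

    shared = SharedQueueChannel()
    shared.send(message)
    handed_over = shared.receive()

    # (copy is the same object?, handed-over is the same object?, contents equal?)
    return copied is message, handed_over is message, copied == message
```

In this sketch the copying channel does work proportional to the message size on every send, while the shared queue does a constant amount of work regardless of size. The copy remains the right choice where an independent duplicate is the point, as in checkpointing.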
Traditionally, fault-tolerant computers have not allowed processes or CPUs to share memory under any circumstances. Memory shared between CPUs tends to be a "bottleneck" since one CPU may need to wait for another CPU to finish accessing the memory. In addition, if memory is shared between CPUs, and if one CPU fails, the other CPU cannot be assured of a non-corrupt memory space. Thus, conventionally, messages have been copied between processes in order to enforce strict data integrity at the process level.
On the other hand, passing data between processes by duplicating the data is time-consuming. To improve execution time, programmers tend to write larger processes that incorporate several functions, instead of breaking these functions up into a larger number of smaller processes. By writing fewer, larger processes, programmers avoid the time delays caused by copying data between processes. Large processes, however, are more difficult to write and maintain than smaller processes. What is needed is an alternate mechanism for passing data between processes in certain circumstances where duplication of data takes more time than the processing to be performed and where duplication of data is not critical for purposes of ensuring fault tolerance.