An operating system is a computer program that is typically among the first pieces of software a computing device executes when it is turned on. Operating systems are no longer limited to mainframe and desktop computers: mobile telephones and various other portable devices now use operating systems to enable and manage the functions of their host device, so the term computer system is not limited to a traditional mainframe or desktop device. In certain networked arrangements, one operating system may manage, and be accessed by, multiple user stations simultaneously; such an arrangement is considered to fall within the definition of a single computer system. The operating system loads itself into memory and begins managing the resources available on the computer. It then provides those resources to other applications that the user wants to execute. Typical services that an operating system provides include a task scheduler, a memory manager, a disk manager, a network manager, managers for other I/O services, and a security manager. These services are exemplary only. The core operating system functions, namely the management of the computer system, lie in what is termed the kernel of the operating system in a traditional computer architecture. In a micro-kernel architecture, core operating system functionality can lie on system servers, outside the kernel. The kernel is often equated with the way the operating system is presented to the user of a device, but in fact the kernel lies below the display manager, although it is often tightly tied to it.
At the simplest level, an operating system manages the hardware and software resources of the device or system (e.g., processor, memory, disk space, etc.), and it provides a stable, consistent way for applications to deal with the hardware without having to know all the hardware details. The first task, managing the hardware and software resources, is very important, as various programs and input methods compete for the attention of the central processing unit (CPU) and demand memory, storage and input/output (I/O) bandwidth for their own purposes. In this capacity, the operating system ensures that each application gets the necessary resources and ensures proper interfacing between applications, as well as husbanding the limited capacity of the system for maximum usage by the various users and applications. The second task, providing a consistent application interface, is especially important if there is to be more than one of a particular type of computer using the operating system (e.g., a single operating system for computer systems made by different manufacturers), or if the computer system hardware is changed or updated. A consistent application program interface (API) allows developers of application software to write code on one computer system and have a high level of confidence that it will run on another computer system using the same operating system, even if the amount of memory, the quantity of storage, or even the computing architecture is different on the two computer systems.
As computing infrastructure becomes more widespread, updates to various software programs have become more common. These are generically termed patches, and there have been an increasing number of patches for functionality, performance, and especially security reasons. To take effect, these patches traditionally require either restarting system services, or often rebooting the operating system, resulting in downtime. Sometimes this downtime can be scheduled, if for example the patch adds a feature, improves performance, etc. In other situations such as applying a security patch, delaying the update is not desirable. Users and system administrators are forced to trade off the increased vulnerability of a security flaw against the cost of unplanned downtime. Dynamic update is used to avoid such downtime, and involves on-the-fly application of software updates to a running system without loss of service.
In addition to the above-mentioned impact on availability, dynamically updatable operating systems have other benefits. Such operating systems provide a good prototyping environment. They allow, for example, a new page replacement, file system, or network policy to be tested without rebooting. Further, in more mature computer systems such as mainframes, some user constraints prevent the operating system from ever being shut down. In such an environment, users can only get new functionality into the operating system by performing a dynamic update.
An operating system is a unique environment with special constraints as compared to an application, and additional challenges must be solved to provide dynamic update functionality. An important piece of prior art is described in U.S. Patent Application Publication No. 2005/0071811 A1 to Appavoo et al. (hereinafter, the Appavoo publication). The Appavoo publication discloses how to swap an individual object instance and discusses other prior art references detailed below. The Appavoo publication is hereby incorporated by reference, and it is noted that each inventor of the Appavoo publication is an inventor of this application.
One reference detailed in the Appavoo publication is entitled: “Dynamic C++ Classes: A Lightweight Mechanism to Update Code in a Running Program,” by Gisli Hjalmtysson and Robert Gray (Annual USENIX Technical Conference, June 1998, pp. 65-76, USENIX Association). The Hjalmtysson and Gray reference describes a mechanism for updating C++ objects in a running program, but, in the disclosed system, client objects need to be able to recover from broken bindings due to an object swap and retry the operation, so the mechanism is not transparent to client objects. Moreover, the Hjalmtysson and Gray approach does not detect a quiescent state, and old objects continue to service prior calls while the new object begins to service new calls.
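For illustration only, the notion of waiting for a quiescent state before swapping an object can be sketched as follows. This is a deliberately simplified sketch and is not the mechanism of the Hjalmtysson and Gray reference, of the Appavoo publication, or of this application. The names SwappableRef, CounterV1 and CounterV2 are hypothetical, and the sketch assumes that a call made through the reference never re-enters the same reference while a swap is pending.

```python
import threading


class SwappableRef:
    """Hypothetical indirection that swaps its target only when no call is in flight."""

    def __init__(self, target):
        self._target = target
        self._in_flight = 0          # calls currently executing in the target
        self._swapping = False       # True while a swap is waiting or running
        self._cond = threading.Condition()

    def call(self, method, *args):
        with self._cond:
            # New calls are held back while a swap is pending.
            while self._swapping:
                self._cond.wait()
            target = self._target
            self._in_flight += 1
        try:
            return getattr(target, method)(*args)
        finally:
            with self._cond:
                self._in_flight -= 1
                self._cond.notify_all()

    def swap(self, make_new):
        with self._cond:
            # Allow only one swap at a time.
            while self._swapping:
                self._cond.wait()
            self._swapping = True
            # Quiescence: wait until every earlier call has returned.
            while self._in_flight > 0:
                self._cond.wait()
            # State is transferred from the old object to the new one.
            self._target = make_new(self._target)
            self._swapping = False
            self._cond.notify_all()


class CounterV1:
    def __init__(self):
        self.count = 0

    def increment(self):
        self.count += 1
        return self.count


class CounterV2:
    """Replacement version; built from the old object so its state carries over."""

    def __init__(self, old):
        self.count = old.count

    def increment(self):
        self.count += 2
        return self.count


ref = SwappableRef(CounterV1())
first = ref.call("increment")    # served by CounterV1
ref.swap(CounterV2)              # no call is in flight, so the swap proceeds
second = ref.call("increment")   # served by CounterV2 with the transferred count
```

Because callers hold only the reference, they need not recover from a broken binding, and the old object is never left servicing earlier calls while the new object services later ones. Blocking new calls behind a single lock, as done here for brevity, would not be expected to scale on a multiprocessor system.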
Another reference detailed in the Appavoo publication is entitled: “Optimistic Incremental Specialization: Streamlining a Commercial Operating System,” by Calton Pu, Tito Autrey, Andrew Black, Charles Consel, Crispin Cowan, Jon Inouye, Lakshmi Kethana, Jonathan Walpole and Ke Zhang (ACM Symposium on Operating System Principles, Copper Mountain Resort, Colo., Dec. 3-6, 1995, Operating Systems Review, vol 29, no 5). This reference describes a replugging mechanism for incremental and optimistic specialization, but the reference assumes there can be at most one thread executing in a swappable module at a time. In later work, that constraint is relaxed, but the resulting approach does not scale.
As mentioned, the Appavoo publication and the works discussed therein describe how to hot-swap an individual object. For a true dynamic upgrade, all objects of a given class need to be swapped.
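The difference between swapping one object and swapping every object of a class can be sketched as follows. This is a minimal single-threaded sketch under stated assumptions, not the approach of the Appavoo publication or of this application. The names Handle, TrackingFactory, PolicyV1 and PolicyV2 are hypothetical. The sketch assumes that every instance is created through a factory that records it, and it omits quiescence detection entirely.

```python
class Handle:
    """Hypothetical indirection that clients hold instead of the object itself."""

    def __init__(self, target):
        self.target = target

    def __getattr__(self, name):
        # Reached only for names not found on the Handle, so method
        # lookups are forwarded to whichever object is current.
        return getattr(self.target, name)


class TrackingFactory:
    """Hypothetical factory that records every instance it creates."""

    def __init__(self, cls):
        self._cls = cls
        self._handles = []

    def create(self, *args, **kwargs):
        handle = Handle(self._cls(*args, **kwargs))
        self._handles.append(handle)
        return handle

    def upgrade(self, new_cls, transfer):
        # Later instances are built from the new class ...
        self._cls = new_cls
        # ... and every tracked instance is replaced in turn.
        for handle in self._handles:
            handle.target = transfer(handle.target)
        return len(self._handles)


class PolicyV1:
    def __init__(self, limit):
        self.limit = limit

    def describe(self):
        return f"v1 limit={self.limit}"


class PolicyV2:
    def __init__(self, limit):
        self.limit = limit

    def describe(self):
        return f"v2 limit={self.limit}"


factory = TrackingFactory(PolicyV1)
a = factory.create(10)
b = factory.create(20)
swapped = factory.upgrade(PolicyV2, lambda old: PolicyV2(old.limit))
c = factory.create(30)           # created after the upgrade
descriptions = [a.describe(), b.describe(), c.describe()]
```

Swapping a single object would leave the other existing instance, and every instance created afterwards, on the old class. Tracking instances at creation time is one way to reach all of them. The list of strong references used here keeps every object alive indefinitely, so a practical design would also need to remove destroyed objects from the list.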
Another reference, entitled “Mutatis Mutandis: Safe and Predictable Dynamic Software Updating” by G. Stoyle, M. Hicks, G. Bierman, P. Sewell and I. Neamtiu (POPL '05, Jan. 12-14, 2005, Long Beach, Calif.) describes a formal model for dynamic update in C-like languages using pre-computed safe update points present in the code.
Some commercial operating systems offer features similar to Sun® Microsystems' Solaris Live Upgrade, which allows changes to be made and tested without affecting the running system, but requires a reboot for changes to take effect. Other approaches are limited to performing an upgrade on single-threaded user-space applications.
What is needed is a dynamic upgrade approach that is scalable for both upgraded objects and new objects, for both single CPU computer systems and those with hypervisors and multiple instances of operating systems, that has the capability to track objects and to dynamically upgrade all objects of a particular type in a running operating system, without the need to shut down or reboot that operating system.