Operating system platforms have enabled the rapid growth of the technologies developed on them. Besides running many different applications, many of these platforms have also become much easier to extend with components such as hardware devices and their associated drivers. For instance, some systems allow hardware or software components to be installed on the platform such that the components can, in essence, be plugged into the system with a high degree of confidence that they will cooperate with the system and with other devices and components that were previously installed. Such technology is commonly referred to as Plug and Play technology, which enables devices or components to be easily integrated within an existing system.
Plug and Play technology generally refers to a computer system automatically recognizing new devices and determining what driver software, resource settings, and so forth each device needs, with little or no interaction from the user. This technology also typically loads a driver only when it is needed, that is, when the corresponding hardware is detected as present. A driver is a software component that resides between the operating system and the hardware and allows the operating system to communicate with the hardware. In some operating systems, “drivers” are software modules that can be inserted into the operating system kernel to support specific hardware, to extend the operating system, or both. Generally, drivers run in a fully trusted mode, whereby any failure in these components can cause machine services to fail or the entire system to crash. Thus, any successful effort to make drivers more resilient or fault tolerant usually increases system reliability and, consequently, customer satisfaction.
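The load-on-detection behavior described above can be illustrated with a minimal sketch. It is not drawn from any particular operating system: the hardware IDs, driver names, `DRIVER_TABLE`, and `DriverManager` are all hypothetical, and a real system would do far more (resource assignment, driver ranking, and so forth).

```python
from dataclasses import dataclass, field

# Hypothetical mapping from hardware IDs to driver names (illustrative only).
DRIVER_TABLE = {
    "usb:1234": "example_usb_storage",
    "pci:abcd": "example_network",
}


@dataclass
class DriverManager:
    """Toy model of a system that loads a driver only for present hardware."""

    loaded: dict = field(default_factory=dict)  # hardware ID -> driver name

    def on_device_detected(self, hardware_id):
        """Load the matching driver when a device is detected as present."""
        if hardware_id in self.loaded:
            return self.loaded[hardware_id]  # already loaded, nothing to do
        driver = DRIVER_TABLE.get(hardware_id)
        if driver is None:
            return None  # no known driver for this device
        self.loaded[hardware_id] = driver
        return driver

    def on_device_removed(self, hardware_id):
        """Unload the driver once its hardware is no longer present."""
        return self.loaded.pop(hardware_id, None)
```

In this sketch nothing is loaded until `on_device_detected` is called, which mirrors the idea that a driver is loaded only because its hardware is currently present.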
One of the barriers to greater driver resilience is that a driver typically has to respond to many “events” generated by the operating system, and responding may require the driver to initiate operations that can fail. For example, these events may include file handle creation, device insertion, power being turned off, statistics gathering, and so forth. Most of the time, the exact action that a driver should take in response to an internal failure is poorly defined. This is partly because the operating system is not always designed to handle every conceivable set of failures, partly because external documentation does not cover every situation, and partly because certain failures call for a large amount of judgment on the part of the driver designer. Furthermore, drivers are often constructed internally as large “state machines” wherein the response to an event can depend largely on which events have occurred in the past. After a failure occurs, the driver often has to immediately turn around and handle new events, even though the failure probably implies that new events are likely to fail as well.
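The state-machine structure and the problem of handling events after a failure can be sketched as follows. This is an illustrative toy rather than a real driver model: the states, the `ToyDriver` class, the `start_hardware` callback, and the policy of rejecting events once a failure has been recorded are all assumptions made for the example.

```python
from enum import Enum, auto


class State(Enum):
    STOPPED = auto()
    STARTED = auto()
    FAILED = auto()


class Event(Enum):
    DEVICE_INSERTED = auto()
    FILE_HANDLE_CREATED = auto()
    POWER_OFF = auto()


class ToyDriver:
    """Hypothetical driver whose response to an event depends on past events."""

    def __init__(self, start_hardware):
        self.state = State.STOPPED
        self.open_handles = 0
        self._start_hardware = start_hardware  # operation that may fail

    def handle(self, event):
        """Return True if the event was handled, False if it was rejected."""
        if event is Event.POWER_OFF:
            # Assumed policy: power-off is accepted in every state and resets.
            self.state = State.STOPPED
            self.open_handles = 0
            return True
        if self.state is State.FAILED:
            # An earlier failure implies that new work would likely fail too.
            return False
        if event is Event.DEVICE_INSERTED and self.state is State.STOPPED:
            try:
                self._start_hardware()
            except OSError:
                self.state = State.FAILED  # remember the failure
                return False
            self.state = State.STARTED
            return True
        if event is Event.FILE_HANDLE_CREATED and self.state is State.STARTED:
            self.open_handles += 1
            return True
        return False  # event is not valid in the current state
```

Even in this small example the designer must decide, for each state, what every event should do once a failure has been recorded. That is the kind of judgment call the paragraph above describes as poorly defined.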
In many situations, when bugs are discovered in drivers and their associated operating system components (or when new features are added), software revisions are routinely sent out over time to correct such problems. In many cases, the revisions can be downloaded over the Internet with a few “mouse clicks.” Currently, drivers and operating system components exist in a tightly coupled relationship. In other words, if a bug were discovered in the operating system components through no fault of the driver, and a revision of the operating system were required, then in many cases both the operating system and the driver would need to be updated. This type of arrangement, however, is highly inefficient. Drivers that are perfectly operational and bug free should not necessarily have to be upgraded merely because of problems detected in other portions of the system. Conversely, if a particular driver problem is discovered, new revisions propagated to the driver should have minimal impact on the rest of the system.