Despite decades of research and development aimed at improving software reliability, and the improvements that have resulted, software defects still account for many system failures. Moreover, methods for dealing with software defects retroactively, such as software patches, can often themselves introduce new or latent defects.
The problem is further exacerbated by the growing popularity of service-oriented computing (SOC) systems, such as online commerce, e-mail, Internet Protocol (IP) telephony, and grid computing, and by the availability requirements that accompany such systems. To update SOC systems to fix existing software defects and vulnerabilities, system administrators must strike a careful balance between bringing a system down to install updates and keeping the system available to process service requests.
Unfortunately, most SOC systems are long-running servers that amass considerable operational state data; therefore, the option of launching another machine to test the effects of software patches is limited.
System administrators can test software patches on a non-production machine and mirror the resulting traffic to the corresponding production system. Because the machines are disjoint, however, this approach has drawbacks. First, comparing two or more mirrored machines requires some level of cross-system synchronization, which in turn adds considerable overhead. Second, handling encrypted traffic would require a proxy, which adds complexity and degrades performance.