Large-scale networked systems are commonly employed in a variety of settings to run applications and maintain data for business and operational functions. For instance, a datacenter may provide a variety of web applications (e.g., email services, search engine services, etc.). A large-scale networked system includes a large number of server nodes, each of which is a physical machine or a virtual machine running on a physical host. Due partly to the large number of server nodes that may be included within such systems, deploying software (both operating systems (OSs) and applications) to the various nodes and maintaining the software on each node can be a time-consuming and costly process. In particular, software is traditionally installed and upgraded locally on each node, such that installations and updates are specific to the individual nodes. A number of failures can occur that are detected only during an on-line provisioning and/or update process. Additionally, “bit rot” can occur when a machine is serially upgraded and patched locally. Bit rot refers to changes to a local software state (e.g., OS configuration state) that may occur due to human or software errors. When this state changes, the behavior of the node may become unpredictable.