Copy-on-write (referred to herein as “COW”) is a technique that is used in computer programming, which enables a point-in-time logical view of data. For example, with COW, multiple processes can share data (e.g., an image, snapshot, etc.) which is stored in memory or disk storage, wherein each process can read and utilize a local copy of data. When a given process needs to modify the data, however, then a separate local copy of that data will be generated on which the process can perform modifications, while the original shared copy of data remains unchanged. As such, COW is a policy that whenever a process attempts to modify shared data, the process will first create a separate (private) copy of that information to prevent the changes from becoming visible to the other processes.
COW is widely used in various applications such as Docker image backing stores, and applications that make use of VM (virtual machine) snapshots (e.g., copy of a virtual machine's disk file a given point in time), and array snapshots. In general, there are two types of COW implementations—one implementation which copies original data to a new location and another implementation which does not copy original data to a new location. Irrespective of the implementation, COW essentially divides a dataset into layers (e.g., deltas or snapshots) for storage efficiency, as each layer only maintains a portion of the data (e.g., the modified portion of the data). On the other hand, COW creates a cross-layer data implicit dependency such that if data is not found on a local layer, a query is made to a parent layer for the data, and this traversal process continues through the layers until the target data is found. This cross-layer dependency and traversal mechanism can lead to degraded performance resulting from, e.g., traversal lookup overhead, disk I/O amplification, and/or data memory amplification.