Distributed systems allow multiple clients in a network to access a pool of shared resources. For example, a distributed storage system allows a cluster of host computers to aggregate local storage devices (e.g., SSD, PCI-based flash storage, SATA, or SAS magnetic disks) located in or attached to each host computer to create a single, shared pool of storage. This pool of storage (sometimes referred to herein as a “datastore” or “store”) is accessible by all host computers in the cluster and may be presented as a single namespace of storage entities (such as a hierarchical file system namespace in the case of files, a flat namespace of unique identifiers in the case of objects, etc.). Storage clients, such as virtual machines spawned on the host computers, may in turn use the datastore, for example, to store virtual disks that are accessed by the virtual machines during their operation. Because the shared local storage devices that make up the datastore may have different performance characteristics (e.g., capacity, input/output operations per second (IOPS) capabilities, etc.), usage of such shared local storage devices to store virtual disks or portions thereof may be distributed among the virtual machines based on the needs of each given virtual machine. Accordingly, in some cases, a virtual disk may be partitioned into different data chunks or stripes that may be distributed among, and stored by, the local storage resources of different hosts in the datastore.
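By way of illustration only, the partitioning of a virtual disk into stripes may be sketched as follows. The function name `stripe_virtual_disk`, the host labels, and the round-robin placement are assumptions made for this sketch. An actual system may instead place stripes according to the capacity or IOPS characteristics of each host, as noted above.

```python
from typing import Dict, List


def stripe_virtual_disk(
    data: bytes, hosts: List[str], stripe_size: int
) -> Dict[str, List[bytes]]:
    """Split a virtual disk image into fixed-size stripes and assign
    them to hosts in round-robin order (a hypothetical placement policy)."""
    if stripe_size <= 0:
        raise ValueError("stripe_size must be positive")
    if not hosts:
        raise ValueError("at least one host is required")

    # Each host keeps the stripes assigned to it, in disk order.
    placement: Dict[str, List[bytes]] = {host: [] for host in hosts}
    for index, offset in enumerate(range(0, len(data), stripe_size)):
        host = hosts[index % len(hosts)]
        placement[host].append(data[offset:offset + stripe_size])
    return placement


# Example: a 10-byte "disk" striped in 4-byte units across three hosts.
layout = stripe_virtual_disk(b"abcdefghij", ["host-a", "host-b", "host-c"], 4)
```

In this sketch the final stripe may be shorter than `stripe_size`, and hosts that receive no stripe keep an empty list.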
In some cases, to increase storage efficiency of distributed storage systems, a technology referred to as data deduplication may also be applied. Data deduplication scans data blocks and stores only unique data blocks in the datastore, so that identical blocks are not stored more than once. In addition to data deduplication, in some cases, data compression may be used to reduce the size of data blocks by removing redundant data within the data blocks. Accordingly, in some cases, deduplication and compression may be used together to help remove content-level redundancy and increase storage efficiency. Distributed storage systems with data deduplication and compression support, in some cases, may need to carry out various data movement tasks to meet high-level objectives such as capacity-based load balancing, host or disk decommissioning, etc. However, in a distributed storage system with data deduplication and compression functionalities, moving data blindly from the local storage resources of one host to the local storage resources of another host may negatively affect overall performance of the distributed storage system.
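By way of illustration only, the combination of deduplication and compression described above may be sketched as follows. The class name `DedupStore`, the use of SHA-256 digests to identify unique blocks, and the use of zlib for compression are assumptions made for this sketch and are not taken from any particular system.

```python
import hashlib
import zlib
from typing import Dict


class DedupStore:
    """Hypothetical per-host block store that keeps one compressed copy
    of each unique block, identified by its content digest."""

    def __init__(self) -> None:
        self.blocks: Dict[str, bytes] = {}     # digest -> compressed block
        self.refcounts: Dict[str, int] = {}    # digest -> number of references

    def write(self, block: bytes) -> str:
        # Deduplication: identical content yields the same digest,
        # so the block is stored only on its first write.
        digest = hashlib.sha256(block).hexdigest()
        if digest not in self.blocks:
            # Compression: remove redundancy within the block itself.
            self.blocks[digest] = zlib.compress(block)
            self.refcounts[digest] = 0
        self.refcounts[digest] += 1
        return digest

    def read(self, digest: str) -> bytes:
        return zlib.decompress(self.blocks[digest])


store = DedupStore()
first = store.write(b"A" * 4096)
second = store.write(b"A" * 4096)   # duplicate: no new block is stored
third = store.write(b"B" * 4096)    # unique: stored as a second block
```

In this sketch the deduplication index is local to one store, so a block's duplicate status depends on what that particular store already holds.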