A distributed storage system has multiple servers. Each server may be a single computer, part of a computer (e.g., a partition on the computer's attached storage), several computers cooperating, or some combination of these (e.g., a collection of three distinct partitions on three distinct computers might constitute a single server).
Data centers and web services, such as mail services, commonly use distributed storage systems. A large number of data items, such as data pertaining to users of a data center or a mail service, may be stored on the servers or other storage nodes of such a system. Users and their associated data should be allocated across the storage system so as to optimize performance, maximize capacity, ensure reliability (through replication), and in general satisfy the policies of the data center. This is difficult for several reasons. First, the disks used for storage are distributed and heterogeneous in capacity and performance characteristics. Second, users differ in their behavior and usage of the service, such as likely hours of operation and capacity utilization. Finally, users may be added to or removed from a data center or web service, and existing users may change their patterns of utilization, so an allocation that is optimal at one time may not remain so. Other allocation requirements, such as placing users' data on storage located near those users, are similarly difficult to satisfy.
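To make the allocation problem concrete, the following is a minimal illustrative sketch, not a method taken from this document. It assumes a simple greedy policy: each user's data is replicated onto the servers that currently have the largest fraction of free capacity. The names `Server`, `place_user`, `capacity_gb`, and `replicas` are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    """Hypothetical storage server with a fixed capacity (assumption)."""
    name: str
    capacity_gb: float
    used_gb: float = 0.0
    users: set = field(default_factory=set)

    def free_gb(self) -> float:
        return self.capacity_gb - self.used_gb

    def free_fraction(self) -> float:
        return self.free_gb() / self.capacity_gb


def place_user(servers, user, size_gb, replicas=2):
    """Place `replicas` copies of a user's data on distinct servers.

    Greedy policy (an assumption, not the document's method): pick the
    servers with the largest free fraction that can still hold the data.
    """
    candidates = [s for s in servers if s.free_gb() >= size_gb]
    if len(candidates) < replicas:
        raise ValueError("not enough servers with sufficient free space")
    # Largest free fraction first; ties broken by name so results are stable.
    candidates.sort(key=lambda s: (-s.free_fraction(), s.name))
    chosen = candidates[:replicas]
    for s in chosen:
        s.used_gb += size_gb
        s.users.add(user)
    return [s.name for s in chosen]
```

A static policy like this one accounts for heterogeneous capacity and replication, but not for the other difficulties noted above: it ignores differences in user behavior, never rebalances when utilization changes, and has no notion of placing data near its users.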