Information technology (IT) environments such as large engineering design systems, complex scientific applications, and multinational enterprises all require sharing massive amounts of file data in a consistent, efficient, and reliable manner across a wide-area network (WAN). To serve file data across a WAN, a data storage system needs to scale in both capacity and access bandwidth to support a large number of clients, and it needs to mask the latency and intermittent connectivity of WAN access. While large clustered file systems can scale to petabytes of storage and hundreds of GB/s of access bandwidth, such clustered file systems cannot mask the latency and fluctuating performance across a WAN. Replicating data closer to the client or the point of computation is one way to reduce WAN access times. However, replication is undesirable when data sets are large and access patterns are not known a priori.