In a distributed file storage system, file data can be copied or replicated on multiple servers in order to make that data available even in the event of one or more server failures. It is important that each client that wishes to access that data knows where the latest version of that data can be found.
However, the location of each copy of the file is dynamic and thus can change over time. For example, a distributed file storage system is bound to experience failures or maintenance issues that result in one or more servers being taken offline. Generally speaking, there will be occasions in which a server is inaccessible to clients.
In the event a server becomes unavailable, the file storage system will invoke a failover protocol that results in the data that was on that server being replicated on a different server. For example, if a file on server A is replicated on server B, and server B becomes unavailable for some reason, the file on server A is replicated again, this time on server C (alternatively, the file on server B may be migrated to server C). In either case, the location of the replica has changed.
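The failover example above can be illustrated with a minimal sketch. It is not a description of any particular system's protocol: the function name `fail_over`, the `spares` list, and the choice of the first spare as the target are all assumptions made for illustration, and the actual copying of data is omitted.

```python
def fail_over(replicas, failed, spares):
    """Return a new replica list after `failed` becomes unavailable.

    Hypothetical helper: `replicas` is the current list of servers holding
    the file, `spares` is a list of servers that could host a new replica.
    """
    survivors = [s for s in replicas if s != failed]
    if not survivors:
        raise RuntimeError("no surviving replica to copy from")
    if not spares:
        raise RuntimeError("no spare server available")
    # A real system would copy the data from a survivor to the spare here;
    # this sketch only tracks where the replicas end up.
    return survivors + [spares[0]]


# The file is on servers A and B; B fails and C is available as a spare.
new_locations = fail_over(["A", "B"], failed="B", spares=["C"])
print(new_locations)
```

The point of the sketch is only that the returned list differs from the original one, so any list of replica locations held elsewhere is now out of date.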
Thus, the lists of servers that store the replicas must be updated as new replicas are created or as the locations of the replicas are changed. Conventional approaches for updating such lists can be problematic. For instance, a brute force search can be performed in which all servers in the system are contacted and polled. However, such an approach is inefficient because every server must be contacted even though only a few of them store a replica of a given file.
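To make the cost of the brute force search concrete, the following sketch polls every server in a simulated system and counts the polls. The dictionary of servers, the file name `report.txt`, and the function `brute_force_locate` are hypothetical and stand in for real network requests.

```python
def brute_force_locate(file_id, servers):
    """Poll every server and return (holders, number_of_polls).

    `servers` maps a server name to the set of file identifiers it stores.
    Each loop iteration stands in for one network request.
    """
    holders = []
    polls = 0
    for name, files in servers.items():
        polls += 1
        if file_id in files:
            holders.append(name)
    return holders, polls


# Assumed setup: 1000 servers, only two of which hold the file.
servers = {f"S{i}": set() for i in range(1000)}
servers["S3"].add("report.txt")
servers["S7"].add("report.txt")

holders, polls = brute_force_locate("report.txt", servers)
print(holders, polls)
```

In this setup, two replicas are found at the cost of one poll per server in the system, and the number of polls grows with the size of the system rather than with the number of replicas.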
Also, conventional approaches for maintaining updated copies of such lists can be problematic. For instance, it is necessary to make sure that each client has an up-to-date list for the files that client wishes to access. A client with an out-of-date or stale list may access invalid data or may not be able to locate valid data.
Clients can also go offline at unpredictable times for unpredictable intervals, exacerbating the problem of keeping their server lists up to date. For example, clients 1 and 2 may both access a file that is replicated on servers A and B. On Monday, client 1 may be taken offline; on Tuesday, servers A and B may be taken offline and replaced with servers C and D. Client 2 can be updated with the new location information, but the server list on client 1 will be stale. On Wednesday, client 2 may be offline, and client 1 may return to service. Client 1 then cannot retrieve the updated server list from client 2 but still needs to obtain an updated list in some manner. Many similar types of scenarios are possible.
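The Monday-to-Wednesday scenario can be traced with a small simulation. This is a sketch under stated assumptions: the `Client` class and the `push_update` function are hypothetical, and updates are assumed to reach only those clients that are online when the update is sent.

```python
class Client:
    """Hypothetical client holding a cached list of replica servers."""

    def __init__(self, name, server_list):
        self.name = name
        self.online = True
        self.server_list = list(server_list)


def push_update(clients, new_list):
    """Deliver new_list to online clients only; return the names updated."""
    updated = []
    for client in clients:
        if client.online:
            client.server_list = list(new_list)
            updated.append(client.name)
    return updated


client1 = Client("client 1", ["A", "B"])
client2 = Client("client 2", ["A", "B"])

# Monday: client 1 is taken offline.
client1.online = False

# Tuesday: servers A and B are replaced with servers C and D.
live_servers = {"C", "D"}
updated = push_update([client1, client2], ["C", "D"])

# Wednesday: client 2 goes offline and client 1 returns to service.
client2.online = False
client1.online = True

# Client 1's list is stale if none of its servers is still live.
stale = not (set(client1.server_list) & live_servers)
print(updated, client1.server_list, stale)
```

At the end of the trace, client 1 still lists servers A and B, neither of which is in service, and the only client holding the current list is offline.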
In summary, the combination of multiple copies of data (some of those copies valid, and some of those copies invalid or out-of-date), multiple storage locations, changing server availability, and changing client availability makes it difficult to efficiently identify, distribute, and maintain up-to-date location information for the most recent version of each instance of replicated data.