In data archival and/or backup environments, there is often a need to store many data objects within an archival/backup system. As the total stored volume increases, the performance of the archival/backup system can decrease markedly.
In some data archival and/or backup systems, such as conventional content addressable storage systems, a volunteer-based lookup system is used. In such a system, a lookup for storage and/or retrieval uses a broadcast message to indicate that a particular file needs storing. The broadcast message is sent to all storage nodes, and a node replies (“volunteers”) to indicate that it will store that file. For restorage (or retrieval), a broadcast message identifies the stored data, and the node having that data replies (“volunteers”) to indicate that it has stored the original data and thus receives the data for restorage (or returns the stored data in a retrieval situation). As the data volume increases, the number of broadcasts increases to the point where the majority of the bandwidth of the system can be used up by the broadcasts, with very little remaining for actual data transfer to or from storage. In some known systems, the slowdown becomes particularly marked once the total number of stored files reaches approximately 50 million.
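By way of illustration only, the volunteer-based lookup described above may be sketched as follows. The sketch is a simplified in-memory model and not the implementation of any particular known system; the names `StorageNode`, `BroadcastLookup` and `messages_sent`, and the rule that a node volunteers whenever it has free space, are assumptions introduced for this example.

```python
from dataclasses import dataclass, field


@dataclass
class StorageNode:
    """Hypothetical storage node holding files keyed by identifier."""
    name: str
    capacity: int
    stored: dict = field(default_factory=dict)

    def volunteers_to_store(self, identifier):
        # Assumed policy: a node volunteers whenever it has free space.
        return len(self.stored) < self.capacity

    def holds(self, identifier):
        return identifier in self.stored


class BroadcastLookup:
    """Models a lookup in which every request is sent to every node."""

    def __init__(self, nodes):
        self.nodes = nodes
        self.messages_sent = 0

    def _broadcast(self, predicate):
        # One message per node, regardless of how many nodes reply.
        self.messages_sent += len(self.nodes)
        return [node for node in self.nodes if predicate(node)]

    def store(self, identifier, data):
        volunteers = self._broadcast(
            lambda node: node.volunteers_to_store(identifier))
        if not volunteers:
            raise RuntimeError("no node volunteered")
        volunteers[0].stored[identifier] = data
        return volunteers[0].name

    def retrieve(self, identifier):
        holders = self._broadcast(lambda node: node.holds(identifier))
        return holders[0].stored[identifier] if holders else None


nodes = [StorageNode(f"node-{i}", capacity=2) for i in range(4)]
lookup = BroadcastLookup(nodes)
chosen = lookup.store("id-1", b"payload")
restored = lookup.retrieve("id-1")
```

In this model, every store or retrieve operation costs one message per node, so the broadcast traffic grows with the product of the number of operations and the number of nodes, while only one node carries any data.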
Conventional systems typically use a storage controller to manage the broadcast requests to the storage nodes. In some systems, the controller receives a data file for storage from a storage agent and calculates an identifier for the file before broadcasting the identifier to all storage nodes. In other systems, the agent provides the identifier to the controller for the controller to broadcast to the storage nodes. Depending on the result of the broadcast, the controller then causes the file to be stored to a storage node. In both of these systems, the controller is a bottleneck in the storage system and can easily have its entire capacity taken up with broadcasting requests, thereby severely slowing the rate of actual data storage and/or retrieval.
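The two controller arrangements described above may likewise be sketched, purely as an illustration. The use of SHA-256 as the identifier, the class name `Controller`, the method names, and the rule that the least-loaded node is the one that volunteers are all assumptions made for this example and are not taken from any particular known system.

```python
import hashlib


class Controller:
    """Hypothetical controller through which every broadcast passes."""

    def __init__(self, node_names):
        # Number of files held by each node (stand-in for real nodes).
        self.load = {name: 0 for name in node_names}
        self.messages_sent = 0

    def _broadcast_and_store(self, identifier):
        # The controller sends the identifier to every node.
        self.messages_sent += len(self.load)
        # Assumed stand-in for the replies: the least-loaded node volunteers.
        target = min(self.load, key=self.load.get)
        self.load[target] += 1
        return target

    def store_file(self, data):
        # First arrangement: the controller calculates the identifier itself.
        identifier = hashlib.sha256(data).hexdigest()
        return identifier, self._broadcast_and_store(identifier)

    def store_identified(self, identifier):
        # Second arrangement: the agent has already calculated the identifier.
        return identifier, self._broadcast_and_store(identifier)


controller = Controller(["node-a", "node-b", "node-c"])
first_id, first_node = controller.store_file(b"hello")
agent_id = hashlib.sha256(b"world").hexdigest()
second_id, second_node = controller.store_identified(agent_id)
```

In either arrangement, the broadcast for every file passes through the single controller, which is consistent with the bottleneck noted above: moving the identifier calculation to the agent removes the hashing work from the controller but not the broadcast traffic.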
The present invention has been made, at least in part, in consideration of drawbacks and limitations of such conventional systems.