1. Field of the Invention
This invention relates to computer networks and, more particularly, to storage management for a distributed data storage network.
2. Description of the Related Art
As computer systems continue to produce and utilize ever-increasing amounts of data, the task of properly storing the data becomes increasingly important. In particular, the high degree of networking among computer systems and the need to support distributed applications have led to the development of the distributed data storage networks that are in use today.
Distributed data storage networks typically include a plurality of networked computer systems, where each computer system stores data for use by an organization or application. One benefit commonly provided by distributed data storage networks is data replication. Copies of particular portions of data, e.g., copies of files, may be stored on multiple computer systems in the distributed data storage network. Such data replication may enable faster retrieval of the data because the data can be retrieved from the computer system that is closest or fastest. Data replication may also result in increased available network bandwidth by reducing the need to forward data requests and data transfers throughout the network. Data replication may also increase the fault tolerance of an application, since if one computer system fails, the necessary data can still be obtained from another computer system that is still operational.
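By way of illustration only, the retrieval and fault-tolerance benefits of data replication described above can be sketched as follows. This is a minimal sketch under stated assumptions, not a description of any particular system: the `Replica` record, its `latency_ms` and `alive` fields, and the caller-supplied `fetch` function are hypothetical names introduced for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Replica:
    node: str            # hypothetical node name
    latency_ms: float    # assumed measured latency to that node
    alive: bool = True   # assumed result of a liveness check


def read_from_replicas(replicas: List[Replica],
                       fetch: Callable[[str], bytes]) -> Tuple[str, bytes]:
    """Try replicas from fastest to slowest, skipping nodes that have failed."""
    for replica in sorted(replicas, key=lambda r: r.latency_ms):
        if not replica.alive:
            continue  # node is known to be down; try the next copy
        try:
            return replica.node, fetch(replica.node)
        except ConnectionError:
            continue  # node failed during the request; fall back to another copy
    raise RuntimeError("no operational replica holds the data")
```

In this sketch the fastest operational copy is preferred, and the failure of one computer system only causes the read to fall back to another computer system that holds a copy.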
Some distributed data storage networks also employ the concept of storage fragmentation, where a unit of data is fragmented into multiple parts that are each stored on separate computer systems. The fragmented nature of the data may be transparent to users and to client applications, which need not be aware of the details of how and where the data is stored.
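Storage fragmentation of this kind can be illustrated with a short sketch. The round-robin placement, the fixed `fragment_size`, and the function names `fragment` and `reassemble` are assumptions made for the example; actual networks may place fragments in other ways.

```python
from typing import Dict, List, Tuple

Placement = Dict[str, List[Tuple[int, bytes]]]


def fragment(data: bytes, nodes: List[str], fragment_size: int) -> Placement:
    """Split data into fixed-size parts and assign them to nodes round-robin."""
    placement: Placement = {node: [] for node in nodes}
    for index, offset in enumerate(range(0, len(data), fragment_size)):
        node = nodes[index % len(nodes)]
        # Keep the fragment index so the original order can be restored later.
        placement[node].append((index, data[offset:offset + fragment_size]))
    return placement


def reassemble(placement: Placement) -> bytes:
    """Collect the parts from every node and join them in their original order."""
    parts = [part for node_parts in placement.values() for part in node_parts]
    parts.sort(key=lambda part: part[0])
    return b"".join(chunk for _, chunk in parts)
```

A client that calls only `reassemble` receives the original unit of data and does not need to know which computer system held which part.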
As distributed data storage networks have become larger and more complex, storage management has become a great challenge. Storage management for a distributed data storage network includes controlling the level of data replication (e.g., the number of computer systems on which each portion of data is replicated) and controlling the manner in which data is distributed among the computer systems (e.g., controlling the percentage of storage utilized on each computer system), among other issues.
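The two example quantities named above, the replication level of each portion of data and the percentage of storage utilized on each computer system, can be computed as in the following sketch. The input shapes (a mapping from node name to the set of portion identifiers it stores, and per-node byte counts) and the function names are hypothetical and chosen only for illustration.

```python
from collections import Counter
from typing import Dict, Set


def replication_levels(placement: Dict[str, Set[str]]) -> Dict[str, int]:
    """Count the number of nodes on which each portion of data is stored."""
    counts: Counter = Counter()
    for portions in placement.values():
        counts.update(portions)
    return dict(counts)


def utilization_percent(used_bytes: Dict[str, int],
                        capacity_bytes: Dict[str, int]) -> Dict[str, float]:
    """Compute the percentage of storage utilized on each node."""
    return {
        node: 100.0 * used_bytes.get(node, 0) / capacity
        for node, capacity in capacity_bytes.items()
    }
```

A storage management mechanism could compare such values against whatever replication and distribution targets it is configured with; how those targets are chosen is outside the scope of this sketch.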
Techniques for automatically addressing storage management issues in a distributed data storage network have been utilized in the prior art. However, prior approaches have typically involved configuring individual computer systems in the distributed data storage network to respond to statically configured policy rules (typically based on resource thresholds). It would be desirable to provide a system that instead responds incrementally to changes in its environment, moving the system progressively toward more nearly optimal states as defined by one or more system-wide storage goals.
It may also be desirable to provide a decentralized storage management solution. For example, it may be desirable to implement the distributed data storage network as a peer-to-peer network in which each node provides roughly equivalent functionality and does not rely on centralized servers. It may be desirable for the storage management solution to leverage resources available throughout the network to achieve a storage goal for the system as a whole.