The embodiments of the present invention relate to managing data storage within a computing environment based upon collaborative activities. Data management can refer to processes such as replication and archival of data. Replication generally refers to the act of selecting a file located within a data storage device and creating a copy of that file within one or more other data storage devices. Replication allows files considered to be important to be copied to alternate data storage devices. When a data set, e.g., a plurality of files, is replicated, often only a subset of the files, namely those considered to be of greater importance, is copied. The copied subset of files can be referred to as a “partial replica.”
Archival refers to the process of selecting a file that is located within a data storage device and creating a copy of the file within another data storage device. Once copied, the file can be removed from the original data storage device. Unlike replicated files, archived files are then available only from the data storage devices to which those files are archived and are no longer accessible from the original data storage device. Typically, the archival data storage device is less accessible to users than a local data storage device that is intended for everyday use. That is, users must follow more involved procedures to retrieve desired data.
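By way of illustration only, the distinction between replication and archival described above can be sketched as follows. The function names `replicate` and `archive`, and the use of file system directories to stand in for data storage devices, are assumptions made for this example and do not reflect any particular conventional system.

```python
import shutil
from pathlib import Path


def replicate(src: Path, dest_dir: Path) -> Path:
    """Copy src into dest_dir; the original file stays where it is."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    # copy2 copies file contents and attempts to preserve metadata such as timestamps.
    return Path(shutil.copy2(src, dest_dir / src.name))


def archive(src: Path, archive_dir: Path) -> Path:
    """Copy src into archive_dir, then remove it from its original location."""
    copied = replicate(src, archive_dir)
    # After this step the file is available only from the archival location.
    src.unlink()
    return copied
```

In this sketch, the only difference between the two operations is the removal of the original file, which is what makes an archived file unavailable from its original data storage device.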
Within conventional systems, the decision to archive data is largely driven by the age of the files. For example, files that are “older” than a specified age may be selected for archival. Replication also may select files according to age, for example, by replicating only newer files. In effect, conventional data management systems assume that the age of a given file is determinative of the importance or relevance of that file to a particular user.
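The age-based selection described above can be sketched, under stated assumptions, as a simple threshold test. The `FileRecord` type, the `select_by_age` function, and the use of a last-modified timestamp as the measure of age are hypothetical and chosen only for illustration; a conventional system may measure age differently.

```python
import time
from dataclasses import dataclass

SECONDS_PER_DAY = 86400


@dataclass
class FileRecord:
    name: str
    modified_at: float  # last-modified time, in seconds since the epoch (assumed age measure)


def select_by_age(files, threshold_days, now=None):
    """Split files into (to_archive, to_replicate) using age alone."""
    now = time.time() if now is None else now
    cutoff = now - threshold_days * SECONDS_PER_DAY
    # Files last modified before the cutoff are treated as "older" and archived.
    to_archive = [f for f in files if f.modified_at < cutoff]
    # The remaining, newer files are treated as candidates for replication.
    to_replicate = [f for f in files if f.modified_at >= cutoff]
    return to_archive, to_replicate
```

Note that nothing in such a policy consults how, or by whom, a file is actually used; age is the sole criterion.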