Conventionally, all data to be de-duplicated may have been treated the same. To the extent that there has been any scheduling associated with de-duplication, that scheduling may have been simple first-in first-out (FIFO) scheduling, where the first item identified for de-duplication is the first item de-duplicated. However, not all data to be de-duplicated may be equal. For example, an organization (e.g., enterprise, business, university) may have two types of data: mission-critical data that is to be replicated and mission-useful data that may not be replicated. These two types of data may be distributed across various locations in the organization and stored on different storage devices (e.g., tapes, disk drives) residing at various levels of different networks.
The organization may consider its business to be secure if and when its mission-critical data is replicated. Therefore, to enhance business security, the organization may desire to have its mission-critical data replicated as soon as possible, or at least before the mission-useful data. But this desire may be frustrated because a data-replicating application or apparatus may require that data be de-duplicated before it can be replicated. Yet conventional de-duplication has no way to distinguish one type of data from another and therefore no way to prioritize one type of data (e.g., data to be replicated) over another type of data (e.g., data that will not be replicated) for de-duplication.
The foregoing statements are not intended to constitute an admission that any patent, publication or other information referred to herein is prior art with respect to this disclosure. Rather, these statements serve to present a general discussion of technology and associated issues in the technology.