The present invention relates to techniques for optimizing the storage of a large number of objects across different types of storage and, in particular, to storing and migrating objects among the different storage options according to an importance measure associated with each object.
Increasingly vast amounts of data are being stored by Internet service providers and enterprises. For example, providers of email services must provide huge amounts of storage as part of basic (often free) accounts. The economic burden associated with this obligation is further exacerbated by the ever-increasing volume of spam, as well as by users who use such storage to back up their data.
From the user's perspective, an important property of mail servers is that saved emails are perceived to be on-line, i.e., instantly available for reading. However, after a certain amount of time has passed, most emails are only rarely retrieved, and the majority are never retrieved at all. A current storage solution used by service providers to mitigate storage costs takes advantage of this fact. According to this approach, emails in the system are initially stored in more expensive memory, i.e., memory which provides users the “on-line” access they expect. After a certain period of time has passed, e.g., days or weeks, the emails are automatically migrated to less expensive memory characterized, for example, by a longer access time. This migration is typically controlled by a hard-coded business rule based solely on the age of the email.
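By way of illustration only, the age-based migration described above might be sketched as follows. The 30-day threshold, the tier names "online" and "archive", and the function names are hypothetical and chosen solely for this example; the description above specifies only that migration occurs after "days or weeks."

```python
from datetime import datetime, timedelta

# Hypothetical hard-coded business rule: migrate after 30 days.
# The source text says only "days or weeks"; this value is an assumption.
MIGRATION_AGE = timedelta(days=30)


def select_tier(received_at: datetime, now: datetime) -> str:
    """Return the tier an email belongs in under a purely age-based rule."""
    if now - received_at >= MIGRATION_AGE:
        return "archive"  # less expensive storage, longer access time
    return "online"       # more expensive storage, fast access


def assign_tiers(emails: dict[str, datetime], now: datetime) -> dict[str, str]:
    """Map each message id to its tier, ignoring how often it is read."""
    return {msg_id: select_tier(ts, now) for msg_id, ts in emails.items()}
```

Note that the rule consults only the age of each email; neither the identity of the user nor the access history of the email affects the result.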
While such an approach may mitigate the costs of mass storage for service providers, it remains relatively crude in that it does not accurately reflect the usage patterns of particular individuals or the manner in which particular objects are accessed. Therefore, there remains a need for techniques which more effectively balance the mitigation of storage costs against the expectation of users that their data will be readily accessible.