The present disclosure generally relates to data management. The disclosed embodiments relate more specifically to a system, apparatus, method, and computer program product for optimizing the placement of data utilizing cloud-based information technology (IT) services based on the characteristics of the data and the cloud-based IT services.
The volume of data generated by individuals, enterprises, and organizations is growing at a frenetic pace. Traditionally, such entities have purchased more local storage as the volume of data that needed to be stored increased. Such a response to increased storage needs was possible because the cost of storage devices was dropping almost as fast as storage needs were increasing. The cost of such data storage, for example, was the amortized cost of the corresponding storage device.
More recently, however, entities have begun relying on cloud IT services for their data storage needs. Although such services eliminate the need to purchase and maintain storage devices locally, cloud IT service providers typically charge their tenants on a month-by-month basis. Accordingly, rather than being amortized over a finite period of time, the costs of data storage continue perpetually for as long as the need for storage persists.
In addition, the data generated by individuals, enterprises, and organizations has become so large and complex that it is difficult to process utilizing conventional database management tools and data processing applications. That trend is referred to colloquially as “big data.” The size and complexity of such large data sets make it difficult to manage that data effectively and efficiently. For example, it is difficult for entities to determine which data may be deleted because it is either duplicate or obsolete.
Accordingly, entities' storage needs are likely to continue to increase as those entities continue to generate more data. It also follows that the costs associated with storing that data on a cloud IT service will increase as those entities continue to generate more data. Those increased costs are particularly problematic when considered in view of their perpetual, month-by-month nature.