The present application relates generally to an improved data processing apparatus and method and, more specifically, to a mechanism for adjusting the location of data in a tiered storage system based on an examination of data usage patterns.
Traditional storage models recognize two separate types of storage devices: online storage devices and offline storage devices. Online storage devices typically store transactional data requiring high availability, instant access, and steadfast reliability. Offline storage devices typically store archival data that is infrequently accessed and is stored for long periods of time. However, in the modern environment, data use has expanded beyond simple transactional and archival use. Thus, the concept of tiered storage systems has been introduced.
The concept of tiered storage is based on the varying performance of storage devices as well as the varying performance demands that different workloads place on these storage devices. Tiered storage organizes storage devices into multiple logical and physical levels according to their performance capabilities and costs, and then stores data in these levels according to the expected demand for that data and the performance of the storage devices in each level of the tiered storage system.
Thus, for example, at a highest level of the tiered storage system, a plurality of storage devices having very high performance capabilities is provided. These storage devices hold data that is expected to be required frequently and with minimal access delay. This tier of the tiered storage system is sometimes referred to as the “online” tier or T0. This tier will usually consist of the storage devices that are the most expensive to manufacture and purchase.
A middle tier of the tiered storage system, sometimes referred to as the “nearline” tier or T1, has storage devices with lower performance capabilities than those of the highest tier. These devices still have sufficient performance to handle data that is accessed on a regular basis but not as often as the data stored in the highest tier, or data whose access can tolerate the larger access delays that result from the lower performance of the storage devices in this middle tier. There may be multiple middle tiers in a tiered storage system, depending on the complexity of the tiered storage system and the differing performance capabilities of the storage devices employed.
A bottom tier of the tiered storage system, sometimes referred to as the “offline” tier, may comprise relatively low performance storage devices. This tier is often used to archive data or to store data that is infrequently accessed, and thus the access delay associated with these storage devices is not a concern.
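As a purely illustrative sketch of the placement idea described above, the following Python fragment assigns a data item to a tier based on its expected access frequency and the access delay it can tolerate. The labels T0 and T1 come from the description above. The label "T2" for the offline tier, the `DataItem` type, the `choose_tier` function, and all threshold values are hypothetical and are assumed here only for illustration.

```python
from dataclasses import dataclass

# Hypothetical thresholds, assumed only for illustration.
ONLINE_MIN_ACCESSES_PER_DAY = 100.0   # "frequently required" data
NEARLINE_MIN_ACCESSES_PER_DAY = 1.0   # "regularly accessed" data
ONLINE_MAX_DELAY_MS = 10.0            # "minimal access delay"


@dataclass(frozen=True)
class DataItem:
    """Hypothetical description of the expected demand for a piece of data."""
    name: str
    expected_accesses_per_day: float
    tolerable_delay_ms: float


def choose_tier(item: DataItem) -> str:
    """Return "T0" (online), "T1" (nearline), or "T2" (offline, hypothetical label)."""
    frequent = item.expected_accesses_per_day >= ONLINE_MIN_ACCESSES_PER_DAY
    latency_sensitive = item.tolerable_delay_ms <= ONLINE_MAX_DELAY_MS
    if frequent and latency_sensitive:
        # Required frequently and with minimal access delay.
        return "T0"
    if item.expected_accesses_per_day >= NEARLINE_MIN_ACCESSES_PER_DAY:
        # Accessed regularly, or able to tolerate larger access delays.
        return "T1"
    # Archival or infrequently accessed data.
    return "T2"
```

Under these assumed thresholds, an item accessed thousands of times per day with a tolerable delay of a few milliseconds would be placed in T0, while an item accessed less than once per day would be placed in the offline tier.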
Such tiered storage systems are implemented not only because of the varying demands that the workloads in today's computing environments place on storage device performance, but also because of the cost of such storage devices. The cost of a storage device generally rises with its performance. That is, higher performance storage devices cost considerably more than lower performance storage devices, so it is less costly to have a large number of lower performance storage devices than to have a large number of high performance storage devices. Accordingly, in a tiered storage system, a relatively smaller set of high performance storage devices may be used to handle data requiring high availability and instant access. Meanwhile, a relatively larger set of lower performance storage devices may be used to store archival or infrequently accessed data. A middle sized set of intermediately performing storage devices can be used to handle data requiring regular access. In this way, the cost of the storage system may be minimized while still accommodating the workload demands using a tiered approach.
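The cost argument above can be illustrated with a small worked calculation. The per-terabyte costs and capacities below are hypothetical figures chosen only to make the arithmetic easy to follow, and the label "T2" for the offline tier and the `storage_cost` function are likewise assumptions. They do not describe any actual device or system.

```python
# Hypothetical relative cost per terabyte for each tier (illustrative only).
COST_PER_TB = {"T0": 10.0, "T1": 3.0, "T2": 1.0}


def storage_cost(capacity_tb_by_tier: dict) -> float:
    """Sum the cost of the capacity provisioned in each tier."""
    return sum(COST_PER_TB[tier] * tb for tier, tb in capacity_tb_by_tier.items())


# 100 TB split across tiers: small fast tier, mid-sized middle tier, large slow tier.
tiered_layout = {"T0": 10, "T1": 30, "T2": 60}
# The same 100 TB held entirely on high performance devices.
all_high_performance = {"T0": 100}

tiered_cost = storage_cost(tiered_layout)           # 10*10 + 3*30 + 1*60 = 250.0
untiered_cost = storage_cost(all_high_performance)  # 10*100 = 1000.0
```

With these assumed figures, the tiered layout costs one quarter as much as placing all data on high performance devices, while still reserving high performance capacity for the data that requires it.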