This disclosure relates to data processing and data storage, and more specifically, to intelligent allocation of heat-tiered storage among throttling units.
In general, cloud computing refers to a computational model in which processing, storage, and network resources, software, and data are accessible to remote host systems, where the details of the underlying information technology (IT) infrastructure providing such resources are transparent to consumers of cloud services. Cloud computing is facilitated by ease-of-access to remote computing websites (e.g., via the Internet or a private corporate network) and frequently takes the form of web-based resources, tools or applications that a cloud consumer can access and use through a web browser, as if the resources, tools or applications were a local program installed on a computer system of the cloud consumer. Commercial cloud implementations are generally expected to meet quality of service (QoS) requirements of cloud consumers, which may be specified in service level agreements (SLAs). In a typical cloud implementation, cloud consumers consume computational resources as a service and pay only for the resources used.
Adoption of cloud computing has been facilitated by the widespread utilization of virtualization, which is the creation of virtual (rather than actual) versions of computing resources, e.g., an operating system, a server, a storage device, network resources, etc. For example, a virtual machine (VM), also referred to as a logical partition (LPAR), is a software implementation of a physical machine (e.g., a computer system) that executes instructions like a physical machine. VMs can be categorized as system VMs or process VMs. A system VM provides a complete system platform that supports the execution of a complete operating system (OS), such as Windows, Linux, Android, etc., as well as its associated applications. A process VM, on the other hand, is usually designed to run a single program and support a single process. In either case, any application software running on the VM is limited to the resources and abstractions provided by that VM. Consequently, the actual resources provided by a common IT infrastructure can be efficiently managed and utilized through the deployment of multiple VMs, possibly from multiple different cloud computing customers. The virtualization of actual IT resources and management of VMs is typically provided by software referred to as a VM monitor (VMM) or hypervisor.
In a typical virtualized computing environment, VMs can communicate with each other and with physical entities in the IT infrastructure of the utility computing environment utilizing conventional networking protocols. As is known in the art, conventional networking protocols are commonly premised on the well-known seven-layer Open Systems Interconnection (OSI) model, which includes (in ascending order) physical, data link, network, transport, session, presentation and application layers. VMs are enabled to communicate with other network entities as if the VMs were physical network elements through the substitution of a virtual network connection for the conventional physical layer connection.
In current cloud computing environments in which data storage systems and host systems can be widely geographically and/or topologically distributed and the volume of data can be in the petabytes (i.e., a so-called “big data” environment), it is desirable to provide low latency access to frequently accessed data, while still retaining (e.g., archiving) less frequently accessed data. To provide such low latency access to stored data, it is conventional to implement multiple tiers of data storage, with storage devices having smaller storage capacities, higher performance, and higher per-byte cost at the upper tiers and storage devices having larger storage capacities, lower performance, and lower per-byte cost at the lower tiers. Data are then distributed among the tiers based on a “heat” metric providing an indication of the frequency and/or recency of access, with “hotter” data (i.e., more frequently and/or recently accessed data) placed in the upper tiers and “colder” data (i.e., less frequently and/or recently accessed data) placed in the lower tiers.
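By way of illustration only, heat-based distribution of data among tiers can be sketched as follows. The sketch assumes a hypothetical exponentially decaying heat score and a simple greedy placement that fills the upper tiers first. The names Extent, decayed_heat, and place_by_heat, the half-life parameter, and the capacities are assumptions made for this example and are not taken from any particular storage system.

```python
from dataclasses import dataclass


@dataclass
class Extent:
    """A unit of stored data (hypothetical name used for illustration)."""
    extent_id: str
    size_gb: int
    heat: float  # assumed score: higher means more frequently/recently accessed


def decayed_heat(previous_heat: float, elapsed_s: float, half_life_s: float) -> float:
    """Age a heat score so that it halves every half_life_s seconds (assumed model)."""
    return previous_heat * 0.5 ** (elapsed_s / half_life_s)


def place_by_heat(extents, tier_capacities_gb):
    """Greedily assign extents to tiers, hottest first.

    tier_capacities_gb[0] is the uppermost (fastest, smallest) tier.
    Returns a dict mapping extent_id to tier index.
    """
    remaining = list(tier_capacities_gb)
    placement = {}
    for ext in sorted(extents, key=lambda e: e.heat, reverse=True):
        for tier, free_gb in enumerate(remaining):
            if ext.size_gb <= free_gb:
                remaining[tier] -= ext.size_gb
                placement[ext.extent_id] = tier
                break
        else:
            # No tier has room left for this extent.
            raise ValueError(f"no capacity for extent {ext.extent_id}")
    return placement


if __name__ == "__main__":
    data = [Extent("a", 10, 9.0), Extent("b", 10, 5.0), Extent("c", 10, 1.0)]
    # One 10 GB upper tier and one 20 GB lower tier (illustrative sizes).
    print(place_by_heat(data, [10, 20]))
```

In this sketch the hottest extent occupies the small upper tier and the colder extents fall through to the larger lower tier. A practical implementation would typically also weigh migration cost and re-evaluate placement periodically.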
In order to meet QoS requirements, cloud computing environments generally implement resource utilization limits to restrict the utilization of computational and storage resources by the various cloud consumers sharing the physical cloud infrastructure to within agreed limits. These resource utilization limits may include, for example, input/output (I/O) throttling limits specifying a maximum allowed throughput (e.g., expressed in I/O operations per second (IOPS)) for a throttling unit, such as a cloud consumer, host device, Internet Protocol (IP) address, application (workload), volume group, and/or volume.
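As a non-limiting illustration, an I/O throttling limit expressed in IOPS can be enforced per throttling unit with a token bucket. The token bucket is one common rate-limiting technique chosen here as an assumption; the text above does not prescribe a particular enforcement mechanism. The class name IopsThrottle, the burst parameter, and the explicit now timestamp argument are likewise hypothetical.

```python
class IopsThrottle:
    """Token-bucket limiter for one throttling unit (e.g., a volume or host).

    Tokens accrue at iops_limit per second up to a burst capacity;
    each admitted I/O operation consumes one token.
    """

    def __init__(self, iops_limit: float, burst: float = None):
        self.rate = float(iops_limit)
        # Assumption: by default, allow a burst of one second's worth of I/O.
        self.capacity = float(burst if burst is not None else iops_limit)
        self.tokens = self.capacity
        self.last = 0.0  # timestamp (seconds) of the previous call

    def allow(self, now: float) -> bool:
        """Return True if an I/O arriving at time `now` may proceed."""
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


if __name__ == "__main__":
    # One limiter per throttling unit, keyed by a hypothetical unit identifier.
    limits = {"volume-1": IopsThrottle(2), "volume-2": IopsThrottle(100)}
    for t in (0.0, 0.0, 0.0, 0.5):
        print(t, limits["volume-1"].allow(t))
```

Passing the timestamp explicitly keeps the sketch deterministic; a deployed limiter would more likely read a monotonic clock and queue or delay rejected operations rather than simply refusing them.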