Businesses that want to reduce operating expenses and scale resources rapidly generally use Infrastructure-as-a-Service (IaaS) cloud providers, which deliver computing, storage, and networking resources to those businesses. Businesses that deal with collections of data sets so large and complex that they become difficult to process using conventional database management tools generally implement big data frameworks and solutions that support scalability and provide intensive processing power. Quality of service (QoS) aware job scheduling is generally crucial to cloud and big data environments for improving serviceability, reliability, and accountability. Some traditional cloud and big data frameworks offer a form of QoS awareness by allowing users to specify resource quotas. For example, QoS has traditionally been specified in terms of the number of CPUs, the amount of memory, disk I/O bandwidth, and similar resource metrics.
Traditional QoS solutions generally provide little information about the nature of the jobs, and end users typically are not familiar with how these resource quotas translate to their perceived quality of service. The numerous conventional metrics tend to be cumbersome both for application authors to understand and for cloud operators to deploy. Current hardcoded quotas are generally either hard to satisfy or, if they are not set properly, fail to provide the expected service guarantee. Various cloud and big data frameworks also use different resource quotas, and these quotas are generally not perceived the same way across different components in an environment. If new types of resources (e.g., new storage media or persistent memory) are introduced into a cloud or big data environment, or if existing resources are upgraded, the traditional hardcoded QoS quota systems also need to be updated, which may contribute to service disruption. As cloud and big data environments grow exponentially, the measuring, tracing, and enforcing of resource quotas in traditional solutions become hard to implement in a scalable manner.