Modern database systems execute a variety of query requests concurrently and operate in a dynamic environment of cooperative systems, each comprising numerous hardware components subject to failure or degradation. The need to regulate concurrent hardware and software ‘events’ has led to the development of a field which may be generically termed ‘Workload Management’.
Workload management techniques focus on managing or regulating a multitude of individual yet concurrent requests in a database system by effectively controlling resource usage within the database system. Resource usage may relate to any component of the database system, such as CPU (central processing unit) usage, hard disk or other storage usage, or disk I/O (input/output) usage. Workload management techniques fall short of implementing full system regulation, as they do not manage the impact of unplanned situations (e.g., a request volume surge, the exhaustion of shared resources, or external conditions like component outages) or even of planned situations (e.g., systems maintenance or data load).
A database of a database system is a collection of stored data that is logically related and that is accessible by one or more users or applications. A popular type of database system is the relational database management system (RDBMS), which includes relational tables, also referred to as relations, made up of rows and columns (also referred to as tuples and attributes, respectively). Each row represents an occurrence of an entity defined by a table, with an entity being a person, place, thing, or other object about which the table contains information.
One vital performance metric to monitor is skew of resource usage across the different parallel components of the database system. The larger a database system becomes, the more impactful skew can be on the overall system. Database systems with tens or hundreds of nodes can start to lose significant portions of their overall processing capacity when work becomes skewed to a portion of the nodes in the system.
A number of different resources can be monitored for effective parallel usage across the database system. These resources include, for example, CPU usage, disk I/O usage, memory usage, and network usage. Monitoring of this data for a large database system may not immediately reveal any skew problem because such data is usually presented in ordered lists of raw data values.
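As an illustration of why a list of raw values can hide a skew problem, the following is a minimal sketch that condenses per-node usage into a single number. The metric used here (one minus the ratio of average usage to peak usage) is an assumption chosen for illustration and is not defined in this document; the function name `skew_summary`, the node names, and the sample CPU figures are likewise hypothetical.

```python
from statistics import mean


def skew_summary(usage_by_node):
    """Summarize how unevenly a resource is used across parallel nodes.

    usage_by_node: hypothetical mapping of node name -> usage value
    (for example, CPU seconds consumed during one sampling interval).
    """
    values = list(usage_by_node.values())
    if not values:
        raise ValueError("need usage data for at least one node")

    peak = max(values)
    average = mean(values)

    # Assumed metric: 0.0 means perfectly even usage; values approaching 1.0
    # mean that most nodes sit idle while the busiest node does the work.
    skew = 0.0 if peak == 0 else 1.0 - average / peak

    busiest = max(usage_by_node, key=usage_by_node.get)
    return {"average": average, "peak": peak, "busiest": busiest, "skew": skew}


# Hypothetical sample: three nodes are evenly loaded, one does twice the work.
cpu_seconds = {"node1": 50.0, "node2": 50.0, "node3": 100.0, "node4": 50.0}
summary = skew_summary(cpu_seconds)
print(summary)
```

With these sample figures the summary reports a skew of 0.375 and identifies `node3` as the busiest node, which may be easier to judge at a glance than the four raw values. Whether a given skew value is acceptable would still depend on a threshold that this sketch does not attempt to choose.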
All database systems have some amount of skew. A person monitoring a database system often needs to evaluate how much skew is acceptable and how much skew is too much. This person, however, is unable to make such an evaluation simply by looking at data contained in a list of raw data values. It would be desirable to enable a user to quickly and easily evaluate a potential skew condition (i.e., whether the amount of skew is acceptable or unacceptable) in a large parallel data processing system.