Modern computing systems execute a variety of requests concurrently and operate in a dynamic environment of cooperative systems, each comprising numerous hardware components subject to failure or degradation.
The need to regulate concurrent hardware and software “events” has led to the development of a field which may be generically termed “Workload Management.” For the purposes of this application, “events” include, but are not limited to, one or more signals, semaphores, periods of time, hardware, software, business requirements, etc.
Workload management techniques focus on managing or regulating a multitude of individual yet concurrent requests in a computing system by effectively controlling resource usage within the computing system. Resources may include any component of the computing system, such as CPU (central processing unit) usage, hard disk or other storage means usage, or I/O (input/output) usage.
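The regulation described above can be reduced to its simplest form, admission control: capping how many concurrent requests may use a scarce resource at once. The following is a minimal illustrative sketch in Python (not part of any actual workload management product; the class and names are hypothetical):

```python
import threading

class WorkloadRegulator:
    """Minimal admission controller: caps the number of requests that may
    hold a scarce resource (e.g., CPU or I/O slots) at the same time."""

    def __init__(self, max_concurrent: int):
        self._slots = threading.Semaphore(max_concurrent)

    def run(self, request):
        # Block until a resource slot is free, then execute the request.
        with self._slots:
            return request()

regulator = WorkloadRegulator(max_concurrent=2)
results = []
threads = [
    threading.Thread(target=lambda i=i: results.append(regulator.run(lambda: i * i)))
    for i in range(5)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(sorted(results))  # all five requests complete, at most 2 at a time
```

Real workload managers apply the same idea per resource type and per workload class rather than with a single global limit.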
Workload management techniques fall short of full system regulation because they do not manage unforeseen impacts, such as unplanned situations (e.g., a request volume surge, the exhaustion of shared resources, or external conditions like component outages) or even planned situations (e.g., systems maintenance or data loads).
Many different types of system conditions or events can negatively impact the performance of requests currently executing on a computer system. These events can remain undetected for a prolonged period of time, compounding their negative effect on requests executing during that interval. When problematic events are detected, sometimes in an ad hoc and manual fashion, the computing system administrator may still be unable to take an appropriate course of action, and may delay corrective action, act incorrectly, or not act at all.
A typical consequence of not managing for system conditions is inconsistent response times delivered to users. For example, systems often execute in an environment of very cyclical usage over the course of a day, week, or other business cycle. If a user ran a report nearly stand-alone on a Wednesday afternoon, she may expect the same performance with many concurrent users on a Monday morning. However, based on the laws of linear systems performance, a request simply cannot deliver the same response time running stand-alone as it does when competing with high volumes of concurrent work.
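The qualitative point above can be illustrated with a simple M/M/1 queueing model (an assumption introduced here for illustration; the text itself appeals only to general "laws of linear systems performance"): mean response time grows sharply as system utilization rises.

```python
def mm1_response_time(service_time: float, utilization: float) -> float:
    """Mean response time of an M/M/1 queue: R = S / (1 - rho)."""
    assert 0 <= utilization < 1, "utilization must be below saturation"
    return service_time / (1.0 - utilization)

# Nearly stand-alone system (e.g., a quiet Wednesday afternoon): ~5% busy.
print(round(mm1_response_time(1.0, 0.05), 2))  # 1.05

# Heavy concurrency (e.g., a Monday morning peak): ~90% busy.
print(round(mm1_response_time(1.0, 0.90), 2))  # 10.0
```

A tenfold difference in response time for the same request, driven purely by concurrent load, is exactly the inconsistency users perceive.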
Therefore, while rule-based workload management can be effective in a controlled environment without external impacts, it fails to respond effectively when those external impacts are present.
In addition, there is currently no effective way to access critical resources (for example, memory segments) in real time, such as Database System (DBS) global memory partitions (virtual processors [Vproc] Partition Global), performance memory partition data (PMPC), or task partition (TskGlobal) data, using SQL (structured query language) commands such as SELECT, UPDATE, and DELETE within the Teradata™ database architecture. Access to critical resource data is key to successfully managing a database system. However, accessing this data is difficult without invasive tools and methods such as PMPC, GDB (GNU Debugger), KDB (built-in kernel debugger), SDB (symbolic debugger), Crash, Coroner, Puma, Trace, etc. Such methods can be prohibitively expensive and intrusive to customers because they require large machine resources (CPU, traces, etc.) or require halting tasks and/or threads, which can stop the execution of the database.
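The contrast with debugger-based inspection can be sketched as follows: if a snapshot of per-virtual-processor memory data were exposed as an ordinary relational table, plain SQL could interrogate it without halting anything. This sketch uses Python's built-in sqlite3 purely for illustration; the table and column names (vproc_memory, used_mb, etc.) are hypothetical and are not actual Teradata objects:

```python
import sqlite3

# Hypothetical snapshot of per-Vproc memory partition usage (illustrative data).
snapshot = [
    (0, "Partition Global", 512),
    (1, "Partition Global", 498),
    (0, "TskGlobal", 64),
    (1, "TskGlobal", 70),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vproc_memory (vproc INT, partition TEXT, used_mb INT)")
conn.executemany("INSERT INTO vproc_memory VALUES (?, ?, ?)", snapshot)

# A plain SELECT replaces an intrusive debugger session: resource usage is
# inspected without stopping any task or thread.
for row in conn.execute(
    "SELECT partition, SUM(used_mb) FROM vproc_memory "
    "GROUP BY partition ORDER BY partition"
):
    print(row)
```

The value of the SQL route is that the same query can run from any client session, at any time, with ordinary query-level resource cost.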
In addition to this problem, there is currently no way to debug external routines such as SQL Stored Procedures (SPL), UDFs (user-defined functions), or routines involving LOBs (large objects), including CLOBs (character large objects) and/or BLOBs (binary large objects).
Ideally a database management system (DBMS) should be able to accept performance goals for a workload and automatically adjust its own performance “knobs” using the goals as a guide. Given performance objectives for each workload, the problem is further complicated by the fact that workloads can interfere with each other's performance through competition for shared system resources. Because of this interference, the DBMS may find a “knob” performance setting that achieves the goal for one workload but at the same time makes it impossible to achieve the goal for some other workloads. Further compounding the problem, system resources are finite, with only a limited amount available to perform work on the system.
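The goal-driven knob adjustment described above amounts to a feedback control loop: measure a workload's performance, compare it to its goal, and nudge the knob accordingly. A minimal sketch, assuming a single hypothetical knob (a resource share) and a toy response-time model invented here for illustration:

```python
def adjust_knob(knob, measured, goal, step=0.1, lo=0.0, hi=1.0):
    """One iteration of a goal-driven control loop: nudge a performance
    'knob' (e.g., a workload's resource share) toward its response-time goal."""
    if measured > goal:        # too slow: grant more resources
        knob = min(hi, knob + step)
    elif measured < goal:      # beating the goal: release resources to others
        knob = max(lo, knob - step)
    return knob

share = 0.5
# Simulated feedback: response time falls as the share grows (toy model,
# not a real DBMS performance relationship).
for _ in range(10):
    response_time = 2.0 / share
    share = adjust_knob(share, response_time, goal=3.0)
print(round(share, 1))  # settles near the share that meets the goal
```

The interference problem in the text shows up when several such loops share one finite resource pool: raising one workload's share necessarily lowers what remains for the others, so the loops cannot all be satisfied independently.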
Accordingly, what is needed is the ability to manage critical resources in real time, giving features such as Teradata Active System Management (TASM) the ability to dynamically manage database management system (DBMS) resources such as sessions, tasks, queues, and access to CPU, I/O, etc. With this capability, Teradata Active System Management can greatly improve Teradata's system management capabilities, with a focus on dynamically managing the database system.
In addition, allowing access to critical database resources will give DBAs, engineers, and third-party tools the ability to monitor and manage database machines using a different mechanism.
Accordingly, what is needed is the capability to manage and access critical database resources, in real time, in a multi-system environment.