In the early days of modern computing, the large size and cost of computers tended to result in a concentration of computer equipment in large data centers. Most commercial users of computers would have paid one or more data center operators to perform their computing tasks. Over the past decades, miniaturization and cost decreases have driven a trend toward more commercial computer users owning and operating their own computer systems. This trend is not universal, however.
One exception includes those computer users whose computing needs are particularly massive and/or require exceptional reliability, redundancy or security. Examples of such users include very large corporations, and especially financial sector corporations such as banks, exchanges, brokerages and the like. These types of corporations will often contract with third-party providers to supply their computing needs.
The preeminent example of a third-party provider of computing services is the International Business Machines (IBM) Corporation. IBM has several thousand users who pay for the capability and reliability of its System z (“z” standing for “zero downtime”) computing platform. The way users depend on workload performance on the System z platform provides illustrative background for the present invention's system and method for managing computer usage.
Each group of logically-related computing functions being performed for a user is referred to as a logical partition (LPAR). IBM allows users to manually set a usage limit for each LPAR, referred to as a Defined Capacity (DC). IBM also allows users to manually set a usage limit for a group of LPARs, referred to as a Group Capacity Limit (GCL; see, e.g., U.S. Pat. No. 7,752,415, the contents of which are herein incorporated by reference in their entirety). The group typically consists of all the LPARs being run on a given machine (i.e., a central electronic complex, or CEC), or much less frequently, a group of all LPARs being run on a given CEC that run the same type of workload. When all system parameters reflect realistic settings, capacity limitations imposed as a result of DC and GCL settings will affect the workload in the lowest importance classes first (first IMP6 (also called Discretionary), then IMP5, then IMP4, etc.); however, there is no discrimination based upon what workload actually falls within those importance classes.
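The order in which a capacity limit takes effect can be illustrated with a minimal sketch. It is not a description of how System z actually dispatches work; it only assumes, for illustration, that a single limit (a DC or GCL value expressed in MSU) is handed out to importance levels from most to least time critical, so that any shortfall lands on the lowest levels first. The function name `allocate_capacity` and the demand figures are hypothetical.

```python
def allocate_capacity(demand_by_importance, capacity_limit):
    """Grant capacity to importance levels in order, most important first.

    demand_by_importance: dict mapping an importance level (0 = most time
        critical, 6 = Discretionary) to the capacity it demands, in MSU.
    capacity_limit: total capacity available under the limit, in MSU.

    Simplified illustration only: the function sees importance levels,
    not the individual workloads inside them.
    """
    remaining = capacity_limit
    granted = {}
    for level in sorted(demand_by_importance):  # 0 first, 6 last
        grant = min(demand_by_importance[level], remaining)
        granted[level] = grant
        remaining -= grant
    return granted
```

With a hypothetical demand of 40 MSU at IMP1, 30 MSU at IMP3 and 50 MSU at IMP6 and a limit of 100 MSU, the sketch grants IMP1 and IMP3 their full demand and leaves IMP6 with 30 MSU. Everything running at IMP6 is curtailed together, whatever it happens to be.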
Workload itself enters the system through a “service class”. Within each service class, the workload is allocated to one of multiple workload importance levels; for example, the first part of a workload may be assigned to a high importance level, with the importance level being lowered the longer the workload takes to execute. When classifying service classes, the following factors are important:
How time critical is the workload? Workload that is most time critical runs in service classes that are assigned to importance level 0 (IMP0), then importance level 1 (IMP1), etc., until the workload that is least time critical is assigned to importance level 6 (IMP6, Discretionary).
Which performance goal does the user want the workload to achieve? Within each service class users can define a performance goal, e.g., by defining that a percentage of the workload is expected to be finished within a certain time or using a certain defined processing capacity only (e.g., the user could define that he would like 90% of the online transactions to be finished within 0.01 seconds (clock time)).
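A percentage-based goal of the kind just described can be checked with a few lines of code. The sketch below is only an illustration of the arithmetic behind such a goal and is not how WLM computes its performance indicators; the function name `meets_percentile_goal` and its parameters are hypothetical.

```python
def meets_percentile_goal(response_times, target_seconds, required_fraction):
    """Return True if enough transactions finished within the target time.

    response_times: observed clock times of completed transactions, in seconds.
    target_seconds: the response-time target from the goal (e.g., 0.01).
    required_fraction: share of transactions that must meet it (e.g., 0.9).
    """
    if not response_times:
        raise ValueError("at least one response time is required")
    within_target = sum(1 for t in response_times if t <= target_seconds)
    return within_target / len(response_times) >= required_fraction
```

For the example goal above, `meets_percentile_goal(times, 0.01, 0.9)` is true when at least 90% of the entries in `times` are 0.01 seconds or less.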
Service class settings need to be continually reviewed and adjusted to keep pace with ever-changing workload requirements.
The performance of the workload on the System z platform is determined by several factors, the most significant being the service class and the above-mentioned capacity limitations imposed by DC and GCL and/or by hardware limitations.
The Workload Manager (WLM) is an integrated part of the System z software. One of the functions of WLM is to monitor whether the service class goals of the current workloads are being met by looking at performance indicators, such as but not limited to the ‘Performance Index’ (PI), MSU Activity or Delay Counter.
The consequences for workload in service classes that do not meet the performance criteria are:
In case of chronic and significant overachievement (the performance indicators are always very positive, the workload finishes faster than expected and/or defined in the goal), while at the same time recognizing that there is not enough capacity to fulfill all workload requirements (e.g., when the LPAR or group capacity is limited by DC or GCL or when the physical machine limit is reached), WLM undertakes actions that lead to degraded performance.
In case of chronic and significant underachievement (the performance indicators are always very negative, the workload generally takes longer than expected and/or defined in the goal), while at the same time recognizing that there is not enough capacity to fulfill all workload requirements (e.g., when the LPAR or group capacity is limited by DC or GCL), WLM also undertakes actions that lead to degraded performance.
In order to guarantee optimized and reliable performance, it is therefore very important that the service class settings are realistic, especially during times of capacity shortage.
It is also important to consider that the cost to use the System z platform is determined by several factors, but a significant recurring cost for almost every user is the monthly license charge (MLC). This charge is applied for usage of proprietary IBM products that are used in connection with System z, such as the operating system (z/OS), information management system (IMS), customer information control system (CICS), relational database (DB2) and the like.
The MLC varies depending upon the usage of the IBM products during the billing period (i.e., 2nd of month at 0:00 until 1st of following month at 24:00). More particularly, product usage is measured in millions of service units (MSU), and the MLC is determined based on the highest average MSU usage of any full 4-hour period within the billing period (referred to as 4 Hr Rolling Average Usage). Thus, controlling the MLC involves, in large part, controlling peak MSU usage. DC and GCL settings can be used to help control this peak.
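The peak 4-hour rolling average can be sketched as follows. The sketch assumes, purely for illustration, that MSU usage is available as a list of samples taken at a fixed interval over the billing period, so that a 4-hour period corresponds to a fixed number of consecutive samples; the function name `peak_rolling_average` and the sample values are hypothetical, and the actual measurement and billing procedure is not specified here.

```python
def peak_rolling_average(msu_samples, samples_per_window):
    """Return the highest mean over any full window of consecutive samples.

    msu_samples: MSU usage readings at a fixed interval, in time order.
    samples_per_window: number of readings that span one 4-hour period
        (e.g., 4 for hourly readings, 48 for 5-minute readings).
    """
    if samples_per_window <= 0:
        raise ValueError("samples_per_window must be positive")
    if len(msu_samples) < samples_per_window:
        raise ValueError("need at least one full window of samples")

    # Sum of the first full window, then slide one sample at a time.
    window_sum = sum(msu_samples[:samples_per_window])
    peak_sum = window_sum
    for i in range(samples_per_window, len(msu_samples)):
        window_sum += msu_samples[i] - msu_samples[i - samples_per_window]
        peak_sum = max(peak_sum, window_sum)
    return peak_sum / samples_per_window
```

For hypothetical hourly readings of 100, 200, 300, 400, 100 and 100 MSU, `peak_rolling_average(readings, 4)` returns 250.0. The single 400 MSU hour is averaged with its neighbours, which is why limiting sustained usage matters more for the MLC than limiting brief spikes.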
While manual intervention gives users some ability to manage workload performance and costs incurred, further improvements are possible.