Multiple applications running on a computer system share the need for certain system resources such as, but not limited to, central processing unit (CPU) time, memory (particularly, but not limited to, random access memory (RAM)), input/output (I/O) bus operations, and even other applications or processes. A problem can occur if too many applications, even if well-behaved, are competing for a shared system resource, or if an application is not well-behaved and attempts to consume an excessive portion of the shared system resource(s).
For example, a company can provide a cloud service that allows consumers to access popular applications, such as but not limited to an email application, a word processing application, a financial spreadsheet application, a game, a drawing program, etc. Each time a consumer accesses the application, a new instance of that application can be opened on a virtual machine, with many virtual machines running on a server operated by that company, and with there often being numerous such servers on one or more server farms.
If only one or two consumers are accessing an application, for example the email application, then the server running the virtual machines that are running the instances of the email application will not be burdened, at least not by those instances. If, however, a thousand, two thousand, five thousand, etc., consumers are simultaneously accessing the email application, then the server(s) involved might not have sufficient resources to accommodate these numerous virtual machines and/or instances of the email application, in which case the system might be in “distress”. Such distress can manifest as, for example, degraded perceived system performance, e.g., reduced speed, and, in extreme cases, errors, e.g., time-out errors due to the instance of the application not responding in time to a remote system.
Further, as consumers in a particular geographic area have roughly similar schedules, e.g., similar working hours and similar evening hours at home, there will be peak usage times when large numbers of consumers are simultaneously online, such as the evening hours, and other times when very few consumers are online, such as between midnight and dawn. The number of instances of an application in use can therefore vary greatly depending upon the time of day, the day of the week, the month, etc.
In addition, cloud services often employ continuous deployment and installation of updates to applications in order to quickly make releases of new features available to the consumers, often without any change in the capacity of the system resources. If an update is not well-behaved and attempts to consume an excessive portion of the shared resource(s), then system distress can occur, i.e., system performance can suffer and errors can occur.
Further, as computers become faster, applications, processes, and updates associated with applications can be written with particular new processor capabilities, processor speeds, and memory speeds (resources) in mind, but such applications and processes can actually be installed and run on older systems that do not have those enhanced resources. Thus, such applications and processes can attempt to consume an excessive portion of the shared resource(s), especially on, but not limited to, older systems.
Such excessive system resource consumption can exhaust the system resources, and seriously slow other applications or processes, including system processes, even to the point where failures and cascading failures occur. This can result in loss or corruption of data and even prevent system error logging, thereby making it difficult or impossible to determine the cause of the failure.