Currently, memory overcommitment (on-demand allocation) is commonly used in the industry for memory management of a virtual machine. That is, a portion of memory (for example, 1 GB) is first allocated to the virtual machine as a certain proportion of the memory specification (for example, 2 GB) of the virtual machine. The allocation is then adjusted dynamically (for example, to 1.5 GB or 2 GB) according to the memory that is actually in use while the virtual machine is running, so that physical memory is allocated to the virtual machine on demand, up to at most the specification of the virtual machine. However, when the actual physical memory resources are in short supply, all virtual machines have the same capability of obtaining memory, and differentiated treatment cannot be provided for users with different requirements for memory resources.
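The on-demand allocation described above can be illustrated with a minimal sketch. The function names, the 50% initial proportion, and the headroom parameter are assumptions chosen to match the 1 GB / 2 GB example; they are not taken from any particular hypervisor.

```python
GB = 1024 ** 3

def initial_allocation(spec_bytes, initial_ratio=0.5):
    """Memory granted at start-up: a fixed proportion of the specification.

    initial_ratio is an assumed value (0.5 gives 1 GB for a 2 GB specification).
    """
    return int(spec_bytes * initial_ratio)

def adjust_allocation(spec_bytes, used_bytes, headroom_bytes, floor_bytes):
    """Recompute the allocation from the memory currently in use.

    The target is the observed usage plus an assumed headroom, never below
    the initial allocation (floor_bytes) and never above the specification.
    """
    target = used_bytes + headroom_bytes
    return max(floor_bytes, min(target, spec_bytes))
```

For a 2 GB specification, `initial_allocation(2 * GB)` yields 1 GB. With 1.25 GB in use and 0.25 GB of headroom, `adjust_allocation` returns 1.5 GB, and once usage plus headroom exceeds the specification, the result stays capped at 2 GB.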
In an existing technical solution, a memory monitor thread of a virtual machine monitors the memory usage of the virtual machine in real time. When detecting that the memory usage of the virtual machine keeps surging within a specified time, the memory monitor thread adjusts the memory of the virtual machine to prevent an application crash caused by exhaustion of the memory of the virtual machine. The solution implements flexible management of memory resources, so that the specification and capacity of a cloud computing system can be improved.
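The detection step of such a monitor can be sketched as follows. The source does not define what "keeps surging" means, so this sketch assumes one possible reading: the usage ratio rises at every sample across a fixed window and the latest sample reaches a high watermark. The class name, window size, and watermark are hypothetical.

```python
from collections import deque

class SurgeDetector:
    """Flags a virtual machine whose memory usage keeps rising over a window.

    Assumed criterion: every sample in the window is higher than the previous
    one, and the most recent usage ratio is at or above high_watermark.
    """

    def __init__(self, window=3, high_watermark=0.9):
        self.samples = deque(maxlen=window)  # most recent usage ratios
        self.high_watermark = high_watermark

    def observe(self, used_bytes, allocated_bytes):
        """Record one sample; return True if the allocation should be adjusted."""
        self.samples.append(used_bytes / allocated_bytes)
        if len(self.samples) < self.samples.maxlen:
            return False  # not enough history yet
        s = list(self.samples)
        rising = all(a < b for a, b in zip(s, s[1:]))
        return rising and s[-1] >= self.high_watermark
```

A monitor thread would call `observe` at a fixed polling interval and request a memory adjustment whenever it returns `True`. Note that the detector treats every virtual machine identically, with no notion of priority.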
In the existing solution, when the memory usage of a batch of virtual machines keeps surging, all of the virtual machines are treated in the same way. As a result, a memory adjustment request from a virtual machine on which a key application or service is deployed may not receive a timely response, and the entire service process is therefore affected.