The present invention relates generally to the field of computer memory, and more particularly to allocation of memory at the server, rack, and/or cluster farm level.
Today it is common to run a large workload on a cluster of hundreds of servers. When a particular workload consumes almost all of the memory on a server, the system cannot leverage the remaining computing power on that server, even though CPU utilization is still low. The typical approach to solving this problem is to transform the workload into a scale-out design so that smaller units of work can be dispatched to other servers in the cluster. However, there are still two problems with this approach. First, if the workload is memory-intensive and cannot be transformed into a scale-out architecture, it becomes a hot spot that limits system performance. Although other servers may have a lot of free memory during non-peak or even idle time, such memory cannot be shared. Second, even if small-granularity workloads can be distributed to other servers through careful software architecture design, this relies on a cluster scheduler to perform the distribution, which requires transferring workload state and data across nodes. In a big data scenario, this violates the “move computing close to data” principle. In both scenarios, the result is wasted computing and memory resources in the cluster.