Given the growing size of data sets used for data analysis and the way in which some large data frameworks operate, the availability of memory, such as random access memory (RAM), in the computing framework is an ongoing constraint. Increasing, decreasing, or reallocating the amount of RAM on a per-session or per-job basis is not easily done because of hardware interface limitations and other physical constraints.
While the framework of any network may benefit from a larger RAM ‘footprint’, it is not always easy to provide one, as RAM is limited by physical constraints and cost. Servers with larger RAM footprints usually dissipate more heat, and the cost of such servers can rise non-linearly as the RAM footprint grows. The static nature of RAM also limits the availability of such resources in networks with multiple computing nodes and varying levels of operation.
Additionally, cost and power dissipation issues aside, providing a large RAM footprint alone may not solve certain underlying memory constraints. The compute frameworks may operate inside systems, such as a Java virtual machine (JVM), whose performance can be limited by garbage collection (GC). For instance, the more RAM provided, the more time the JVM can spend performing GC, during which the system can freeze for brief periods. If such pauses occur frequently, the result is an increase in job completion time.