Data requirements are estimated to be growing at an annual rate of 60 percent, and this trend is driven further by cloud computing platforms, company consolidation, huge application platforms (like Facebook), etc. Server-class machines purchased this year have a minimum of 8 gigabytes (GB) of RAM and likely have 32 GB of RAM. As one example, Cisco is now selling mainstream Unified Computing System (UCS) boxes with over 380 GB of RAM. As another example, users can borrow 68.4 GB machines for $2/hour on EC2.
In a common approach, many operating systems attempt to speed up operations by caching data on a local machine, e.g., in connection with the machine's heap. “Moving” data closer to the application that executes on it can result in efficiency gains. This conventional thinking oftentimes leads to the conclusion that the cache should be as large as possible. However, applications that execute on garbage-collected runtimes face an increasing challenge to handle the ever-increasing amounts of data and leverage the fast-growing amount of RAM on modern computer systems. As is known, garbage collection is a part of automatic memory management implemented, for example, by Java. Garbage collection involves determining which objects can no longer be referenced by an application, and then reclaiming the memory used by “dead” objects (the garbage). But complexities arise in determining when, for how long, and how often, garbage collection activities are to take place, and this work directly impacts the performance and determinism of the running application.
Furthermore, an unfortunate side-effect of increasing the size of the cache for garbage-collected runtimes is that with the large heaps needed for large caches, Java-based environments slow down at an exponential rate with much, if not all, of the slowdown being directly attributable to Java's garbage collection. A heap size of 2-4 gigabytes (GB) oftentimes is manageable, and some further amount can be considered usable if specialized modifications are made. But custom modifications may be time consuming and technically challenging. There therefore oftentimes is a practical (and oftentimes recommended) 6 GB limit to Java heaps, although slowdowns frequently occur well before this maximum is ever reached. Slowdowns can halt all or substantially all processes that are executing. For large heaps, it is not uncommon to observe a 10 second delay in which nothing happens, although minute-long delays are not unheard of. These sorts of delays can be particularly problematic for web services, mission critical applications, and/or the like.
Challenges result from the increasing garbage collection pauses or delays that occur as runtime heaps become larger and larger. These delays may be unpredictable in length and in occurrence. Thus, as the data/memory explosion is occurring, the amount of the heap a garbage-collected runtime process can effectively use has stayed largely unchanged. In other words, although the amount of space available is growing, it oftentimes is challenging to use it in an efficient and cost-effective way.
These problems manifest themselves in several ways and can be caused in several common scenarios. A first problem relates to applications running too slowly. For example, an application might not be able to keep up with the users (e.g., with 10s of GBs of data in a database, the application may be overloaded and/or too slow to service the needs of users), which may be caused by the complicated nature of queries, the volume of those queries, and/or the like. Caching may help by moving data “closer” to the application, but too many Java garbage collection pauses may be incurred if the cache is grown too large (e.g., to approximate the 16 GB of RAM in a hypothetical system).
Another common problem relates to unpredictable latencies that can affect the application. An application might be sufficiently fast on average, but many pauses that deviate from the mean may be unacceptable to its users. Service Level Agreements (SLAs) may not be met because of the size of the heap, combined with Java garbage collection pauses.
Still another common problem relates to complicated software/hardware deployment. It may be possible to “solve” the Java garbage collection problems, e.g., by running with many Java Virtual Machines (JVMs) with heap sizes of 1-2 GB. Data can be partitioned and/or load balancing can be performed to achieve the performance and availability desired. However, setup may be complicated to manage because so many JVMs are needed, and checks must be performed to ensure that the right data is in the right places. Thus, while 64 GB of RAM can be filled, it nonetheless may be too hard to manage and too fragile to be implemented reliably.
Currently, users are forced to select one of three options when dealing with Java applications. The base case involves a small heap JVM on a big machine. Recognizing that garbage collection pauses are a problem, garbage collection is reduced by implementing, e.g., a 4 GB JVM on a 32 GB machine. Development and operational complexity is low, but performance may suffer. A second option involves implementing a large heap of, for example, up to 31 GB in a 32 GB machine. While the intention is to move the data closer to the application, the garbage collection delays can be extremely high and very complicated to manage. Development and operational complexity also may be very high.
A third option involves stacked, small JVM heaps. For example, eight 4 GB JVMs may be implemented. This approach is oftentimes used in combination with various sharding, load balancing, and clustering techniques. Unfortunately, however, it is very complicated to manage this environment. Availability problems also can be encountered if all or most of the nodes garbage collect at the same time.
Thus, it will be appreciated that there is a need in the art for alleviating the problems faced by garbage-collected runtimes. It also will be appreciated that there is a need in the art for systems that are able to handle increasing amounts of data in a manner that makes use of the growing amount of memory (RAM or disk) in computer systems.
These example problems mentioned above were present in the very first Java release and have not been fully addressed since then. Thus, it will be appreciated that there has been a long-felt need in the art for solutions to these and/or other related problems.
It is believed that part of the reason for the long-felt need is that prior attempted solutions have tried to rely on either operating systems (OS) approaches, or programming language (PL) approaches, for solving these and related problems. The inventors of the instant application have realized, however, that what is needed is a more holistic approach that blends in elements from both of these art areas. Thus, as explained in much greater detail below, the example embodiments described herein belong to an art area that is neither OS-related nor PL-related but instead can be viewed as something above both OS and PL (or managed runtime) layers.
More specifically, it will be appreciated that it would be desirable to provide a stand-alone caching solution that is capable of holding a large dataset (e.g., from 10s to 100s of GBs) in memory without impacting garbage collection. The more data that is cached, the less that the application has to go to the external data source and/or disk and, thus, the faster the application will run. In a similar vein, it would be desirable to provide fast execution that meets SLAs, and that also stays fast over time. This can be achieved by reducing the amount of fragmentation and avoiding or at least reducing slowdowns as the data is changed over time. It also would be advantageous to provide an approach that is concurrent such that, for example, it scales with CPU numbers and powers, and the number of threads, while also avoiding or reducing lock contention. The solution advantageously would be predictable. It also would be advantageous to provide an approach designed to work with the programming language and/or operating system environment (e.g., providing a 100% Java solution to work within a JVM). This may help with snap-in functionality that does not introduce a large amount of complexity. It also would be desirable to provide a restartable solution, as a big cache may otherwise take a long time to build.
Most people incorrectly think that collecting dead objects takes time, but it is the number of live objects that actually has the greatest effect on garbage collection performance. As the Java heap becomes occupied with an increasing number of live objects, full collections occur more often and will each require more time to complete. The result is an increasing number of stop-the-world pauses in an application, for increasing lengths of time. In general, the larger the heap, and the more occupied it becomes, the greater the latencies in the application. Certain example embodiments help to avoid large, occupied heaps typical of large data caches while also reducing garbage collection related pauses.
One aspect of certain example embodiments relates to a highly-concurrent, predictable, fast, self-managed, in-process space for storing data that is hidden away from the garbage collector and its related pauses. In certain cases, the space may be self-tuning, and may connect to frameworks in ways that require no or substantially no changes to a user's application code. In this regard, in certain example embodiments, the space may “sit behind” standard interfaces such as, for example, Map, HttpSessions, Ehcache, and/or the like.
Another aspect of certain example embodiments relates to techniques that add scale-up features (e.g., the ability to improve performance by growing an individual machine) and predictability to servers and applications in the context of, for example, a clustering technology that provides high-availability scale-out (e.g., the ability to bring multiple connected machines to bear on a problem) for applications.
An advantage of certain example embodiments relates to the ability to integrate such functionality without having to change user code, and instead by adding a line of configuration and, potentially, a provided code module for referencing an off-heap store. This may, in turn, layer in a predictable, fast, highly-concurrent, off-heap store for garbage collected runtimes, without a significant amount of required tuning. By adding in an off-heap data store in accordance with certain example embodiments, the runtime's garbage collector can focus on a small heap needed for operations (which is something runtimes are very good at), while possibly leaving the rest of the data structures to be efficiently and completely (or substantially completely) managed by the off-heap store.
Another aspect of certain example embodiments relates to the ability to shrink the heap size and grow the cache.
Still another aspect of certain example embodiments relates to the possibility of providing fast swaps to disk and quick restartability.
In Java, off-heap memory is provided by the operating system (OS) via the java.nio.ByteBuffer class. Creating and destroying “direct” ByteBuffers ordinarily fragments the OS memory and makes off-heap memory allocation slow and unpredictable. To help avoid this situation, when certain example embodiments first start executing (e.g., at construction time), direct ByteBuffers are created that take up the entire off-heap memory space. Certain example embodiments then use their own memory manager to manage the ByteBuffers. Because the ByteBuffers are never destroyed (at least not until the Java process is completely done with them), the OS memory manager is never invoked after this initial allocation. As a result, off-heap memory allocation is faster and more predictable.
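By way of illustration only, the up-front reservation described above might be sketched as follows. The class name PreallocatedOffHeap and its allocate method are hypothetical and are not taken from any embodiment; the sketch uses a simple bump pointer in place of a full memory manager and never returns memory to the OS.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch: reserve off-heap memory once, then carve views out of it. */
public class PreallocatedOffHeap {
    private final ByteBuffer backing; // direct buffer, created once at construction time
    private int next = 0;             // bump pointer; this sketch has no free list

    public PreallocatedOffHeap(int totalBytes) {
        // The only request made to the OS for off-heap memory.
        this.backing = ByteBuffer.allocateDirect(totalBytes);
    }

    /** Returns a view of `size` bytes, or null if the reserved region is exhausted. */
    public synchronized ByteBuffer allocate(int size) {
        if (size < 0 || size > backing.capacity() - next) {
            return null;
        }
        ByteBuffer view = backing.duplicate(); // shares content, independent position/limit
        view.limit(next + size);
        view.position(next);
        next += size;
        return view.slice();                   // capacity == size, still backed off-heap
    }
}
```

Because every returned buffer is a slice of the single backing buffer, no further direct buffers are created or destroyed after construction in this sketch.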
Certain example embodiments include a memory manager that enables fast and predictable allocation. For example, allocation is performed in variable-sized chunks. The required amount of memory is requested from the OS in chunks as large as possible, and bounds on the chunk sizes are specified at construction. Allocation then proceeds starting at the upper bound. On an allocation failure, the bound size is reduced, and allocations continue at the new lower value, possibly until a lower threshold is met or surpassed.
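A minimal sketch of this shrinking-bound strategy is given below, under stated assumptions: the class ChunkedReservation, the halving policy, and the injected allocator function are all hypothetical choices made for illustration, and the sketch assumes that a failed allocation surfaces as an OutOfMemoryError (as ByteBuffer.allocateDirect does when direct memory is exhausted).

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

/** Hypothetical sketch: reserve memory in chunks, shrinking the chunk bound on failure. */
public class ChunkedReservation {

    public static List<ByteBuffer> reserve(long total, int maxChunk, int minChunk,
                                           IntFunction<ByteBuffer> allocator) {
        List<ByteBuffer> chunks = new ArrayList<>();
        long remaining = total;
        int chunk = maxChunk;                          // start at the upper bound
        while (remaining > 0) {
            int size = (int) Math.min(chunk, remaining);
            try {
                chunks.add(allocator.apply(size));
                remaining -= size;
            } catch (OutOfMemoryError e) {
                if (chunk / 2 < minChunk) {
                    throw e;                           // lower threshold reached: give up
                }
                chunk /= 2;                            // assumed policy: halve and retry
            }
        }
        return chunks;
    }

    /** Convenience overload that allocates real direct buffers. */
    public static List<ByteBuffer> reserve(long total, int maxChunk, int minChunk) {
        return reserve(total, maxChunk, minChunk, ByteBuffer::allocateDirect);
    }
}
```

Passing the allocator as a parameter is only a testing convenience in this sketch; it allows a failure to be simulated without actually exhausting memory.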
The memory manager of certain example embodiments may allocate memory from direct ByteBuffers as Pages, with each Page being sourced from a single ByteBuffer. If appropriate space is not available, then an in-use Page may be “stolen” and used for the requested allocation. Each Page allocation request may include parameters such as, for example, thief, victim, and owner. The thief parameter may indicate whether an in-use Page should be stolen (if necessary) to meet the allocation request. The victim parameter may indicate whether this Page (after being allocated) should be stolen (if necessary) to meet another allocation request. The owner parameter may indicate an owner of this Page so that the owner can be notified if the Page is later stolen. The thief parameter and the victim parameter can be Boolean (true/false, yes/no, etc.) values, or numeric values that indicate relative priority in different embodiments.
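The thief, victim, and owner parameters might interact as in the following sketch, which is illustrative only. It assumes fixed-size Pages and Boolean flags for simplicity, and the names StealingPageSource, Owner, and pageStolen are hypothetical rather than taken from any embodiment.

```java
import java.nio.ByteBuffer;
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch: fixed-size Pages with Boolean thief/victim flags. */
public class StealingPageSource {

    /** Callback through which an owner learns that its Page was stolen. */
    public interface Owner { void pageStolen(Page page); }

    public static final class Page {
        private final ByteBuffer buffer;
        private final Owner owner;
        Page(ByteBuffer buffer, Owner owner) { this.buffer = buffer; this.owner = owner; }
        public ByteBuffer buffer() { return buffer; }
    }

    private final Deque<ByteBuffer> free = new ArrayDeque<>();
    private final Deque<Page> stealable = new ArrayDeque<>(); // in-use Pages marked as victims

    public StealingPageSource(int pageCount, int pageSize) {
        ByteBuffer backing = ByteBuffer.allocateDirect(pageCount * pageSize);
        for (int i = 0; i < pageCount; i++) {
            ByteBuffer view = backing.duplicate();
            view.limit((i + 1) * pageSize);
            view.position(i * pageSize);
            free.add(view.slice());                // each Page comes from a single ByteBuffer
        }
    }

    /** Returns a Page, stealing an in-use victim Page if allowed; null if none is available. */
    public synchronized Page allocate(boolean thief, boolean victim, Owner owner) {
        ByteBuffer buffer = free.poll();
        if (buffer == null && thief) {
            Page stolen = stealable.poll();        // oldest victim first (assumed policy)
            if (stolen != null) {
                if (stolen.owner != null) {
                    stolen.owner.pageStolen(stolen); // notify the previous owner
                }
                buffer = stolen.buffer;
            }
        }
        if (buffer == null) {
            return null;
        }
        Page page = new Page(buffer, owner);
        if (victim) {
            stealable.add(page);
        }
        return page;
    }
}
```

In this sketch a stolen Page's storage is handed to the thief as a new Page object; an owner receiving the notification would be expected to stop using the old Page.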
In certain example embodiments, a computer system comprising at least one processor is provided. A non-transitory computer readable storage medium tangibly stores data. A software application is executable by the at least one processor and programmed to make use of the data. Off-heap memory is dynamically allocated and directly managed by a memory manager, such that the off-heap memory is perceivable by the software application as being a part of local application tier memory and manageable, after initial allocation, independent of any memory managers of the computer system and any memory managers of an operating system running on the computer system. The off-heap memory is scalable up to a size of the computer system's memory, upon direction from the memory manager, to accommodate terabytes-worth of data so that that data stored in the off-heap memory is transparently providable to the software application from the off-heap memory within microseconds and without having to repeatedly access that data from the non-transitory computer readable storage medium.
In certain example embodiments, there is provided a method of managing memory of a computer system including at least one processor, a non-transitory computer readable storage medium tangibly storing data, and a software application executable by the at least one processor and programmed to make use of the data. An off-heap direct memory data storage area is dynamically allocated and directly managed, using a memory manager, such that the off-heap direct memory data storage area is perceivable by the software application as being a part of local application tier memory and manageable, after initial allocation, independent of any memory managers of the computer system and any memory managers of an operating system running on the computer system. The off-heap direct memory data storage area is scalable up to a size of the computer system's memory, upon direction from the memory manager, to accommodate terabytes-worth of data so that that data stored in the off-heap direct memory data storage area is transparently providable to the software application from the off-heap memory within microseconds and without having to repeatedly access that data from the non-transitory computer readable storage medium.
The method may operate in connection with a Java-based environment, and may further comprise: (a) attempting to allocate Java byte buffers in chunks of a preconfigured maximum size in response to a request for off-heap direct memory data storage at a predetermined maximum size; (b) repeating said attempts to allocate byte buffers until the off-heap direct memory data storage area is created at the predetermined size, or until an attempt fails, whichever comes first; (c) when an attempt to allocate byte buffers fails, reducing the preconfigured maximum size and repeating (a)-(b); (d) receiving a request for a region of the off-heap direct memory data storage area, the region having an associated size; (e) finding, via a page source, an unused slice of the off-heap direct memory data storage area; (f) returning a page indicative of the unused slice, the page being a wrapped byte buffer that includes a reference to the slice where data is to be stored and a reference to an allocator object that created the slice; (g) continuing to return pages until the off-heap direct memory data storage area is exhausted; (h) managing the returned pages from the off-heap direct memory data storage area as a single coherent logical address space storing data keys and values, with a single page in the off-heap direct memory data storage area storing a hash table with metadata information linking data keys to values; and optionally (i) expanding and contracting the hash table in response to further entries being added thereto and removed therefrom, respectively, by rehashing into a new page.
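Steps (h) and (i) can be pictured with the following sketch, which is illustrative only and greatly simplified. It assumes int keys and int values (standing in for, e.g., key hashes and pointers to value storage), linear probing, and growth by doubling; removal and contraction are omitted. The class name OffHeapIntTable and the slot layout are hypothetical.

```java
import java.nio.ByteBuffer;

/** Hypothetical sketch: an int-to-int hash table held entirely in one direct buffer. */
public class OffHeapIntTable {
    private static final int SLOT_BYTES = 12; // status (4) + key (4) + value (4)
    private static final int EMPTY = 0;
    private static final int USED = 1;

    private ByteBuffer page; // the single "page" holding the table
    private int slots;
    private int size;

    public OffHeapIntTable(int initialSlots) {
        this.slots = initialSlots;
        // Direct buffers start zero-filled, so every slot begins as EMPTY.
        this.page = ByteBuffer.allocateDirect(initialSlots * SLOT_BYTES);
    }

    public void put(int key, int value) {
        if ((size + 1) * 2 > slots) {
            rehash(slots * 2);                 // keep load at or below one half
        }
        if (insert(page, slots, key, value)) {
            size++;
        }
    }

    /** Returns the stored value, or null if the key is absent. */
    public Integer get(int key) {
        int i = Math.floorMod(key, slots);
        for (int probes = 0; probes < slots; probes++) {
            int base = i * SLOT_BYTES;
            if (page.getInt(base) == EMPTY) return null;
            if (page.getInt(base + 4) == key) return page.getInt(base + 8);
            i = (i + 1) % slots;
        }
        return null;
    }

    /** Expands the table by re-inserting every entry into a newly allocated page. */
    private void rehash(int newSlots) {
        ByteBuffer fresh = ByteBuffer.allocateDirect(newSlots * SLOT_BYTES);
        for (int i = 0; i < slots; i++) {
            int base = i * SLOT_BYTES;
            if (page.getInt(base) == USED) {
                insert(fresh, newSlots, page.getInt(base + 4), page.getInt(base + 8));
            }
        }
        page = fresh;
        slots = newSlots;
    }

    /** Linear probing; returns true if a new slot was claimed, false on overwrite. */
    private static boolean insert(ByteBuffer buf, int slots, int key, int value) {
        int i = Math.floorMod(key, slots);
        while (true) {                         // terminates: callers keep load <= 1/2
            int base = i * SLOT_BYTES;
            if (buf.getInt(base) == EMPTY) {
                buf.putInt(base, USED);
                buf.putInt(base + 4, key);
                buf.putInt(base + 8, value);
                return true;
            }
            if (buf.getInt(base + 4) == key) {
                buf.putInt(base + 8, value);
                return false;
            }
            i = (i + 1) % slots;
        }
    }
}
```

For simplicity the sketch allocates the new page directly rather than obtaining it from a page source; because the table's entries live in the direct buffer, they add no per-entry objects to the garbage-collected heap.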
In certain example embodiments, a computer system is provided. A plurality of computer nodes are provided, and an application is executable across the plurality of computer nodes in a Java Virtual Machine (JVM) environment. Each computer node comprises at least one processor; memory management software; and an off-heap direct memory data storage area dynamically allocated and directly managed by the memory management software of the associated computer node, with the off-heap direct memory data storage area being scalable upon direction from the memory management software of the associated computer node to accommodate terabytes-worth of data so that that data stored in the off-heap direct memory data storage area is providable therefrom without having to repeatedly access that data from a non-transitory computer readable storage medium or a network storage location.
In certain example embodiments, a system is provided. An application is executable on at least one computer. A server array of independently scalable coordinated memory managers and associated data storage nodes also is provided. Each said data storage node comprises a non-transitory computer readable storage medium tangibly storing data usable by the application. Each said memory manager comprises: at least one processor, and off-heap memory dynamically allocated and directly managed by the memory manager. The off-heap memory is scalable upon direction from the memory manager to accommodate terabytes-worth of data so that that data stored in the off-heap memory is providable from the off-heap memory without having to repeatedly access that data from the non-transitory computer readable storage medium of the node. The at least one computer includes program logic configured to automatically initiate a request for data from the server array when required data is not present in cache on the at least one computer, the request being transparent to the application.
According to certain example embodiments, the at least one computer may include a plurality of computers and the application may be executable across the plural computers.
According to certain example embodiments, each said computer may have its own memory manager for creating and managing an off-heap direct memory storage area thereon. For instance, according to certain example embodiments, each computer may include at least one processor; memory; computer-specific memory management software; and computer-specific off-heap direct memory data storage area dynamically allocated and directly managed by the computer-specific memory management software of the associated computer, with the computer-specific off-heap direct memory data storage area being scalable upon direction from the computer-specific memory management software of the associated computer to accommodate an amount of data up to the size of the memory of the associated computer.
It also is noted that certain example embodiments relate to methods of operating the various systems, memory managers/memory management software components, etc.
These features, aspects, advantages, and example embodiments may be used separately and/or applied in various combinations to achieve yet further embodiments of this invention.