Field of the Invention
Embodiments of the invention generally relate to data analysis and, more specifically, to techniques for providing a shared cache as a zero copy memory mapped database.
Description of the Related Art
Some programming languages provide an execution environment that includes memory management services for applications. That is, the execution environment manages application memory usage. The operating system provides each process, including the execution environment, with a dedicated memory address space. The execution environment, in turn, assigns a memory address space in which to execute the application. The total addressable memory limits how many processes may execute concurrently and how much memory the operating system may provide to any given process.
In some data analysis systems, applications perform queries against a large common data set, e.g., an application that performs financial analyses on a common investment portfolio. In such a case, the financial analysis application may repeatedly load portions of the entire data set into the application's memory, or the application may load the entire expected data set. Frequently, even if multiple applications analyze the same data set, the data is loaded separately into the memory address space of each application. Doing so takes time and system resources, which increases system latency and affects overall system performance. The amount of memory in a system limits the number of execution environment processes that can run concurrently with a memory address space large enough to allow the application to load an entire expected data set.
The scalability of the system is therefore limited as the expected data set grows, because the system must either reduce the number of applications that can run concurrently or increase the rate at which portions of the expected data set are loaded. In either case, overall system performance degrades.
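By way of illustration only, the scaling limit described above can be sketched with simple arithmetic. The following Python sketch assumes that each process holds a private copy of the entire expected data set plus a fixed per-process overhead. The function name, the 128 GB memory figure, the 1 GB overhead, and the data set sizes are hypothetical values chosen for the example and are not taken from any particular system.

```python
def max_concurrent_processes(total_memory_gb, dataset_gb, overhead_gb=1):
    """Estimate how many processes fit in memory when each process
    loads its own private copy of the data set.

    All parameters are hypothetical, illustrative quantities in gigabytes.
    """
    per_process_gb = dataset_gb + overhead_gb
    # Integer division: a process that only partially fits cannot run.
    return int(total_memory_gb // per_process_gb)


if __name__ == "__main__":
    total_memory_gb = 128  # assumed system memory, for illustration
    for dataset_gb in (4, 8, 16, 32):
        count = max_concurrent_processes(total_memory_gb, dataset_gb)
        print(f"data set {dataset_gb} GB: at most {count} concurrent processes")
```

Under these assumptions, each doubling of the expected data set roughly halves the number of processes that can run concurrently, which is the trade-off noted above.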