Many Web 2.0 and Software as a Service (SaaS) applications rely heavily on user-created content. This reliance drives the need for (a) efficient and reliable scaling technologies for supporting rapid data growth; and (b) better storage and retrieval technology. Much of this user-created content requires only a primary key for store and retrieve commands, rather than the complex querying and management functionality offered by traditional Relational Database Management Systems (RDBMSs). The excess RDBMS functionality involves expensive hardware and highly skilled personnel, typically making it unsuitable for these types of applications. In addition, RDBMS replication capabilities are limited and typically favor consistency over performance and availability. Despite many developments in recent years, scaling out a relational database is still very complex.
In recent years, NoSQL (Not Only SQL) database management systems (also referred to as non-relational databases or unstructured databases) have emerged in order to address these RDBMS deficiencies. NoSQL is a broad class of database management systems that can differ from classic RDBMSs in some significant ways: (1) there are no inherent relations between stored objects; (2) the data stores may not require fixed table schemas; and (3) NoSQL avoids join operations and typically scales horizontally.
In-memory non-relational databases are a subset of NoSQL databases, and are designed so that all of (or a major part of) the users' dataset is stored in RAM. In-memory non-relational databases are usually two to three orders of magnitude faster (in terms of throughput and latency) than RDBMSs and an order of magnitude faster than other NoSQL databases.
Among the in-memory non-relational databases, the open-source Memcached was the first to emerge, intending to solve many of the RDBMS issues with read operations by adding a simple distributed key-value caching system to the RDBMS. However, Memcached does not include a data-management layer, and therefore provides no support for high availability or data persistence. In addition, during scaling events, Memcached loses all, or a significant part, of its data.
Redis, an emerging open-source in-memory non-relational database, improves on Memcached's offering by supporting write operations, persistent storage, and high availability, using a data-management layer for the stored objects. But Redis is built over a single-master, multi-slave architecture, and therefore suffers from master scaling problems.
Furthermore, due to the relatively high price of RAM resources (as of July 2011, RAM prices are ~300 times higher than HDD (Hard Disk Drive) and ~30 times higher than SSD (Solid State Disk)), in-memory non-relational databases are very expensive.
Accordingly, there is a need for improved mechanisms for providing in-memory non-relational databases.
Summary
Systems, methods, and media for providing in-memory non-relational databases are provided. In some embodiments, methods for providing an in-memory, non-relational database are provided, the methods comprising: providing a first control process that executes in a hardware processor; providing a first server process that executes in a hardware processor, that responds to write requests by storing objects in an in-memory, non-relational data store, and that responds to read requests by providing objects from the in-memory, non-relational data store, wherein the objects each have an object size; forming a plurality of persistent connections between the first control process and the first server process; using the first control process, pipelining, using a pipeline having a pipeline size, requests that include the read requests and the write requests over at least one of the plurality of persistent connections; using the first control process, adjusting the number of persistent connections in the plurality and the pipeline size based on an average of the object sizes; and using the first control process, prioritizing requests by request type based on anticipated load from the requests.
In some embodiments, non-transitory computer-readable media containing computer-executable instructions that, when executed by a processor, cause the processor to perform a method for providing an in-memory, non-relational database are provided, the method comprising: providing a first control process that executes in a hardware processor; providing a first server process that executes in a hardware processor, that responds to write requests by storing objects in an in-memory, non-relational data store, and that responds to read requests by providing objects from the in-memory, non-relational data store, wherein the objects each have an object size; forming a plurality of persistent connections between the first control process and the first server process; using the first control process, pipelining, using a pipeline having a pipeline size, requests that include the read requests and the write requests over at least one of the plurality of persistent connections; using the first control process, adjusting the number of persistent connections in the plurality and the pipeline size based on an average of the object sizes; and using the first control process, prioritizing requests by request type based on anticipated load from the requests.
In some embodiments, systems for providing in-memory non-relational databases are provided, the systems comprising: at least one hardware processor that: executes a first control process; executes a first server process that responds to write requests by storing objects in an in-memory, non-relational data store, and that responds to read requests by providing objects from the in-memory, non-relational data store, wherein the objects each have an object size; forms a plurality of persistent connections between the first control process and the first server process; uses the first control process to pipeline, using a pipeline having a pipeline size, requests that include the read requests and the write requests over at least one of the plurality of persistent connections; uses the first control process to adjust the number of persistent connections in the plurality and the pipeline size based on an average of the object sizes; and uses the first control process to prioritize requests by request type based on anticipated load from the requests.
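As a rough illustration only, the control-process behavior summarized above (pipelining, tuning the connection count and pipeline size from the average object size, and prioritizing by request type) might be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation of any embodiment: the thresholds, the connection and pipeline values, the direction of the tuning policy, the load ranking of request types, and the names `tune`, `ControlProcess`, `submit`, and `next_batch` are all hypothetical. No network connections are opened; the sketch models only the queueing and tuning decisions.

```python
import heapq
from itertools import count

# Hypothetical thresholds; the text above does not specify any values.
SMALL_OBJECT_BYTES = 1024
LARGE_OBJECT_BYTES = 64 * 1024

# Hypothetical anticipated-load rank per request type (lower is served first).
ANTICIPATED_LOAD = {"GET": 0, "SET": 1, "MGET": 2, "KEYS": 3}


def tune(avg_object_size):
    """Return (num_connections, pipeline_size) for an average object size.

    Assumed policy: small objects get deep pipelines on few connections;
    large objects get shallow pipelines spread over more connections.
    """
    if avg_object_size <= SMALL_OBJECT_BYTES:
        return 2, 64
    if avg_object_size >= LARGE_OBJECT_BYTES:
        return 16, 4
    return 8, 16


class ControlProcess:
    """Queues requests, tracks average object size, and builds pipelines."""

    def __init__(self):
        self._queue = []
        self._seq = count()  # tie-breaker: FIFO order within a request type
        self._total_bytes = 0
        self._objects_seen = 0
        self.num_connections, self.pipeline_size = tune(0)

    def submit(self, request_type, key, value=b""):
        # Unknown request types are ranked after all known ones.
        fallback = max(ANTICIPATED_LOAD.values()) + 1
        priority = ANTICIPATED_LOAD.get(request_type, fallback)
        heapq.heappush(
            self._queue, (priority, next(self._seq), request_type, key, value)
        )
        if request_type == "SET":
            # Update the running average of written object sizes and retune.
            self._total_bytes += len(value)
            self._objects_seen += 1
            avg = self._total_bytes / self._objects_seen
            self.num_connections, self.pipeline_size = tune(avg)

    def next_batch(self):
        """Pop up to pipeline_size requests, lowest anticipated load first."""
        batch = []
        while self._queue and len(batch) < self.pipeline_size:
            _, _, request_type, key, value = heapq.heappop(self._queue)
            batch.append((request_type, key, value))
        return batch
```

In this sketch, a batch returned by `next_batch` stands in for one pipeline's worth of requests that would be written to a single persistent connection before any reply is read. Any real policy would need to be derived from measurement rather than from the fixed values assumed here.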