Increasingly, organizations are generating business value by performing analytics on data produced by both machines and humans. Not only are more types of data being analyzed by analytics frameworks, but users increasingly expect real-time responses to their analytic queries. Fraud detection systems, enterprise supply chain management systems, mobile location-based services, and multi-player gaming systems are examples of applications that rely on real-time analytics capabilities.
In these systems, both transaction management and analytics-related query processing are generally performed on the same copy of the data. These applications typically have very large working sets and generate millions of transactions per second. Because many of them cannot tolerate significant network and disk latencies, they employ main-memory architectures on the application host computing device so that the entire working set fits in memory, as in an in-memory database.
Even though entire working sets are often stored in main memory, many applications also keep a copy of their data off the application host computing device to protect against failure of that device. Typically, these copies are stored on disk- or flash-based storage server devices because such media are less expensive than, for example, main memory that may be available in a peer device. The result is often a bifurcation: Input/Output Operations Per Second (IOPS)-optimized data management at the application host computing device and capacity-optimized data management at the backend disk- or flash-based storage server devices.
This bifurcation creates a mismatch between the fine-grained data management model on the application host computing device and the block-optimized data management model of the backend disk- or flash-based storage server devices. As a result, many application and middleware developers are forced to map their in-memory fine-grained data structures onto intermediate, block-I/O-friendly data structures.
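As a rough illustration of this mapping, the hypothetical sketch below packs small variable-sized records into fixed-size block buffers so they can be written with block-granularity I/O. The 4 KB block size and the length-prefixed record format are illustrative assumptions, not details taken from any particular system.

```python
import struct

BLOCK_SIZE = 4096  # assumed block size; real devices and systems vary


def pack_records(records):
    """Pack small (key, value) byte records into fixed-size blocks.

    Each record is stored as: 2-byte key length, 2-byte value length,
    key bytes, value bytes. Records never straddle a block boundary,
    so a whole record can be read or rewritten with a single block I/O.
    """
    blocks, current = [], bytearray()
    for key, value in records:
        encoded = struct.pack(">HH", len(key), len(value)) + key + value
        if len(encoded) > BLOCK_SIZE:
            raise ValueError("record larger than a block")
        if len(current) + len(encoded) > BLOCK_SIZE:
            # current block is full: pad to block size and start a new one
            blocks.append(bytes(current.ljust(BLOCK_SIZE, b"\x00")))
            current = bytearray()
        current += encoded
    if current:
        blocks.append(bytes(current.ljust(BLOCK_SIZE, b"\x00")))
    return blocks


blocks = pack_records([(b"user:1", b"alice"), (b"user:2", b"bob")])
assert all(len(b) == BLOCK_SIZE for b in blocks)
```

The padding step is the essence of the mismatch: fine-grained records occupy only a fraction of each block, yet every read or write must move a full block.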
The in-memory data structures reside in memory pages that are, in turn, mapped to disk blocks using data structures such as B-trees, which are designed to localize updates within a block in order to minimize random I/Os to the disk-based storage server devices. To perform this mapping and back data up to persistent storage, the entire memory page in which the updated bytes reside must be transferred, which is undesirable and often consumes a significant amount of time, bandwidth, and other resources.
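The cost of this page-granularity transfer can be sketched as follows; the 4 KB page size, the toy store, and the update path are illustrative assumptions used only to make the write amplification concrete.

```python
PAGE_SIZE = 4096  # assumed page size; commonly 4 KB on many platforms


class PageStore:
    """Toy page-granularity backing store that tracks bytes shipped out."""

    def __init__(self, num_pages):
        self.pages = [bytearray(PAGE_SIZE) for _ in range(num_pages)]
        self.bytes_transferred = 0

    def update(self, offset, data):
        """Update a few bytes in memory, then persist the page(s) they touch.

        Even though only len(data) bytes changed, every page containing
        any of those bytes must be transferred to the backend in full.
        """
        first = offset // PAGE_SIZE
        last = (offset + len(data) - 1) // PAGE_SIZE
        # apply the fine-grained in-memory update
        for i, b in enumerate(data):
            page, pos = divmod(offset + i, PAGE_SIZE)
            self.pages[page][pos] = b
        # "flush" whole pages to the backing store
        self.bytes_transferred += (last - first + 1) * PAGE_SIZE


store = PageStore(num_pages=16)
store.update(offset=10_000, data=b"\x01" * 8)  # an 8-byte update
print(store.bytes_transferred)  # 4096 bytes moved: 512x write amplification
```

An 8-byte modification forces a full 4096-byte page transfer, a 512x write amplification; this overhead is precisely what motivates fine-grained remote persistence.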