1. Field of the Invention
The present invention generally relates to parallel processing in large-scale information systems. More specifically, the present invention provides a method (and system) for addressing inconsistency in large-scale information systems without synchronization.
2. Description of the Related Art
Computationally demanding business analytics applications will soon run on systems with petabytes of memory and tens of thousands of processor cores. These applications need to respond rapidly to high-volume, high-velocity, concurrent request streams containing both queries and updates. Typical synchronization techniques are prohibitively expensive when scaling to these levels of parallelism. Even if synchronization operations themselves were to remain efficient, Amdahl's law would limit scalability, because the portion of the work that synchronization serializes bounds the achievable speedup regardless of the number of cores. Without synchronization, however, errors will arise from data races.
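The Amdahl's law limit mentioned above can be illustrated with a short calculation. The following is only a sketch: the function name `amdahl_speedup` and the 1% serialized fraction are assumptions chosen for illustration, not figures taken from any measured system.

```python
def amdahl_speedup(serial_fraction: float, cores: int) -> float:
    """Upper bound on speedup when a fixed fraction of the work is serialized.

    Amdahl's law: speedup = 1 / (s + (1 - s) / N), where s is the serialized
    fraction of the work and N is the number of processor cores.
    """
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cores)


# Assumed for illustration: 1% of the work is serialized by synchronization.
SERIAL_FRACTION = 0.01

for cores in (16, 1024, 32768):
    bound = amdahl_speedup(SERIAL_FRACTION, cores)
    print(f"{cores:>6} cores: speedup bound {bound:.1f}")
```

Under this assumption the bound approaches, but never exceeds, 1 / 0.01 = 100, no matter how many cores are added.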
Business analytics, broadly, is a source of ever more challenging performance and scalability requirements. Within business analytics, financial performance management is one example of a demanding business application domain. On a regular basis (e.g., quarterly), enterprises engage in financial planning, forecasting, and budgeting. End-users explore the enterprise's finances, compare plans to actual results, construct financial models, and experiment with what-if scenarios, refining the plan in an iterative fashion. As deadlines approach in a globally distributed enterprise, thousands of users could be collaborating remotely. Some users review continually updated, high-level consolidated financial performance measures aggregated from tens of gigabytes of data. Other users perform localized, detailed data entry and review, continually refining the data as they strive to meet targets. Ultimately, they all converge to a finalized plan.
One advanced technology that can support this kind of financial performance management scenario is interactive (or “real-time”) concurrent update (or “write back”) in-memory Multidimensional Online Analytic Processing (“MOLAP”) cube technology. Distinguishing features of this technology compared to other forms of Online Analytic Processing (“OLAP”) technology, such as Relational Online Analytic Processing (“ROLAP”), Hybrid Online Analytic Processing (“HOLAP”), and other known techniques, include: the ability for users to change previously entered cube data values on the fly and to enter new data values into the cube; the ability to have those changes reflected in computed or aggregated data values whenever the user next requests that data values be recalculated; query response times fast enough for interactive use; and the ability to allow multiple users to access a set of linked cubes concurrently. To support these features, the implementation keeps all data in memory, does not pre-compute aggregations or materialized views, calculates and caches aggregated and computed values on demand, and employs highly efficient and compact data structures to represent the highly multidimensional and sparse data sets.
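The combination of sparse in-memory storage, write back, and on-demand cached aggregation described above might be sketched as follows. This is a toy model under stated assumptions, not the structure of any actual MOLAP product: the class `SparseCube`, its methods, the use of `None` as an "all members" wildcard, and the coarse whole-cache invalidation are all hypothetical simplifications.

```python
from typing import Dict, Optional, Tuple

# A query coordinate; None means "aggregate over all members of that dimension".
Coord = Tuple[Optional[str], ...]


class SparseCube:
    """Hypothetical toy cube: only populated leaf cells are stored."""

    def __init__(self, n_dims: int) -> None:
        self.n_dims = n_dims
        self.leaves: Dict[Tuple[str, ...], float] = {}  # sparse leaf cells
        self.cache: Dict[Coord, float] = {}             # filled on demand

    def write_back(self, coord: Tuple[str, ...], value: float) -> None:
        """Enter or change a leaf value, then drop cached aggregates."""
        if len(coord) != self.n_dims:
            raise ValueError("coordinate has the wrong number of dimensions")
        self.leaves[coord] = value
        self.cache.clear()  # coarse invalidation, a simplification

    def query(self, coord: Coord) -> float:
        """Return an aggregate, computing and caching it only when requested."""
        if coord in self.cache:
            return self.cache[coord]
        total = sum(
            value
            for key, value in self.leaves.items()
            if all(c is None or c == member for c, member in zip(coord, key))
        )
        self.cache[coord] = total
        return total


cube = SparseCube(n_dims=3)  # dimensions assumed: period, region, account
cube.write_back(("2024Q1", "east", "revenue"), 100.0)
cube.write_back(("2024Q1", "west", "revenue"), 250.0)
cube.write_back(("2024Q1", "west", "cost"), 90.0)

revenue_before = cube.query(("2024Q1", None, "revenue"))  # east + west revenue
cube.write_back(("2024Q1", "east", "revenue"), 120.0)     # user edits a cell
revenue_after = cube.query(("2024Q1", None, "revenue"))   # recomputed on demand
```

Nothing is aggregated until a query asks for it, and a write back simply discards cached results so that the next query recomputes them from the current leaf values.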
A problem with this technology arises when multiple threads simultaneously update the data and caches of a cube. In this situation, the cached value of a cell in the cube may become stale, that is, permanently out of date.
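One interleaving that could leave a cached value stale is sketched below. The text does not specify the exact mechanism, so this ordering is an assumption, and the class `TinyCube` and its methods are hypothetical. The two "threads" are replayed step by step in a single thread so that the outcome is reproducible.

```python
class TinyCube:
    """Hypothetical toy cube with leaf cells and one cached aggregate."""

    def __init__(self, leaves):
        self.leaves = dict(leaves)  # leaf cell values
        self.cache = {}             # aggregate name -> cached value

    def write_leaf(self, key, value):
        """Update a leaf and invalidate the dependent aggregate, if cached."""
        self.leaves[key] = value
        self.cache.pop("total", None)

    def compute_total(self):
        """Compute the aggregate from the current leaf values."""
        return sum(self.leaves.values())

    def store_total(self, value):
        """Store a previously computed aggregate in the cache."""
        self.cache["total"] = value


cube = TinyCube({"east": 10, "west": 20})

# Reader, step 1: cache miss, so it computes the total from the current leaves.
pending = cube.compute_total()

# Writer runs in between: it updates a leaf and invalidates the cache,
# which is still empty, so the invalidation has no effect.
cube.write_leaf("east", 15)

# Reader, step 2: it stores the total it computed from the old leaf value.
cube.store_total(pending)

stale = cube.cache["total"]   # value served to later queries
fresh = cube.compute_total()  # value the leaves actually imply
print(f"cached total = {stale}, correct total = {fresh}")
```

Because the invalidation has already happened, nothing in this sketch removes the outdated cache entry unless another write to the same cube occurs.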
Conventionally, this problem is solved by locking. While locking prevents stale cached values, it slows the system and prevents scaling by limiting concurrent access to the set of linked cubes.
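A minimal sketch of the conventional locking approach is given below, assuming a single coarse lock per cube. The class `LockedCube` and its methods are hypothetical; real systems may use finer-grained or reader-writer locks, which the text does not detail.

```python
import threading


class LockedCube:
    """Hypothetical toy cube guarded by one coarse lock."""

    def __init__(self, leaves):
        self._lock = threading.Lock()
        self._leaves = dict(leaves)
        self._cache = {}

    def write_leaf(self, key, value):
        # The update and the cache invalidation happen as one atomic step.
        with self._lock:
            self._leaves[key] = value
            self._cache.pop("total", None)

    def read_total(self):
        # Computing and caching the aggregate is also atomic, so a writer
        # cannot slip in between the computation and the cache store.
        with self._lock:
            if "total" not in self._cache:
                self._cache["total"] = sum(self._leaves.values())
            return self._cache["total"]

    def leaf_sum(self):
        # Recompute directly from the leaves, bypassing the cache.
        with self._lock:
            return sum(self._leaves.values())


cube = LockedCube({"east": 10, "west": 20})


def worker(worker_id):
    for i in range(1000):
        cube.write_leaf("east", worker_id * 1000 + i)
        cube.read_total()


threads = [threading.Thread(target=worker, args=(n,)) for n in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

final_total = cube.read_total()
```

The cache can no longer go stale, but every query and every update now waits on the same lock, so the threads effectively take turns and additional cores provide little benefit.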