On-Line Analytical Processing (OLAP) generally refers to a technique of providing fast analysis of shared multi-dimensional information stored in a database. OLAP systems provide a multi-dimensional conceptual view of data, including full support for hierarchies and multiple hierarchies. This framework is used because it is the most logical way to analyze businesses and organizations.
Unfortunately, it is difficult to handle large volumes of multi-dimensional information in a computer. The first problem is one of size. Multi-dimensional information is typically very large, and although small volumes of multi-dimensional data can be handled in Random Access Memory (RAM), this technique does not work for large problems. Because OLAP is used for interactive analysis, it must also respond very rapidly to queries, even as the data volumes grow.
The second problem is that multi-dimensional data is almost always sparse. In fact, in large multi-dimensional applications, it is not unusual to have only one cell populated for every million cells that are defined.
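The scale of the sparsity problem can be illustrated with a short calculation. The following is a minimal sketch only: the dimension names, their sizes, and the 8-byte cell width are assumptions chosen for illustration, and the one-in-a-million density is taken from the figure quoted above.

```python
from math import prod

# Hypothetical dimension sizes, chosen only to illustrate the arithmetic.
dimension_sizes = {
    "product": 10_000,
    "customer": 10_000,
    "period": 100,
    "measure": 10,
}

# Every combination of dimension members defines one cell.
total_cells = prod(dimension_sizes.values())

# One populated cell per million defined cells.
populated_cells = total_cells // 1_000_000

# Assumed storage cost of one numeric cell (an 8-byte float).
BYTES_PER_CELL = 8

dense_bytes = total_cells * BYTES_PER_CELL       # uncompressed array
useful_bytes = populated_cells * BYTES_PER_CELL  # populated values only

print(f"defined cells:   {total_cells:,}")
print(f"populated cells: {populated_cells:,}")
print(f"dense array:     {dense_bytes / 1e9:,.0f} GB")
print(f"actual values:   {useful_bytes / 1e6:,.1f} MB")
```

Under these assumptions, a simple uncompressed array would reserve hundreds of gigabytes in order to hold less than a megabyte of actual values.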
The sparsity problem discourages the storage of data in simple, uncompressed arrays, except in certain special cases. The simplest way of dealing with sparse data might seem to be to store only cells containing data in some indexed form. However, this approach has two problems. First, the index and keys are likely to take much more space than the data, and searching them is relatively time-consuming. Second, access will be inefficient because multi-dimensional data is often clustered, consisting of regions of relatively dense data separated by large, extremely sparse or totally empty sections. As a result, related data cells are unlikely to be placed physically close to each other.
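The indexed approach described above, and its key overhead, might be sketched as follows. This is an illustrative sketch, not any particular product's storage format: the four-coordinate key layout, the 32-bit coordinate width, and the helper name `get_cell` are all assumptions.

```python
import struct

# Assumed on-disk layout: four 32-bit integer coordinates per key,
# and one 64-bit float per stored value.
KEY_FORMAT = "<4i"
VALUE_FORMAT = "<d"

# Only populated cells are stored, keyed by their full coordinates
# (product, customer, period, measure). The entries are made up.
cells = {
    (17, 4021, 3, 0): 125.0,
    (17, 4021, 4, 0): 130.5,
    (9000, 12, 99, 1): 42.0,
}

def get_cell(coords):
    """Return the stored value, or None if the cell is empty."""
    return cells.get(coords)

key_bytes = struct.calcsize(KEY_FORMAT)      # bytes spent on each key
value_bytes = struct.calcsize(VALUE_FORMAT)  # bytes spent on each value
overhead_ratio = key_bytes / value_bytes

print(get_cell((17, 4021, 3, 0)))
print(get_cell((0, 0, 0, 0)))
print(f"key bytes per value byte: {overhead_ratio}")
```

Even in this simplified layout the keys occupy more space than the values they locate, before any index structure is counted. A hash-based mapping such as this one also gives no guarantee that neighboring cells, such as the two consecutive periods above, are stored near each other.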
A typical prior art approach to these problems is to break the data into smaller, denser multi-dimensional objects. Some techniques do this implicitly, presenting all the data to the user in what is known as a "hyper-cube" format in which all the data in the application appears to be in a single multi-dimensional structure. Other techniques do it explicitly in what is known as the "multi-cube" approach, in which the multi-dimensional database consists of a number of separate objects, usually with different dimensions. That is, the database is segmented into a set of multi-dimensional structures, each of which is composed of a subset of the overall number of dimensions in the database. Each segmented structure might be, for example, a set of variables or accounts, each dimensioned by just the dimensions that apply to that variable.
It is also possible to identify two main types of multi-cubes. Block multi-cubes use orthogonal dimensions so there are no special dimensions at the data level. A cube may consist of any number of the defined dimensions, and both measures and time are treated as ordinary dimensions, just like any other. Series multi-cubes treat each variable as a separate cube (often a time series), with its own set of distinct dimensions.
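The distinction between the two presentations might be sketched as follows, with each variable stored as its own small cube in the manner of a series multi-cube, and a single lookup function making the data appear to share one set of dimensions. All names here (`ALL_DIMENSIONS`, `multi_cube`, `hyper_cube_lookup`) and the sample values are hypothetical and are not drawn from any specific prior art system.

```python
# The full set of dimensions defined in the hypothetical database.
ALL_DIMENSIONS = ("product", "region", "period")

# Multi-cube storage: each variable is a separate cube that uses
# only the dimensions that apply to it.
multi_cube = {
    "sales": {
        "dims": ("product", "region", "period"),
        "cells": {("widget", "east", "2024-01"): 100.0},
    },
    "list_price": {
        "dims": ("product",),  # price does not vary by region or period
        "cells": {("widget",): 9.99},
    },
}

def hyper_cube_lookup(variable, coords):
    """Look up a cell using coordinates over ALL_DIMENSIONS.

    The coordinates are projected onto the dimensions of the
    variable's own cube, so the caller sees one uniform structure.
    Returns None for an empty cell.
    """
    cube = multi_cube[variable]
    key = tuple(coords[dim] for dim in cube["dims"])
    return cube["cells"].get(key)

coords = {"product": "widget", "region": "east", "period": "2024-01"}
print(hyper_cube_lookup("sales", coords))
print(hyper_cube_lookup("list_price", coords))
```

In this sketch the `list_price` cube holds one cell per product rather than one per product, region, and period, which is where the storage advantage of the multi-cube layout comes from.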
In general, multi-cubes are more versatile, but hyper-cubes are easier to understand. End-users relate better to hyper-cubes because of their higher-level view. Multi-cubes provide greater tunability and flexibility. In addition, multi-cubes are a more efficient way of storing very sparse data.
It would be highly desirable to develop an OLAP technique that combines the conceptual benefits of hyper-cubes with the processing benefits (e.g., versatility, tunability, and flexibility) of multi-cubes.