A database is a collection of data. A user typically stores, retrieves and modifies the data in a database using a computer program, such as a database management system (DBMS).
One type of DBMS is a relational data base management system, which stores information in tables. A table is a series of intersecting rows and columns. The rows of a table typically represent records, which are collections of information about particular items, and the columns typically represent fields, which specify particular attributes of a record—e.g., a particular type of data that is contained in each field of the record. Each field contains the data having the particular attribute for the intersecting row and column.
The data stored in the tables of a relational database is commonly accessed and retrieved using a query and analysis tool. For example, a user can use the tool to perform specific operations on the tables, rows, and on individual data elements. One type of operation is an aggregation operation, and one such aggregation function is called a distinct count.
Online Analytical Processing (OLAP) generally refers to a category of software tools that provides analysis of data stored in a database. Users use the OLAP tools to analyze different dimensions of multimedia data, such as time series and trend analysis. In the business domain, OLAP is typically used to analyze transactional metrics, which are those metrics measured over an interval of time, and snapshot metrics, which are those metrics measured at a particular point of time. Common examples of these metrics can be found in the financial statements of companies, such as a profit and loss statement and a balance sheet. Other examples of snapshot measures include (1) number of open accounts by date for a bank, (2) number of open bugs by date for a quality application, (3) inventory by part for a supply chain application, (4) traditional interval measures with a known start date—e.g., revenue to date—can also be modeled as snapshot measures, etc.
These measures present unique challenges in data modeling. First, these measures are typically voluminous if stored completely, and its size can grow very rapidly. For example, even though the number of transactions for an organization that does no business for a year will not grow, its snapshots will continue to have records added to them with the passing of time. Second, the added size of the number of records usually adds to query time, as well as to extract, transform and load (ETL) times. ETL is a process in data warehousing that involves extracting data from outside resources, transforming it to fit business needs, and loading it into a data warehouse. Third, the measures are not typically cumulative. Although databases suitably handle sum, counts and averages, non-cumulative measures require special handling by the OLAP tools.
Conventional approaches are not yet able to efficiently model snapshots.