The present invention relates generally to processing data in a data warehouse, and more specifically, to a method and system that allows a user to perform operations on subsets of data that are not defined by the structure of the data warehouse.
Contemporary database environments store and manage data, such as sales and inventory data for products sold by a retailer. Such a database environment may include a data storage system responsible for storing the data, with data encryption, data redundancy, backup facilities, data compression, or the like, as specified by an administrator. The database environment may also include a user interface that allows one or more users to access and process the data in the data storage system in a meaningful way (e.g., for preparing sales reports, inventory reports, etc.).
The representation of data displayed to a user may differ significantly from the logical or physical representation of the same data elsewhere within the database environment. The database environment thus serves, on the one hand, to store the data in an efficient and secure manner and, on the other hand, to process and present the data to a user in a meaningful way without overly encumbering the user.
Database environments typically include one or more data tables, i.e., logical arrays of data, with each data table containing a plurality of identically structured records. For example, a data table relating to sales transaction data may include a plurality of rows, with each row containing the data associated with an individual record, such as an individual sales transaction. In such cases, a row of data is synonymous with a record of data. Each column of the data table typically contains data relating to an aspect of a record. In the given example, each column contains data relating to an aspect of the respective sales transaction, for example, the date of the transaction, the time of the transaction, the location, the sold product, the sales price, etc. In cases where the data in a particular column is selected from a limited set of values, where the set of values is defined in the database environment (e.g., a set of product identification codes or date identification codes), the column and the data therein are typically termed a “dimension.” In cases where the data in a particular column is selected from an essentially continuous range of values (e.g., a sales price), the column and the data therein are typically termed a “measure.” Typically, the database environment will comprise a plurality of data tables, each relating to a different type of data, such as sales transactions, inventory, customers, etc.
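By way of illustration only, the relationship between rows, columns, dimensions, and measures described above may be sketched as follows. The sketch is not drawn from any particular database environment; the names `SalesTransaction`, `SALES_TABLE`, `PRODUCT_CODES`, and `total_by_dimension`, as well as the sample values, are hypothetical and chosen solely for the example.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical limited set of values defined for the "product" dimension.
PRODUCT_CODES = {"P100", "P200"}


@dataclass(frozen=True)
class SalesTransaction:
    """One row (record) of a hypothetical sales transaction table."""
    date: str           # dimension: date identification code
    location: str       # dimension: store location
    product: str        # dimension: product identification code
    sales_price: float  # measure: essentially continuous value


# Illustrative data table: each entry is one identically structured record.
SALES_TABLE = [
    SalesTransaction("2024-01-05", "Store A", "P100", 19.99),
    SalesTransaction("2024-01-05", "Store B", "P200", 5.50),
    SalesTransaction("2024-01-06", "Store A", "P100", 21.00),
]


def total_by_dimension(table, dimension, measure):
    """Sum a measure column over the rows that share each dimension value."""
    totals = defaultdict(float)
    for row in table:
        totals[getattr(row, dimension)] += getattr(row, measure)
    return dict(totals)


# Total sales price per product identification code.
totals_by_product = total_by_dimension(SALES_TABLE, "product", "sales_price")
```

In this sketch, a dimension such as `product` serves to group records, whereas a measure such as `sales_price` is the quantity aggregated within each group, which mirrors how a sales report might be prepared from the table.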