This invention relates to the field of database software generally, and specifically to software applications for analyzing data in a database. A database is typically one or more large sets of structured data. A database is usually associated with a software application adapted to query and update data in the database. A common type of database structure is a relational database. A relational database organizes data and the relationships between data in a set of tables, typically two-dimensional tables organized into rows and columns. SQL, a programming language defining the creation and manipulation of tables, is typically used by database applications to create, update, and query the database.
Relational databases are well suited to large databases and to quickly processing database queries. Because of this, relational databases are often used for on-line transaction processing (OLTP) applications, which often require handling millions of transactions a day, with each transaction being processed in real-time or near real-time.
In addition to processing transactions, databases can also be used to perform complex data analysis tasks. Although relational databases perform transaction processing applications efficiently, they are typically very inefficient at transforming or processing large amounts of raw data with analytical functions used for data analysis. Because of this, another type of database structure, known as On-Line Analytical Processing (OLAP), is used for data analysis applications.
OLAP databases enable users to analyze the data and look for patterns, trends, and exceptions. Whereas relational databases use tables and columns to organize their data, OLAP databases generally use dimensions and cubes as their central data structures. Cubes are simply datapoint items (e.g., Profit, Cost). Dimensions are data structures that can specify a hierarchy of items. Examples of dimensions include “Time” and “Geography,” for which “Time” might specify a hierarchy of (Year, Quarter, Month) and “Geography” might specify a hierarchy of locations, such as (Country, Region, City).
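By way of illustration only, a dimension of the kind described above might be modeled as an ordered list of levels. The following Python sketch is not drawn from any particular OLAP product; the class name `Dimension` and the method `parent_of` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Dimension:
    """Hypothetical model of a dimension: a name plus an ordered hierarchy."""
    name: str
    levels: Tuple[str, ...]  # ordered from coarsest to finest

    def parent_of(self, level: str) -> Optional[str]:
        """Return the next coarser level, or None for the top level."""
        i = self.levels.index(level)
        return self.levels[i - 1] if i > 0 else None


# The two example dimensions named in the text.
time_dim = Dimension("Time", ("Year", "Quarter", "Month"))
geography_dim = Dimension("Geography", ("Country", "Region", "City"))
```

Because the hierarchy is explicit, a tool holding such a structure can determine, for example, that Month values roll up into Quarter values without further input from the user.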
Dimensions are well adapted to allow users to define analytic calculations. An OLAP database or analysis tool can directly support many types of calculations because it knows the relationship between the items specified by dimensions. For a relational database, analysis is more difficult because data is stored as a group of unrelated columns.
In order to provide better analytical capabilities in relational databases without sacrificing performance, data analysis software, such as Oracle Discoverer, has been developed. The data analysis software provides a graphical user interface for analyzing data in a relational database. Users can quickly create, modify, and execute ad-hoc queries, reports, and graphs using the data analysis software. The data analysis software translates user input from the graphical user interface into specially-created SQL analytic functions, such as those enabled in Oracle 8i. The SQL analytic functions generically partition rows based on columns and compute the functions within those row sets. The SQL statements formulated by the data analysis application are then processed by the database, and the results are displayed in the data analysis application. In this manner, the data analysis application provides relational database users with “OLAP-type” analysis capabilities.
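As a minimal sketch of how an SQL analytic function partitions rows based on a column and computes a function within each row set, the following Python example runs two such functions against an in-memory SQLite database. SQLite is used here only as a convenient stand-in and is assumed to be version 3.25 or later (the first release with window functions); the table `sales`, its columns, and the sample rows are hypothetical and are not taken from Oracle Discoverer or Oracle 8i.

```python
import sqlite3

# Hypothetical sample data: (region, city, amount).
SAMPLE_ROWS = [
    ("East", "Boston", 100),
    ("East", "New York", 300),
    ("West", "Seattle", 200),
    ("West", "Portland", 50),
]


def regional_ranking(rows):
    """Rank each city within its region and attach the region total."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, city TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    # PARTITION BY splits the rows into one set per region; RANK and SUM
    # are then computed independently inside each set.
    query = """
        SELECT region, city, amount,
               RANK() OVER (PARTITION BY region ORDER BY amount DESC)
                   AS rank_in_region,
               SUM(amount) OVER (PARTITION BY region) AS region_total
        FROM sales
        ORDER BY region, rank_in_region
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for row in regional_ranking(SAMPLE_ROWS):
        print(row)
```

Each output row keeps its detail values (city, amount) while also carrying a value computed over its partition, which is what distinguishes an analytic function from an ordinary GROUP BY aggregate.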
The functionality introduced by the SQL analytic functions does not, in and of itself, solve the calculation requirements for data analysis software. It is essential that the data analysis tools be easy for business users to use and understand, since such users do not typically understand the usage of SQL. Data analysis software can present data to users in the form of tables or sheets having cells arranged into rows and columns. Users can rearrange the cells on a sheet, or perform filtering or pivot table operations to create different views of data in the database.
A layout specifies the relationship between the cells of the sheet and the data in the database. Typically, SQL statements are associated with the cells for retrieving and processing data from the database. As users change the layout on a sheet, the associated SQL statements often “break” from their intended functionality. This occurs most often with SQL analytic functions, which rely on complicated data partitioning to perform computations. The resulting data is either invalid or does not reflect the intentions of the user.
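One way such a break might arise can be sketched as follows, under assumptions made purely for illustration. Suppose a rank was defined while the sheet grouped rows by region, and the user then pivots the sheet to group by quarter. A statement that still partitions by region returns ranks that no longer match what the sheet displays. The table `sales`, its columns, the sample rows, and the helper names `rank_sql` and `ranks` are all hypothetical; SQLite 3.25 or later is assumed as a stand-in database.

```python
import sqlite3

# Hypothetical sample data: (region, quarter, amount).
SAMPLE_ROWS = [
    ("East", "Q1", 100),
    ("East", "Q2", 300),
    ("West", "Q1", 50),
    ("West", "Q2", 200),
]

# Only these column names may be placed into the generated SQL text.
ALLOWED_COLUMNS = {"region", "quarter"}


def rank_sql(partition_col):
    """Build a rank query whose partition follows the given layout column."""
    if partition_col not in ALLOWED_COLUMNS:
        raise ValueError(f"unknown column: {partition_col}")
    return (
        "SELECT region, quarter, amount, "
        f"RANK() OVER (PARTITION BY {partition_col} ORDER BY amount DESC) "
        "AS rnk FROM sales ORDER BY region, quarter"
    )


def ranks(partition_col):
    """Run the rank query and return only the rank column, in row order."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT, quarter TEXT, amount INTEGER)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", SAMPLE_ROWS)
    rows = conn.execute(rank_sql(partition_col)).fetchall()
    conn.close()
    return [row[3] for row in rows]


stale = ranks("region")     # statement written for the old layout
intended = ranks("quarter")  # what the pivoted sheet actually calls for
```

With this sample data the two rank columns differ, even though both statements are valid SQL and run without error, so the mismatch is silent unless the partitioning is regenerated from the current layout.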
Thus, it is desirable for the data analysis software to form correct SQL statements regardless of the layout of cells on a sheet. It is further desirable that users be able to specify complex analytical functions on a sheet without having to understand SQL.