1. Field of the Invention
This invention relates in general to database management systems performed by computers, and in particular, to data visualization of queries over joins.
2. Description of Related Art
Databases are computerized information storage and retrieval systems. A Relational Database Management System (RDBMS) is a database management system (DBMS) which uses relational techniques for storing and retrieving data. Relational databases are organized into physical tables which consist of rows and columns of data. The rows are formally called tuples. A database will typically have many physical tables and each physical table will typically have multiple tuples and multiple columns. The physical tables are typically stored on random access storage devices (RASD) such as magnetic or optical disk drives for semi-permanent storage. Additionally, logical tables or xe2x80x9cviewsxe2x80x9d can be generated based on the physical tables and provide a particular way of looking at the database. A view arranges rows in some order, without affecting the physical organization of the database.
RDBMS software using a Structured Query Language (SQL) interface is well known in the art. The SQL interface has evolved into a standard language for RDBMS software and has been adopted as such by both the American National Standards Institute (ANSI) and the International Standards Organization (ISO).
The SQL interface allows users to formulate relational operations on the tables either interactively, in batch files, or embedded in host languages, such as C and COBOL. SQL allows the user to manipulate the data. The definitions for SQL provide that a RDBMS should respond to a particular query with a particular set of data given a specified database content, but the method that the RDBMS uses to actually find the required information in the tables on the disk drives is left up to the RDBMS. Typically, there will be more than one method that can be used by the RDBMS to access the required data. The RDBMS will optimize the method used to find the data requested in a query in order to minimize the computer time used and, therefore, the cost of performing the query.
One SQL operation is a join operation. A join operation combines information from more than one table or view by appending information from one table or view to the information in the other. Rows or portions of rows from different tables or views are concatenated along the row (e.g., if a row of a first table contains xe2x80x9cabcxe2x80x9d and a row of a second table contains xe2x80x9cxyzxe2x80x9d, the join results in a row containing xe2x80x9cabc xyzxe2x80x9d). The join operation is implied in a SQL statement (i.e., there is no join keyword) by selecting data from multiple tables. Typically, a join operation has the following form:
SELECT * FROM DEPARTMENT, EMPLOYEE
WHERE DEPT_NUMBER=EMP_DEPT
The WHERE clause in the above SELECT statement indicates the columns (i.e., with DEPT_NUMBER and EMP_DEPT) through which the tables to be joined (i.e., DEPARTMENT and EMPLOYEE) are linked. DEPT_NUMBER is a primary key of the DEPARTMENT table, and EMP_DEPT is a foreign key of the EMPLOYEE table. A primary key and a foreign key, although having different names, contain the same information (i.e., department numbers), and so they can be used to link two tables.
When the join operation does not specify a conditional clause, the result is a Cartesian product, and the resulting table contains all possible combinations of rows from the joined tables. This result may contain redundant information.
The result of a join operation is typically a large, flat table that consists of rows from two tables. Some of the information may be redundant. Therefore, the join operation may result in a view of data that is not easily understood. Additionally, to join tables, current systems require that a user know of the tables that are to be joined. If a user does not know which tables may be joined, the user may not be able to perform useful join operations. Moreover, once a join is performed and the results of the join displayed, if a user wishes to perform another join or return to the original table, the user must submit another request. This may be time consuming and is not user-friendly.
Additionally, a variety of mobile devices such as PalmOS, Windows CE handheld devices, various embedded systems, and smart card, utilize a RDBMS for storing and retrieving data. These types of mobile devices have become very popular and are increasingly being used by a wide spectrum of people. Additionally, these mobile devices may contain a database. Unfortunately, these small devices have limited memory, a small display, and operate at slow speeds.
Because, the result of a join operation is typically a large, flat table and because some of the information may be redundant, the join operation may take up a great deal of memory in a small device, and, some joins may not be performed successfully. Moreover, due to the slower speed of the small device, performing a join operation on a small device is time consuming. Moreover, the flat structure is not easily displayed on a small device.
Therefore, there is a need in the art for improved data visualization for queries with joins.
To overcome the limitations in the prior art described above, and to overcome other limitations that will become apparent upon reading and understanding the present specification, the present invention discloses a method, apparatus, and article of manufacture for data visualization of queries over joins.
According to an embodiment of the invention, one or more commands are executed in a computer to perform a database operation on a relational database stored on a data store connected to the computer. Initially, a selection of a join indicator associated with a row of a table in the relational database is received. It is determined whether to perform a table level join operation or a row level join operation. The determined join operation is performed.