1. Field of the Invention
The present invention generally relates to data processing in databases and more particularly to preventing alterations of data in a database when a query is executed against the database.
2. Description of the Related Art
Databases are computerized information storage and retrieval systems. A relational database management system is a computer database management system (DBMS) that uses relational techniques for storing and retrieving data. The most prevalent type of database is the relational database, a tabular database in which data is defined so that it can be reorganized and accessed in a number of different ways.
Regardless of the particular architecture, in a DBMS, a requesting entity (e.g., an application or the operating system) demands access to a specified database by issuing a database access request. Such requests may include, for instance, simple catalog lookup requests or transactions and combinations of transactions that operate to read, change and add specified records in the database. These requests are made using high-level query languages such as the Structured Query Language (SQL). Illustratively, SQL is used to make interactive queries for getting information from and updating a database such as International Business Machines' (IBM) DB2, Microsoft's SQL Server, and database products from Oracle, Sybase, and Computer Associates. The term “query” denominates a set of commands for retrieving data from a stored database. Queries take the form of a command language that lets programmers and programs select, insert, update, find out the location of data, and so forth.
One of the issues faced by data mining and database query applications, in general, is their close relationship with a given database schema (e.g., a relational database schema). This relationship makes it difficult to support an application as changes are made to the corresponding underlying database schema. Further, the migration of the application to alternative underlying data representations is inhibited. In today's environment, the foregoing disadvantages are largely due to the reliance applications have on SQL, which presumes that a relational model is used to represent information being queried. Furthermore, a given SQL query is dependent upon a particular relational schema since specific database tables, columns and relationships are referenced within the SQL query representation. As a result of these limitations, a number of difficulties arise.
One difficulty is that changes in the underlying relational data model require changes to the SQL foundation that the corresponding application is built upon. Therefore, an application designer must either forgo changing the underlying data model to avoid application maintenance or must change the application to reflect changes in the underlying relational model. Another difficulty is that extending an application to work with multiple relational data models requires separate versions of the application to reflect the unique SQL requirements driven by each unique relational schema. Yet another difficulty is evolution of the application to work with alternate data representations because SQL is designed for use with relational systems. Extending the application to support alternative data representations, such as XML, requires rewriting the application's data management layer to use non-SQL data access methods.
A typical approach used to address the foregoing problems is software encapsulation. Software encapsulation involves using a software interface or component to encapsulate access methods to a particular underlying data representation. An example is found in the Enterprise JavaBean (EJB) specification that is a component of the Java 2 Enterprise Edition (J2EE) suite of technologies. In the case of EJB, entity beans serve to encapsulate a given set of data, exposing a set of Application Program Interfaces (APIs) that can be used to access this information. This is a highly specialized approach requiring the software to be written (in the form of new entity EJBs) whenever a new set of data is to be accessed or when a new pattern of data access is desired. The EJB model also requires a code update, application build and deployment cycle to react to reorganization of the underlying physical data model or to support alternative data representations. EJB programming also requires specialized skills, since more advanced Java programming techniques are involved. Accordingly, the EJB approach and other similar approaches are rather inflexible and costly to maintain for general-purpose query applications accessing an evolving physical data model.
Another shortcoming of the prior art, is the manner in which information can be presented to the user. A number of software solutions support the use of user-defined queries, in which the user is provided with a tool to construct a query that meets the user's specific data selection requirements. In an SQL-based system, the user is given a list of underlying database tables and columns to choose from when building a query. The user must decide which tables and columns to access based on the naming convention used by the database administrator. This approach does not provide an effective way to subset the set of information presented to the user. As a result, even nonessential content is revealed to the user.
Further, existing database environments do not effectively accommodate multiple users desiring to access different portions of the same physical data simultaneously. Typically, such simultaneous access is accomplished by each user using a separate application. Each application is written to expose the desired data to the respective users. Accordingly, substantial overhead in application development results in today's database environments.
Assuming a solution to the foregoing difficulties, another issue which must be addressed is the manner in which simultaneous queries and updates on shared data are handled in a database environment. A typical approach used to address this problem is the introduction of transaction processing environments. Transaction processing is the management of discrete units of work that access and update shared data. A unit of work is a sequence of associated operations which transforms a consistent state of a recoverable resource into another consistent state. For instance, the business function of a commercial application program typically involves processing many similar items, for example orders in an order processing system or seat reservations in an airline booking system. The processing of one of these items, i.e., the execution of this discrete unit of processing, is a transaction. A unit of work must either be fully completed and committed at completion of the transaction, or fully purged without action, i.e., rolled back. When a transaction is committed, all changes made by the associated requests are made permanent.
In a typical transaction processing environment, many users repeatedly process similar transactions. Transaction processing is particularly effective for the processing of unscheduled single items in unpredictable volumes and sequence, for changing data in a database to reflect updates as they are processed, and for providing immediate on-line access to data that has been updated to reflect all previous transactions.
Access to and update of shared resources with integrity is one of the basic functions and characteristics that are usually required of transaction processing systems. Very large numbers of transactions may read or cause changes to a database or data warehouse. It is important that transactions read valid data and that their updates are correctly recorded. Ensuring this is called maintaining data integrity.
Maintaining data integrity is performed in a typical transaction processing environment by specifying transaction properties and data isolation based on some physical model. For example, a particular database connection has an isolation level, and every access to any data by that connection locks the data appropriately, or an application has an isolation level and all access by that application locks the data appropriately, or an application or transaction monitor chooses the isolation level for a single SQL statement. However, both important and unimportant data are accessed by the application or connection and are locked in the same way.
Therefore, there is a need for an improved and more flexible transaction paradigm for maintaining data integrity in a database preventing alterations of queried data when a query is executed against the database.