The advent of databases and e-commerce requires the ability to request services from a variety of databases without knowing the exact implementation of the database or of the statements used to request the services. These requests are made in a non-procedural programming language, which states what is requested without providing an explicit implementation. Instead, the developers of particular databases or non-procedural programming language statements provide proprietary implementations for the statements rendered in the non-procedural language.
Structured Query Language (“SQL”) is an illustrative example of a non-procedural language. SQL differs from a procedural language like FORTRAN in that it does not specify how a particular request is carried out, but instead allows the database manager to provide the relevant details. Thus, a command in SQL merely states a request and not how it is carried out.
SQL includes: a Data Definition Language (“DDL”) for creating databases and data structures, but not necessarily data itself; a Data Manipulation Language (“DML”) facilitating database maintenance and actual operations on data; and a Data Control Language (“DCL”) for specifying security requirements. Some examples of SQL commands include the DDL commands CREATE, ALTER and DROP, DML statements and functions such as INSERT, UPDATE, DELETE, SELECT, COUNT, SUM and the like, and DCL commands such as COMMIT, ROLLBACK, GRANT and REVOKE.
SQL permits interactions with a database in an atomic manner: while a transaction is in progress, only one user may access a given unit of data, which prevents other users from changing the database between the operations constituting the transaction. The code used to implement these commands and functions is the responsibility of the database developer or vendor. Universal support for SQL commands ensures that any user can access and use a SQL compliant database regardless of the database vendor and particular implementation details.
SQL commands such as COMMIT and ROLLBACK are of interest in an exemplary embodiment of the invention. These SQL commands protect a database against inadvertent corruption. To this end, the database itself is not affected until the COMMIT command is given. If an error occurs, a ROLLBACK command restores the state of the system to that at the conclusion of the previous COMMIT command. A transaction is terminated by either a COMMIT command or a ROLLBACK command, after which other users are again allowed access to the data. A ROLLBACK command requires buffering of all operations following a COMMIT command to permit restoration of the state following that COMMIT command.
If the transaction fails or a user cancels a transaction, a ROLLBACK results in clearing the buffered operations and removing access restrictions to restore the database to its state prior to the initiation of the now failed transaction. On the other hand, a COMMIT command results in updating the database followed by clearing of the buffered operations.
Another SQL command, SAVEPOINT, enables restoring the system to an earlier defined state that need not be the state at the conclusion of the previous COMMIT command. Like the COMMIT command in the context of the ROLLBACK command, SAVEPOINT provides a prior state of the system for the ROLLBACK command. Unlike the COMMIT command, however, the SAVEPOINT command does not require changes to the database. Instead SAVEPOINT enables specification of a defined state for system restoration. In some embodiments the SAVEPOINT command specifies multiple earlier states distinguished by their respective identifiers. If desired, the system can be restored to one of the specified earlier states by executing a ROLLBACK to the specified state. If a COMMIT command is given then all buffered operations are cleared along with the states specified by the SAVEPOINT command.
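The buffering behavior described above for COMMIT, ROLLBACK and SAVEPOINT can be illustrated with a minimal sketch in C. This is a toy in-memory model and not the implementation of any particular database: the type ToyDb, the functions db_update, db_savepoint, db_rollback_to, db_rollback and db_commit, and the fixed buffer size are all hypothetical names chosen for illustration. The sketch assumes the "database" is a single integer and that each operation is an integer delta; a savepoint is represented simply by the number of operations buffered when it was taken.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PENDING 16  /* hypothetical buffer capacity */

/* Hypothetical toy "database": one committed integer plus a buffer of
   deltas that have been requested but not yet committed. */
typedef struct {
    int committed;             /* state as of the last COMMIT */
    int pending[MAX_PENDING];  /* operations buffered since then */
    size_t n_pending;
} ToyDb;

void db_init(ToyDb *db, int initial) {
    db->committed = initial;
    db->n_pending = 0;
}

/* Buffer an update; the committed state is not touched. */
void db_update(ToyDb *db, int delta) {
    assert(db->n_pending < MAX_PENDING);
    db->pending[db->n_pending++] = delta;
}

/* SAVEPOINT: the returned count identifies the current buffered state. */
size_t db_savepoint(const ToyDb *db) {
    return db->n_pending;
}

/* ROLLBACK to a savepoint: discard operations buffered after it. */
void db_rollback_to(ToyDb *db, size_t savepoint) {
    assert(savepoint <= db->n_pending);
    db->n_pending = savepoint;
}

/* ROLLBACK: discard every buffered operation. */
void db_rollback(ToyDb *db) {
    db->n_pending = 0;
}

/* COMMIT: apply the buffered operations, then clear the buffer.
   Savepoints taken before the commit no longer refer to anything. */
void db_commit(ToyDb *db) {
    for (size_t i = 0; i < db->n_pending; i++) {
        db->committed += db->pending[i];
    }
    db->n_pending = 0;
}

/* Value visible inside the transaction: committed state plus buffer. */
int db_current(const ToyDb *db) {
    int value = db->committed;
    for (size_t i = 0; i < db->n_pending; i++) {
        value += db->pending[i];
    }
    return value;
}
```

In this model, starting from a committed value of 100, buffering +10, taking a savepoint, buffering +5, rolling back to the savepoint and then committing leaves a committed value of 110. The bookkeeping in db_update is the kind of overhead that is only useful when a rollback or savepoint can actually occur.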
Implementing the SAVEPOINT or ROLLBACK commands incurs considerable overhead, since every other command that changes the database must then buffer its operations. On the other hand, it is not necessary to support buffering if the SAVEPOINT or ROLLBACK commands are not used. A typical application includes SQL statements in several files, and a compiler compiles only one file at a time. Thus, when compiling a particular file, it is not possible to decide whether buffering-related code is needed because of a statement in another file.
SQL applications written using SQL statements and functions can be combined with source code in a programming language such as C++ in Embedded SQL (“ESQL”). An ESQL application can include several source code files. The source files for an ESQL application are preprocessed by a macro-preprocessor. Typically, the macro-preprocessor generates code for the various embedded SQL statements or introduces additional statements, and a compiler then compiles the output of the macro-preprocessor. Compiling a source file generates an object module corresponding to the source file. The linker links object modules to generate the executable program.
Compiling a source code file includes several operations. A compiler parses the source code, carries out several checks to ensure conformity with the programming language specifications and then translates the parsed code to generate a lower level code such as machine code for execution on a computer. In some instances, the code is assembly or byte code that needs further translation for actual execution on a particular computer. A compiler allocates memory for each variable to properly translate source code to generate executable code. The compiler allocates memory for each variable in accordance with a “type” specification for the variable in question.
Type information is specified in a “declaration” statement. Each variable is assigned a particular type. The compiler enters the type information for each variable into a symbol table associated with an object module. When several object modules in the same executable share a variable it is important to ensure that only one module actually allocates memory for the variable. The compiler allocates memory in response to a “definition” statement for a particular variable. However, the declaration and/or definition statements are allowed to be implicit in many programming languages.
The “C” programming language permits an “extern” declaration in a source file that tells the compiler that memory for the specified variable is allocated in another file. Consequently, a C compiler only creates a variable entry in the symbol table that serves as a placeholder for the variable but leaves the actual memory allocation to another file. The variable merely points to its entry in the symbol table and is redirected to the actual memory allocation following identification of the intended memory location. Thus, there may be several declarations for a variable but there can be only one definition. No value can be assigned to a variable unless the variable is defined, because otherwise there is no memory allocated to store the value.
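The distinction between an “extern” declaration and a definition can be shown with a short C sketch. The names shared_counter and next_id are hypothetical and chosen only for illustration. In a real application the declaration and the definition would normally sit in different source files; they are placed in one file here only so that the sketch compiles on its own, which C permits.

```c
/* Declaration only: states that an int named shared_counter is defined
   somewhere in the program. No storage is allocated by this line. */
extern int shared_counter;

/* Code can refer to the variable as soon as it has been declared. */
int next_id(void) {
    return ++shared_counter;
}

/* The single definition: this is where storage is allocated and the
   initial value is set. Typically this line lives in another file. */
int shared_counter = 0;
```

If the definition were omitted from every file in the program, each file would still compile, and the missing storage would only be reported when the linker tried to resolve shared_counter.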
Following compilation, a linker links the resultant object files to generate the executable for the application. The linking may be static or dynamic. In static linking, the object files identified by the linker for the resolution of all variables are copied to generate an executable file. In contrast, dynamic linking allows fetching an object file at either load time or runtime. Consequently, the same object module can be used by several applications. Dynamic linking therefore typically results in lower memory requirements and smaller executable sizes. Furthermore, a programmer can modify and recompile a dynamically-linked module independently of other modules, thus making software maintenance easier and less expensive.
Declaring a variable with an “extern” keyword requires the linker to identify the actual memory allocated for the variable in other object modules. To this end the linker searches symbol tables associated with object modules or libraries for a module providing a definition for the variable in question. This process is termed resolving the variable. Proper resolution of a variable is required before it can actually be used in an executable file.
In software development projects, a software application is refined over the life of the project. Through the development process, concepts concerning various problems and solutions are often revised, and the functions and features of the final software application are often quite different from those at the beginning of the project. Building support for additional features into a statement of a non-procedural language, where those features are needed only for the execution of other statements, reduces the execution efficiency of programs that do not use the additional features. Adding distinct commands to provide the additional features, on the other hand, results in complex programming languages with many statements differing only in the context in which they should be used. For example, if at least one command in an SQL-based application requires buffering prior changes to a database, then the implementations of other commands affecting the database need to support buffering. Conversely, if no command requiring buffering is used in an application, then the program overhead for buffering unnecessarily slows down the application.
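The trade-off in the buffering example can be sketched in C as follows. This is only an illustration of the two code paths, not a description of how any particular system selects between them: the type Store, the field needs_buffering, and the functions store_update and store_undo are hypothetical, and the sketch assumes that whether buffering is needed is already known and recorded in a flag that is checked at runtime.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define LOG_CAPACITY 8  /* hypothetical undo-log capacity */

/* Hypothetical store holding one value. needs_buffering is assumed to be
   true only when the application uses a command that can undo changes. */
typedef struct {
    int value;
    bool needs_buffering;
    int undo_log[LOG_CAPACITY];
    size_t undo_len;
} Store;

/* Update the value. The old value is saved only when buffering is
   needed, so applications that never undo skip the extra work. */
void store_update(Store *s, int new_value) {
    if (s->needs_buffering) {
        assert(s->undo_len < LOG_CAPACITY);
        s->undo_log[s->undo_len++] = s->value;
    }
    s->value = new_value;
}

/* Undo the most recent update. Returns false when nothing was buffered,
   which is always the case if needs_buffering is false. */
bool store_undo(Store *s) {
    if (s->undo_len == 0) {
        return false;
    }
    s->value = s->undo_log[--s->undo_len];
    return true;
}
```

With needs_buffering set, each update pays for one extra store into the undo log; with it cleared, updates write the value directly and store_undo has nothing to restore. The same command name, store_update, serves both contexts.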
As a programming language evolves to develop specific commands for a particular context, developers have to learn different commands for accomplishing similar tasks rather than preserving their existing familiarity with the programming tool. Similar sounding commands that differ in subtle but significant details increase the risk that a programmer inadvertently uses the less effective command. Such errors are difficult to identify since some may only sporadically result in bugs. Therefore, it is desirable to have a system and method for providing contextually efficient implementations for a programming language command that can be invoked automatically without requiring the programmer to use different commands to invoke optimized implementations for different contexts.