Computer database servers are computers with centralized database storage arranged so that a number of users at remote client computers can simultaneously access the data in the database. The database server contains the necessary logic to ensure that the data does not become corrupted when two or more clients write to the database at the same time. One possible source of data corruption on a database server arises when two or more client users attempt to change the same set of data simultaneously. Without some sort of control mechanism, such access can easily corrupt the data.
Most database servers are based on database languages such as SQL. SQL was originally defined as a U.S. national standard by ANSI X3H2 in 1986[1]. Since that time, a number of revisions of standardized SQL have emerged. In addition, a large number of vendors have released database products based on standardized SQL or some variant of that language.
Database languages generally differ from traditional computer programming languages in that database languages provide language constructs to describe and manipulate data. Traditional programming language are typically much broader in scope and provide language constructs to display user data on a client screen and receive input from a keyboard, mouse, or other input device and in general manage the user interface. In addition, traditional computer programming languages contain language constructs to read and write data into storage devices such as computer file systems, tape drives, CD-ROM drives, and so forth. Examples of traditional programming languages include FORTRAN, BASIC, C, C++, FORTH, PASCAL, LISP, ADA, MODULA II, and assembly language for a large number of different computer systems. Database languages such as SQL, on the other hand, limit their language constructs to data definition, creation and manipulation, and specifically avoid providing constructs for user interface programming.
When personal computers were introduced in the late 1970's and early 1980's, a family of programming languages was introduced that contained the features of traditional programming languages such as C, FORTRAN, BASIC and PASCAL to manage user interfaces. These programming languages also included language constructs to provide database-like creation and manipulation commands based on computer files of the underlying operating system. Ashton Tate's dBASE is an early example of such a hybrid language. dBASE is a registered trademark of Borland International, a successor to Ashton-Tate. dBASE evolved into a generalized language known generically as Xbase. Xbase has been the subject of an ANSI standardization effort under the direction of ANSI X3J9 [2]. It is instructive to note that ANSI classifies SQL as a database language, contrasted with Xbase which is classified a programming language.
Computer languages and programming environments based on Xbase and its derivatives have proven very popular and are well suited for single computer applications as shown in FIG. 1. In such a system, an Xbase application program 1 is compiled against a traditional Xbase language environment 3 such as Clipper or dBASE. The resulting program is executed on the single personal computer 9, making changes to the Xbase file set 5 of the underlying DOS file system 7. Single computer environments avoid the problems introduced when multiple users are demanding simultaneous access to a common set of data.
However, it is a widely held belief that Xbase is inherently different from a database and is not suitable for true, mission-critical database processing for corporate data. Microsoft, a major computer software supplier, provides both Xbase products and SQL products, and instructs its customers that Xbase is suitable for local file manipulation, but that SQL should be used for remote database access.[3] Database experts have long held that Xbase is not a database language and should not be used for multi-user database applications. SQL is often described as a query language, while Xbase has a large number of non-query language constructs.
There are major architectural differences between Xbase and SQL. Xbase is based on known file formats, typically the so-called .DBF file format, and number of related index file formats. Xbase applications know and directly manage data files and associated index files. In SQL, the underlying file system and indexes are hidden from the application program which instead references logical database tables rather than physical operating system files. This distinction requires entirely different programming models for application programs.
Xbase and SQL differences are troublesome for information clients who desire to scan information that spans both Xbase data file sets and SQL databases. With the explosion of computer networking in general, and the internet in particular, information clients need to access data where it is found and in its native format. This is a different paradigm from a few years ago when it was customary to copy data sets of interest into the native format for a particular SQL system. Microsoft and other software vendors have addressed the need to span SQL and Xbase data by a single application by providing an programming language interface called ODBC [11] which allows an application program to access both Xbase data and SQL data from a single application program. Microsoft Access is an example of an ODBC product that supports ODBC. ODBC and related technologies provide a programming environment where differences between Xbase data file set programming and SQL database programming are minimized, but not entirely eliminated. One difference not eliminated by ODBC and related technologies is differences in transaction behavior for Xbase and SQL. ODBC supports transaction technologies only if the underlying system--SQL or Xbase--supports it. This means for traditional Xbase, there is no true transaction support.
With the introduction of computer networks for personal computers, the concept of a file server was deployed in which multiple clients can access generic computer files on a centralized server. Such a server protects multiple users from corrupting computer files. Traditional Xbase, running on a file server, is shown in FIG. 2. In FIG. 2, there are least three computers denoted by 25, 25' and 18. 25 and 25' denote client computers operating against file server computer 18. Note that although only two clients are shown, there can be an arbitrarily large number of clients. In FIG. 2, Xbase application programs 1 and 1' are compiled or interpreted by their respective traditional Xbase language environment 3 or 3' which translates Xbase commands into file system calls against the DOS file system API 11 and 11'. Transparent to application 1 and 1' the DOS file system API function calls 11 are converted into file system commands for the file system 17. The commands are delivered to the server computer via the network drivers 13, 13' and 15. In this system, the Xbase application programs 1 and 1' can run simultaneously, and the file system 17 will protect individual files in the Xbase file set 5 from corruption. However, a file server 18 cannot protect database corruption. Database corruption can occur even though the underlying files in the Xbase file set 5 that contain the database data are not individually corrupted. Database corruption is intolerable in a serious business application for obvious reasons.
To understand how a data files using the system of FIG. 2 can be corrupted even when the underlying files are individually not corrupted, the concept of a data transaction is helpful. By way of a highly simplified illustration, suppose a bank customer request that $10,000 be transferred from a savings account to a checking account. Suppose, further, that the bank maintains the checking account information in file C on its file system and its savings account information on file S. In order to complete the transfer, the following steps must take place:
Now suppose that when this banking transfer is being made, the computer system successful completes step 1, but before step 2 initiates, the computer system goes down for some reason. In this situation, both files C and S, as files, are not corrupted (i.e. both files contain meaningful data). However, the system as a whole is corrupted, because the customer now has a windfall gain of $10,000 and the banks data set is corrupted because it shows $10,000 more in its accounts than it actually has. Neither file C or S, standing alone, is corrupted; each is correct depending on whether the transfer is considered completed or not. As a set, however, the files are corrupted. Thus in such a situation, the database is corrupted, even though the underlying files are not. In database terminology, the database integrity is violated, even though file system integrity is preserved. Switching the order of steps does not solve the database corruption problem; it just changes the details of the corruption.
To address the form of database corruption illustrated by the simple banking illustration, the concept of a transaction has been introduced and has been widely known for many years. Transactions are an inherent feature of database languages such as SQL. In a transaction based system, steps 1 and 2 are combined by the application program into a single unit of work called a transaction. A simple rule but fundamental rule of transaction processing says that all steps of a transaction must be completed or none of the steps of the transaction must be completed. If a transaction is fully completed and reflected in the database, the transaction is said to be "committed". If the transaction does not complete, and any partially completed steps are reversed, the transaction is said to be "rolled back." Thus in our banking example, when the computer systems failed after completing step 1 but not step 2, at some point the computer system is restored to an operational state and partially finished transaction must either be committed or rolled back; generally the latter. Thus if transaction processing is in place, after such a failure, a rollback would be issued either by the administrator or automatically. The rollback would effectively undo step 1 by subtracting the $10,000 from account C, and the entire transaction could be reissued to complete the requested transfer.
Transaction processing is considered to be one of the most important distinctions that separates traditional programming languages from database languages. Transactions are well-defined in SQL but are not well defined in traditional Xbase. Xbase critics have long maintained that proper transaction semantics cannot be implemented in Xbase.
Xbase database servers such as the Advantage I database server have been introduced.[4] These database servers reside on the server computer and add performance benefits to file servers used in an Xbase environment. Unfortunately servers such as the Advantage I do not offer a solution to the database corruption described earlier.
Other products, such as ObjectDB[5] attempt to add commit/rollback transactional processing to a multi-user programming environment by including transactional logic on each client as shown in FIG. 3. FIG. 3 is similar to FIG. 2 except client transaction management means 19 is added. Also, the Xbase language environment 21 is enhanced over the traditional Xbase language environment 3 in that the Xbase language environment 21 provides a mechanism to define transactions by providing transactional programming commands to the user environment such as BEGIN TRANSACTION and COMMIT TRANSACTION and ROLLBACK TRANSACTION. With the Xbase language environment 21, programmers can explicitly identify transactions in the Xbase application program 1.
Although the client-based transaction management 19 and 19' illustrated in FIG. 3 appears to add transactional processing to Xbase, it suffers from a number of problems. Since there is no central point of control in the system illustrated in FIG. 3, each client 25 and 25' may modify any file in the Xbase file set 5 at will on the server 18, potentially corrupting the Xbase file set 5. If any one client 25 or 25' fails, then part or all of the transaction management is subject to failure, leading to database corruption. This problem is particularly acute in a modern PC based network, where client computers 25 and 25' are often powered down or reset by their users. Such users are often unaware that such action is corrupting the Xbase file set. In other words, a user who simply turns his or her desktop off at night before going home from work can inadvertently corrupt the Xbase file set 5 when using the client transaction management means 19 and 19' of FIG. 3. In such as system there is no central authority to detect the failure and "clean-up" any unfinished work. When a catastrophic failure of a client 25 or 25' occurs--i.e. the client machine is turned off without exiting an application program--each application that was in the process of updating the database when the failure occurred must be identified, and instructed to "clean up" the failed database. In the client-based transaction management system of FIG. 3, since each remaining client 25 and 25' generally has no knowledge of what other clients 25 and 25' were doing or that they even existed, it is difficult or impossible for a surviving client 25 or 25' to clean up after a failed peer client 25 or 25.
Furthermore, even if the failed client is eventually restored and determines that a failure occured, under the client-base transaction management approach of FIG. 3, since record locks are generally removed when a client fails, there is no mechanism to stop another client from writing to the records involved in the partially finished but failed transaction before the failed client gets a chance to undue the partial transaction. Furthermore, any client 25 or 25' that fails to behave properly either due to a unintentional programming bug, or an operator error or a power down can easily corrupt the Xbase file set 5.
What is needed is an Xbase programming environment that properly implements transaction management and prevents Xbase file corruption due to either client errors or other client failures.