(1) Field of the Invention
The present invention relates generally to the transfer of data between computing systems and more specifically to maintaining the integrity of those transfers.
(2) Background Art
The transfer of data between different computing systems may be critical to the correct operation of those systems. For example, an electronic funds transfer from a checking account to a savings account requires two data transfer steps. The first step reduces the balance in the checking account, and the second step passes the amount to the savings account for addition. If the computer system fails and the transaction is interrupted between these steps, the amount has been deducted from the checking account but never added to the savings account, and the funds are lost.
Potential failures include “dirty reads” and “dirty writes”. Dirty reads occur when read and write operations occur simultaneously on the same data. If a write operation is executed during a read operation, the data read can incorrectly be a mixture of the data that existed before and after the write. Dirty writes occur when a write operation is not completed, potentially leaving data with an incorrect mixture of new and old values.
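The following is a minimal, deterministic sketch of the dirty read described above, written in Python purely for illustration. The record layout and the names write_in_steps, read, and demo are hypothetical and do not come from any particular system. The writer updates a two-field record one field at a time, and a read that lands between the two steps observes a mixture of old and new values.

```python
def write_in_steps(record, new_value):
    """Update each field in turn, pausing after every step.

    Each pause marks a point where another operation could interleave.
    """
    for key in record:
        record[key] = new_value
        yield


def read(record):
    """Return a snapshot of the record as it looks right now."""
    return dict(record)


def demo():
    """Read once in the middle of a write and once after it completes."""
    record = {"first": "old", "second": "old"}
    writer = write_in_steps(record, "new")

    next(writer)          # only the first field has been written
    dirty = read(record)  # read occurs while the write is in progress

    for _ in writer:      # let the write run to completion
        pass
    clean = read(record)
    return dirty, clean
```

If the writer in this sketch were abandoned after the first step instead of being run to completion, the record itself would be left holding the same mixture, which corresponds to the dirty write described above.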
“ACID” protocols can be used to ensure the integrity of transactions and to avoid these potential failures. ACID is an acronym that stands for Atomicity, Consistency, Isolation, and Durability. Atomicity is achieved by ensuring that multi-step processes are executed as a single unit. In other words, no single step is committed until all steps are completed. Durability is achieved by tracking failures and, if a failure does occur, by providing the ability to roll the system back to a previous state. Isolation is achieved by requiring that read and write operations be complete before their effect can be seen by the rest of the system. Consistency implies that operations on data generate consistent and reproducible results.
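The funds transfer example and the atomicity property described above can be sketched with a small relational database. The following uses the sqlite3 module from the Python standard library only as a convenient stand-in for a data system. The table layout, the account names, and the function names open_bank, balance, and transfer are assumptions made for illustration, and the failure between the two steps is simulated by raising an exception.

```python
import sqlite3


def open_bank():
    """Create an in-memory database holding two hypothetical accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)",
        [("checking", 100), ("savings", 50)],
    )
    conn.commit()
    return conn


def balance(conn, name):
    """Return the current balance of the named account."""
    row = conn.execute(
        "SELECT balance FROM accounts WHERE name = ?", (name,)
    ).fetchone()
    return row[0]


def transfer(conn, src, dst, amount, fail_between_steps=False):
    """Move an amount from src to dst as a single transaction.

    Returns True if both steps were committed, False if the
    transaction was rolled back.
    """
    try:
        # Used as a context manager, the connection commits when the
        # block succeeds and rolls back when the block raises.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            if fail_between_steps:
                # Simulated interruption between the two steps.
                raise RuntimeError("simulated failure between steps")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except RuntimeError:
        return False
```

In this sketch, an interrupted transfer leaves both balances unchanged, because the deduction from the first account is never committed without the matching addition to the second.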
The ACID protocols can be implemented to varying degrees. Adherence to the strictest protocols creates significant overhead and can appreciably slow transactions. Systems that require rapid data transfer rates, therefore, cannot take full advantage of ACID protocols. However, a subset of the ACID protocols can still ensure the integrity of certain types of transactions.
“Data systems” are software programs designed to process and store digital data. All data systems discussed in this document are considered to ensure the integrity of data, such as by using ACID protocols. Data systems include relational database systems that are available from companies such as Oracle, IBM, Microsoft, and Informix.
In addition to atomicity, consistency, isolation, and durability, the advantages of commercially available data systems can include logging, versioning, access control, and simple metadata manipulation. Disadvantages, however, include reduced transaction speeds, cumbersome access interfaces (such as ODBC and SQL queries), and a lack of important text handling utilities in most commercial systems.
In contrast, “File Systems,” such as those associated with UNIX®, Windows NT®, OS/2® and Linux®, usually do not follow stringent ACID protocols. For example, although files may be locked to prevent simultaneous reads and writes (isolation), there is normally no method for preventing dirty writes (durability), a problem that is exacerbated by the lack of a rollback recovery system. Additionally, many file systems first store data in volatile memory before writing it to long-term storage. Until that write completes, the data is only as safe as the volatile memory that holds it. File systems also lack simple means for associating metadata with files and for logging operations.
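As an illustration of the incomplete-write and volatile-memory issues described above, the following Python sketch shows one widely used application-level workaround. It is not a feature of the file systems named above. The function name atomic_write is hypothetical. The sketch writes the new contents to a temporary file in the same directory, asks the operating system to flush that file to storage, and only then renames it over the target, so that a reader should see either the old contents or the new contents and not a mixture.

```python
import os
import tempfile


def atomic_write(path, data):
    """Replace the file at path with data (bytes) in a single rename step."""
    directory = os.path.dirname(os.path.abspath(path))
    # The temporary file lives in the target directory so that the
    # final rename stays on one file system.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()             # push Python's buffer to the OS
            os.fsync(tmp.fileno())  # ask the OS to write to storage
        # Swap the finished file into place under the target name.
        os.replace(tmp_path, path)
    except BaseException:
        # On any failure, discard the partial temporary file and
        # leave the original target untouched.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise
```

The actual guarantees of this approach depend on the operating system, the file system, and the storage hardware. It provides no logging or rollback, so it addresses only part of the gap between file systems and data systems.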
File systems are specifically designed for file management and include storage structures appropriate for standard storage devices. Since they are typically integrated into a computer's operating system, they can have direct control of device drivers and are, therefore, optimized for rapid data transfer. File handling utilities, such as text search and virus filters, are also readily available. Data can be logically distributed over directories, physical storage devices, or computer systems and is stored in a manner that permits easy access and manipulation by other software. A file system may include a multitude of physical storage locations and devices.
The differences between data systems and file systems are especially significant for communication between computers and external networks. The structure of a network greatly increases the probability that a process will be interrupted, an issue that has grown in importance with the development of large computer networks such as the Internet.
When data are transferred between computing devices, they are typically received by either data system software or file system software. Developers must choose between the two, accepting the advantages and disadvantages of each.
There is a significant need for a system that overcomes the disadvantages of the prior art.