1. Field of the Invention (Technical Field)
The present invention relates to the field of database or file accessing, and more particularly to efficiently sorting or searching an encrypted database while it remains encrypted.
2. Description of Related Art
Stored data frequently requires privacy, whether because of regulatory obligations or for business reasons. Laws such as HIPAA, as well as banking regulations, require secure treatment of personal information. Business information, such as trade secrets, must generally be kept hidden from competitors.
Currently available database systems are unable to provide adequate protection, as most of these systems store the database on disk in unencrypted form. Theft of confidential personal or business information typically occurs from unencrypted databases, often stored on an easily stolen laptop computer, or on an electronically compromised server. This exposes the owner of the stolen data to liability, as well as exposing the personal information in the database to the thief, often resulting in identity theft.
Encrypting databases can provide a superior way of preventing data loss, even if the data storage system is stolen, or the server is electronically penetrated, so long as the data remains in an encrypted form. This process, however, typically leads to operational problems, mostly related to sorting and searching the database.
Sorting and searching an encrypted database typically requires decrypting the data, either by decrypting the entire database, or decrypting some, or all, of the data “on the fly”. This requires significant computational overhead, and exposes at least some of the data in unencrypted form.
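The cost described above can be illustrated with a minimal sketch. It is not the method of the present invention. The cipher shown is a toy keystream construction built from SHA-256, chosen only so that the example runs with the Python standard library, and it is not suitable for real use. The names `keystream`, `toy_encrypt`, `sort_by_decrypting`, and the key value are hypothetical. The sketch shows that sorting encrypted records by value requires one decryption per record, with each plaintext briefly held in memory.

```python
import hashlib

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream: SHA-256 in counter mode. Illustration only, not vetted."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def toy_encrypt(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR the data with the keystream; applying it twice restores the input."""
    stream = keystream(key, nonce, len(data))
    return bytes(d ^ k for d, k in zip(data, stream))

# XOR with the same keystream is its own inverse.
toy_decrypt = toy_encrypt

def sort_by_decrypting(key: bytes, records):
    """Sort (nonce, ciphertext) records by their plaintext value.

    Each record is decrypted once, in memory, to build its sort key,
    so every plaintext is briefly exposed in RAM during the sort.
    """
    return sorted(records, key=lambda rec: toy_decrypt(key, rec[0], rec[1]))

key = b"hypothetical-key"  # assumed value, for illustration only
names = [b"mallory", b"alice", b"bob"]
records = [(bytes([i]), toy_encrypt(key, bytes([i]), name))
           for i, name in enumerate(names)]

ordered = sort_by_decrypting(key, records)
recovered = [toy_decrypt(key, nonce, ct) for nonce, ct in ordered]
```

Sorting the ciphertext bytes directly would not, in general, yield the plaintext order, which is why the decryption step cannot simply be skipped in a system of this kind.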
Those few systems which do encrypt the database typically decrypt it on disk when the first user opens the database, and encrypt it again when the last user closes the database. This leaves the data on the file system in unencrypted form for as long as the system is in use. In many on-line financial processing and banking applications this may result in data being continuously exposed in unencrypted form on external storage media, 24 hours a day, 7 days per week. Depending on the storage system, fragments of temporary files containing unencrypted data may also remain exposed on external storage until those storage fragments are re-allocated by the operating system and over-written.
If unencrypted data is stored on a disk, it is easily “readable” by a thief. If the system “crashes”, is put in “hibernate” mode, or is improperly shut down, the data on disk will remain in unencrypted form and thus will be vulnerable to theft. Operating systems using virtual memory present a further problem, since memory page files are written to disk. This exposes any unencrypted data that was held in RAM to discovery by a clever thief. It follows that the database system should keep only a minimal amount of unencrypted data in protected working memory (i.e., RAM or internal registers).
The present invention is designed to minimize the exposure of unencrypted data while keeping computational overhead to a minimum. The strategy is to keep the data encrypted as much as possible and never to store unencrypted data on external storage, i.e., disk.
Other approaches have severe limitations that render them less generally useful than the present invention. An order-preserving cryptographic algorithm is difficult, or impossible, to implement, because a cryptographic system is ordinarily required to destroy order. While some progress has been made in creating such an algorithm (see, e.g., U.S. Patent Publication No. 2005/0147240), it is not currently believed to be general enough, or of sufficient strength, to be adequate for high security applications.
The present invention does not suffer from this sort of cryptographic weakness, as the invention is implemented using a cryptographic plug-in, which may be optimized for security.
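The tension in order-preserving encryption noted above can be shown with a toy sketch. It is not the algorithm of the cited publication, nor any vetted scheme. It maps each integer in a small domain to a running sum of key-dependent positive gaps, so that a larger plaintext always receives a larger ciphertext. The function name `build_ope_table`, the key, and the sample values are hypothetical.

```python
import hashlib

def build_ope_table(key: bytes, domain_size: int) -> list:
    """Toy order-preserving map for the integers 0..domain_size-1.

    Each value maps to a running total of key-dependent gaps of at
    least 1, so the table is strictly increasing. Illustration only.
    """
    table = []
    total = 0
    for value in range(domain_size):
        digest = hashlib.sha256(key + value.to_bytes(4, "big")).digest()
        total += 1 + int.from_bytes(digest[:2], "big")  # gap in 1..65536
        table.append(total)
    return table

key = b"hypothetical-key"  # assumed value, for illustration only
table = build_ope_table(key, 100)

ages = [42, 7, 99, 18]
ciphertexts = [table[a] for a in ages]

# The ciphertexts can be sorted without any decryption step...
sorted_ciphertexts = sorted(ciphertexts)

# ...and decoding them afterwards confirms the plaintext order was kept.
decoded = [table.index(c) for c in sorted_ciphertexts]
```

Because the ciphertexts sort in the same order as the plaintexts, anyone who obtains them learns the relative order of the underlying values without the key. Leakage of this kind is one reason such schemes may be considered too weak for high security applications.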
On-the-fly encryption (OTFE) is the process of encrypting and decrypting all of the data on a disk as the data is read or written, either through hardware or firmware (in the disk subsystem) or through software (in the disk device driver).
Overhead, in either hardware or software implementations of OTFE, is much greater than that incurred with the present invention. A hardware solution is also generally more expensive than a software solution, and is often more difficult to implement in an existing system.