1. Field of the Invention
This invention relates to the general field of input/output (I/O) operations within a computer, and in particular to the encryption of data transferred to an I/O device.
2. Description of the Related Art
As computers have proliferated in both offices and homes, the need for data security has increased. Large entities worry about protecting confidential information against unauthorized access, and as more and more computers, even those in homes, are constantly connected to the Internet via “always on” broadband devices, the computers of private individuals have also become vulnerable to unauthorized penetration and probing.
Firewalls are a common means of protection, but even they are not always impenetrable. Even if they were, there may still be those inside the firewall who are not authorized to access certain data stored on either the computers they work with or on some shared storage medium that they can access.
The most reliable way to protect sensitive data is to encrypt it. Accordingly, several encryption mechanisms have been proposed. One known technique requires specialized, dedicated hardware to be installed either as an interface between a hard disk controller and all sub-systems that access the disk, or as part of the disk controller itself, so that all data to be written to the disk is first encrypted in hardware. This solution, although fast, has drawbacks. Most obvious is the requirement to install additional hardware, which will often prove to be impractical, expensive, inconvenient or impossible depending on the system in which it is to be included. Another disadvantage is lack of flexibility, since the encryption device may not be able to work with all different disks or disk controllers; moreover, when encryption is implemented in hardware, it is difficult to change or upgrade components such as the encryption algorithm.
Many software-based encryption solutions are also known. A well-known problem with software-based encryption techniques, however, is that encryption and decryption impose additional software overhead, primarily the CPU cycles consumed by the computationally intensive cryptographic operations. Other sources of overhead may include memory system effects, such as additional cache or TLB (translation lookaside buffer) misses. Such overhead may reduce I/O throughput and increase I/O latencies, reducing overall system performance.
Existing software-based encryption tends to be “all or nothing,” albeit often at the level of applications or files. In many cases, however, a user does not need to encrypt all of the data stored on a disk. For example, the program code defining commodity operating systems such as Linux or Windows and common applications such as Mozilla or Microsoft Office usually does not need to be encrypted even if the data the code generates may need to be. As another example, there will in most cases be no need to encrypt the code defining a financial management program, whereas it might be highly desirable to encrypt the user-entered financial data that is stored as a result of running the program.
One way to avoid all-or-nothing encryption is to require the user to specify, while the files are not in use, which files are to be encrypted. In other words, the user opens the encryption software and specifies which files it is to operate on. The software then encrypts the selected files. Not only is this a cumbersome process, but it is not always obvious to users which files contain the sensitive data. Yet another problem with explicitly specifying files is that the granularity is poor: like a whole disk, a whole file is often too large a unit. For example, it may be desirable to encrypt only small private changes to a large database file containing mostly public data. Furthermore, sub-file granularity is difficult for end-users to specify.
An additional problem associated with encrypting non-sensitive data is that it precludes (or at least complicates) optimizations that attempt to eliminate duplicate copies by sharing a single instance of common data. This is because different users will use different encryption keys, so that data which is identical before encryption will be completely different after encryption.
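The effect on duplicate elimination can be illustrated with a minimal sketch. It is not taken from any particular product: the cipher below is a deliberately simple, insecure stand-in (a SHA-256-derived keystream XORed with the data), and the names toy_encrypt, dedup_count, alice_key and bob_key are hypothetical, chosen only for this illustration. The sketch assumes a store that detects duplicates by hashing block contents.

```python
import hashlib

def toy_encrypt(key: bytes, block: bytes) -> bytes:
    """Toy stand-in for a real cipher (NOT secure): XOR the block with a
    keystream derived from the key via SHA-256 in counter mode."""
    stream = b""
    counter = 0
    while len(stream) < len(block):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    # zip() stops at the end of the block, so extra keystream is ignored.
    return bytes(a ^ b for a, b in zip(block, stream))

def dedup_count(blocks) -> int:
    """Number of distinct blocks a content-hash-based store would keep."""
    return len({hashlib.sha256(b).hexdigest() for b in blocks})

# The same commonly available data, stored by two different users.
common_block = b"identical operating-system code block" * 16
alice_key = b"alice-secret-key"   # hypothetical per-user keys
bob_key = b"bob-secret-key"

plain_copies = [common_block, common_block]
cipher_copies = [
    toy_encrypt(alice_key, common_block),
    toy_encrypt(bob_key, common_block),
]

print(dedup_count(plain_copies))   # 1: the two copies share one instance
print(dedup_count(cipher_copies))  # 2: encryption hides the duplication
```

In this sketch the unencrypted copies collapse to a single stored instance, whereas the same data encrypted under two different keys must be stored twice.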
What is needed is therefore a software mechanism for I/O data encryption (including encryption of data to be written to a disk) that avoids incurring, on every subsequent access, the overhead associated with needlessly encrypting and decrypting “non-sensitive” data, such as the commonly available data (a term that here includes program code) associated with a fresh install. The mechanism should also either be easy for a user to control or not require user control at all. This invention provides such a mechanism in the form of a system of software modules and a method for operating them.