Obtaining a guarantee that given code has executed untampered on an untrusted legacy computing platform has been an open research challenge. We refer to this as the problem of verifiable code execution. An untrusted computing platform can tamper with code execution in at least three ways: 1) by modifying the code before invoking it; 2) by executing alternate code; or 3) by modifying execution state, such as memory or registers, while the code is running.
Verifiable Code Execution
Two techniques, Cerium [B. Chen and R. Morris. Certifying program execution with secure processors. In Proceedings of HotOS IX, 2003] and BIND [E. Shi, A. Perrig, and L. van Doorn. BIND: A fine-grained attestation service for secure distributed systems. In Proc. of the IEEE Symposium on Security and Privacy, pages 154-168, 2005], have been proposed to address the problem of verifiable code execution. Both use hardware extensions to the execution platform to provide a remote host with the guarantee of verifiable code execution. Cerium relies on a physically tamper-resistant CPU with an embedded public-private key pair and a μ-kernel that runs from the CPU cache. BIND requires that the execution platform have a TPM chip and CPU architectural enhancements similar to those found in Intel's LaGrande Technology (LT) [Intel Corp. LaGrande Technology Architectural Overview, September 2003] or AMD's Secure Execution Mode (SEM) [AMD platform for trustworthy computing. In WinHEC, September 2003] and Pacifica technology [Secure virtual machine architecture reference manual. AMD Corp., May 2005]. Unlike the present invention, neither Cerium nor BIND can be used on legacy computing platforms.
Intel's LaGrande Technology (LT) [Intel Corp. LaGrande Technology Architectural Overview, September 2003] and AMD's Secure Virtual Machine (SVM) extensions [AMD64 Architecture Programmer's Manual Volume 2: System Programming, Rev 3.11, December 2005] are also hardware-based technologies that can be used to obtain the guarantee of verifiable code execution. Unlike the present invention, however, neither of these technologies is suitable for legacy computers, since both require CPU architecture extensions and a cryptographic co-processor in the form of a Trusted Platform Module (TPM) chip.
Memory Integrity Verification
Techniques for memory integrity verification allow a remote verifier to check the memory contents of an untrusted computer to detect the presence of malicious changes. Memory integrity verification provides a strictly weaker property than verifiable code execution. The verifier can only obtain the guarantee that the code present in the memory of an untrusted computer is unmodified but cannot obtain the guarantee that the correct code will execute untampered on the untrusted computer. Prior work in the area of memory integrity verification can be classified into hardware-based and software-based approaches.
Hardware-Based Techniques.
Sailer et al. describe a “load-time attestation” technique that relies on the TPM chip standardized by the Trusted Computing Group [R. Sailer, X. Zhang, T. Jaeger, and L. van Doorn. Design and implementation of a TCG-based integrity measurement architecture. In Proceedings of USENIX Security Symposium, pages 223-238, 2004]. Their technique allows a remote verifier to verify what software was loaded into the memory of a platform. However, a malicious peripheral could overwrite code that was just loaded into memory with a DMA-write, thereby breaking the load-time attestation guarantee. Also, as discussed herein, the load-time attestation property provided by the TCG standard is no longer secure since the collision resistance property of SHA-1 has been compromised.
Terra uses a Trusted Virtual Machine Monitor (TVMM) to partition a tamper-resistant hardware platform into multiple virtual machines (VMs) that are isolated from each other [T. Garfinkel, B. Pfaff, J. Chow, M. Rosenblum, and D. Boneh. Terra: A virtual machine-based platform for trusted computing. In Proceedings of ACM Symposium on Operating Systems Principles (SOSP), 2003]. CPU-based virtualization and protection are used to isolate the TVMM from the VMs and the VMs from each other. Although the authors only discuss load-time attestation using a TPM, Terra is capable of performing run-time attestation on the software stack of any of the VMs by asking the TVMM to take integrity measurements at any time. All the properties provided by Terra are based on the assumption that the TVMM is uncompromised when it is started and that it cannot be compromised subsequently. Terra uses the load-time attestation property provided by TCG to guarantee that the TVMM is uncompromised at start-up. Since this property of TCG is compromised, none of the properties of Terra hold. Even if TCG were capable of providing the load-time attestation property, the TVMM could be compromised at run-time if there are vulnerabilities in its code.
In Copilot, Petroni et al. use an add-in card connected to the PCI bus to perform periodic integrity measurements of the in-memory Linux kernel image [N. Petroni, T. Fraser, J. Molina, and W. Arbaugh. Copilot—a coprocessor-based kernel runtime integrity monitor. In Proceedings of USENIX Security Symposium, pages 179-194, 2004]. These measurements are sent to the trusted verifier through a dedicated side channel. The verifier uses the measurements to detect unauthorized modifications to the kernel memory image. The Copilot PCI card cannot access CPU-based state such as the pointer to the page table and pointers to interrupt and exception handlers. Without access to such CPU state, it is impossible for the PCI card to determine exactly what resides in the memory region that the card measures. The adversary can exploit this lack of knowledge to hide malicious code from the PCI card. For instance, the PCI card assumes that the Linux kernel code begins at virtual address 0xc0000000, since it does not have access to the CPU register that holds the pointer to the page tables. While this assumption is generally true on 32-bit systems based on the Intel x86 processor, the adversary can place a correct kernel image starting at address 0xc0000000 while in fact running a malicious kernel from another memory location. The authors of Copilot were aware of this attack [W. Arbaugh. Personal communication, May 2005]. It is not possible to prevent this attack without access to the CPU state. Also, if the host running Copilot has an IOMMU, the adversary can re-map the addresses to perform a data substitution attack. When the PCI card tries to read a location in the kernel, the IOMMU automatically redirects the read to a location where the adversary has stored the correct copy.
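The relocation attack described above can be illustrated with a toy model. This is only a sketch under stated assumptions: physical memory is a flat byte array, and the names `monitor_measure`, `cpu_executes`, `assumed_base`, and `mapped_kernel_base` are hypothetical and do not come from Copilot. The monitor hashes whatever lies at the address it assumes, while the CPU runs code from wherever its page tables point.

```python
import hashlib

def monitor_measure(physical_memory, assumed_base, length):
    """Hash the bytes a bus-attached monitor sees at a fixed, assumed address."""
    region = bytes(physical_memory[assumed_base:assumed_base + length])
    return hashlib.sha256(region).hexdigest()

def cpu_executes(physical_memory, mapped_kernel_base, length):
    """Return the bytes the CPU actually runs, as selected by its page tables.

    mapped_kernel_base stands in for CPU state that the monitor cannot read.
    """
    return bytes(physical_memory[mapped_kernel_base:mapped_kernel_base + length])

# Hypothetical 64-byte physical memory: a correct kernel image sits where the
# monitor expects it, and a malicious kernel sits elsewhere.
good_kernel = b"GOOD-KERNEL-CODE"
evil_kernel = b"EVIL-KERNEL-CODE"
memory = bytearray(64)
memory[0:16] = good_kernel
memory[32:48] = evil_kernel

measurement = monitor_measure(memory, assumed_base=0, length=16)
running = cpu_executes(memory, mapped_kernel_base=32, length=16)
```

In this model the measurement matches the correct kernel even though the code that runs is the malicious copy, which is why a measurement taken without CPU state cannot be tied to the running kernel.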
The kernel rootkit detector we build using the present invention is able to provide properties equivalent to Copilot without the need for additional hardware. Further, because our rootkit detector has access to the CPU state, it can determine exactly which memory locations contain the kernel code and static data. This ensures that our rootkit detector measures the running kernel and not a correct copy masquerading as a running kernel.
Software-Based Techniques.
Genuinity is a technique proposed by Kennell and Jamieson that explores the problem of detecting the difference between a simulator-based computer system and an actual computer system [R. Kennell and L. Jamieson. Establishing the genuinity of remote computer systems. In Proceedings of USENIX Security Symposium, August 2003]. Genuinity relies on the premise that simulator-based program execution is bound to be slower because a simulator has to simulate the CPU architectural state in software, in addition to simulating the program execution. A special checksum function computes a checksum over memory, while incorporating different elements of the architectural state into the checksum. By the above premise, the checksum function should run slower in a simulator than on an actual CPU. While this statement is probably true when the simulator runs on an architecturally different CPU than the one it is simulating, an adversary having an architecturally similar CPU can compute the Genuinity checksum within the allotted time while maintaining all the necessary architectural state in software. As an example, in their implementation on the x86, Kennell and Jamieson propose to use special registers, called Model Specific Registers (MSRs), that hold various pieces of the architectural state, such as the cache and TLB miss counts. The MSRs can only be read and written using the special rdmsr and wrmsr instructions. We found that these instructions have a long latency (≈300 cycles). An adversary that has an x86 CPU could simulate the MSRs in software and still compute the Genuinity checksum within the allotted time, even if the CPU has a lower clock speed than what the adversary claims. Also, Shankar et al. show weaknesses in the Genuinity approach [U. Shankar, M. Chew, and J. D. Tygar. Side effects are not sufficient to authenticate software. In Proceedings of USENIX Security Symposium, pages 89-101, August 2004].
SWATT is a technique proposed by Seshadri et al. that performs attestation on embedded devices with simple CPU architectures using a software verification function [A. Seshadri, A. Perrig, L. van Doorn, and P. Khosla. SWATT: Software-based attestation for embedded devices. In Proceedings of IEEE Symposium on Security and Privacy, May 2004]. The verification function is constructed so that any attempt to tamper with it will increase its running time. However, SWATT cannot be used in systems with complex CPUs. Also, since SWATT checks the entire memory, its running time becomes prohibitive on systems with large memories.
Other prior art dealing with software-based memory integrity verification that is based on computing hash values over code relies on the untrusted computer correctly performing the hash computation. See, for example, U.S. Pat. Nos. 6,567,917 and 6,925,566. Therefore, an attacker can defeat these techniques by simply subverting the hash computation of the untrusted computer.
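The idea of a verification function whose running time grows under tampering can be sketched with a toy challenge-response model. The code below is not SWATT's actual checksum; it is a minimal illustration that assumes a simple rotate-and-XOR mix, a nonce-seeded traversal order, and an abstract per-read cost in place of a wall-clock measurement. All function names are hypothetical.

```python
import random

MASK32 = 0xFFFFFFFF

def rotl32(x):
    """Rotate a 32-bit value left by one bit."""
    return ((x << 1) | (x >> 31)) & MASK32

def run_checksum(read_byte, size, nonce):
    """Visit every address once in a nonce-seeded order.

    read_byte(addr) returns (byte, cost), where cost models the time spent.
    Returns the (checksum, total cost) pair sent back to the verifier.
    """
    order = list(range(size))
    random.Random(nonce).shuffle(order)
    value, cost = nonce & MASK32, 0
    for addr in order:
        byte, step_cost = read_byte(addr)
        value = rotl32(value) ^ byte
        cost += step_cost
    return value, cost

def honest_reader(memory):
    """A device that reads its memory directly: one cost unit per read."""
    return lambda addr: (memory[addr], 1)

def redirecting_reader(pristine_copy):
    """An attacker that redirects each read to a saved pristine copy.

    The extra cost unit models the added check-and-redirect work.
    """
    return lambda addr: (pristine_copy[addr], 2)

def verify(expected_memory, nonce, response):
    """Accept only if the checksum matches and the cost is within budget."""
    expected_value, budget = run_checksum(
        honest_reader(expected_memory), len(expected_memory), nonce)
    value, cost = response
    return value == expected_value and cost <= budget
```

In this model a device with modified memory either returns a wrong checksum or, if it redirects reads to a pristine copy, returns the right checksum too slowly. The verifier therefore has to check both the value and the elapsed time, which is also why the approach depends on predictable execution timing.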
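The weakness of trusting the untrusted computer to hash its own code can be shown in a few lines. This is a hypothetical sketch, not the method of the cited patents: `SubvertedHost`, `honest_report`, and `verifier_accepts` are illustrative names, and SHA-256 is chosen only for convenience.

```python
import hashlib

def honest_report(code: bytes) -> str:
    """Hash the code that is actually present, as an honest host would."""
    return hashlib.sha256(code).hexdigest()

class SubvertedHost:
    """A host that runs modified code but hashes a saved copy of the original."""

    def __init__(self, original: bytes, modified: bytes):
        self.running = modified   # what actually executes
        self._saved = original    # kept only to answer hash requests

    def report(self) -> str:
        return hashlib.sha256(self._saved).hexdigest()

def verifier_accepts(expected_code: bytes, reported_hash: str) -> bool:
    """The verifier can only compare the reported hash with the expected one."""
    return reported_hash == hashlib.sha256(expected_code).hexdigest()

original = b"trusted code"
modified = b"trusted code + rootkit"
host = SubvertedHost(original, modified)
accepted = verifier_accepts(original, host.report())
```

Here the verifier accepts the report although the host is running modified code, because nothing binds the reported hash to the code that executes.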
Software Tamperproofing
Software tamperproofing is based on constructing self-checksumming code, that is, code that computes checksums over its own instruction sequence. See, for example, U.S. Pat. No. 6,006,328. The claim is that doing so allows a piece of code to check its own integrity as it executes, without relying on a remote verifier. However, this claim is incorrect since all software tamperproofing techniques in existence today are vulnerable to two attacks: the Split-TLB Attack and virtualization-based attacks. The Split-TLB Attack utilizes the existence of separate instruction and data translation look-aside buffers (TLBs) in modern CPUs [G. Wurster, P. van Oorschot, and A. Somayaji. A generic attack on checksumming-based software tamper resistance. In Proceedings of IEEE Symposium on Security and Privacy, May 2005]. The attack desynchronizes the instruction and data TLBs so that the same virtual address translates to one physical address when used as an instruction pointer and to a different physical address when used as a data pointer. Therefore, the Split-TLB Attack ensures that self-checksumming code computes checksums not over its own instructions but over a different copy of these instructions stored elsewhere in memory.
It is also possible to circumvent self-checksumming code by running the code inside a virtual machine (VM) hosted on top of a malicious virtual machine monitor (VMM). The VMM can undetectably interpose itself into the execution of the self-checksumming code to ensure that the checksums computed by the code will be correct even though the code has been modified.
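The effect of desynchronized instruction and data TLBs can be modeled in a few lines. This is a toy sketch, not the attack code from the cited paper: the `SplitTlbMachine` class, its dictionary-based TLBs, and the additive checksum are assumptions made purely for illustration.

```python
class SplitTlbMachine:
    """Toy machine with separate instruction and data address translations."""

    def __init__(self, physical_pages):
        self.physical = physical_pages  # physical page number -> bytes
        self.itlb = {}                  # virtual page -> physical page (fetches)
        self.dtlb = {}                  # virtual page -> physical page (loads)

    def fetch(self, vpage):
        """Bytes the CPU executes when jumping to this virtual page."""
        return self.physical[self.itlb[vpage]]

    def load(self, vpage):
        """Bytes the code sees when reading this virtual page as data."""
        return self.physical[self.dtlb[vpage]]

def self_checksum(machine, vpage):
    """Self-checksumming code reads its own page through data accesses."""
    return sum(machine.load(vpage)) & 0xFFFF

original = b"ORIGINAL-CODE"
modified = b"MODIFIED-CODE"
machine = SplitTlbMachine({0: original, 1: modified})

# The attacker desynchronizes the two translations for virtual page 0.
machine.dtlb[0] = 0   # data reads see the original copy
machine.itlb[0] = 1   # instruction fetches run the modified copy

reported = self_checksum(machine, 0)
executed = machine.fetch(0)
```

In this model the checksum equals the one expected for the original code, while the bytes that execute are the modified ones, so the self-check passes without reflecting what actually runs.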
Accordingly, there is a need for improved methods, apparatuses, and systems for verifying code integrity and guaranteeing execution of code on untrusted computer platforms. Those and other advantages of the present invention will be described in more detail hereinbelow.