The ability of intruders to hide their presence in a compromised system has surpassed the ability of the current generation of integrity monitors to detect them. Once in control of a system, intruders modify the state of constantly-changing dynamic kernel data structures to hide their privileges.
The foundation of the Trusted Computing Base (TCB) (National Computer Security Center, Department of Defense Trusted Computer System Evaluation Criteria, December 1985) used on most currently deployed computer systems is an Operating System that is large, complex, and difficult to protect. Upon penetrating a system, sophisticated intruders often tamper with the Operating System's programs and data to hide their presence from legitimate administrators and provide backdoors for easy re-entry. The Operating System kernel itself is a favored target, since a kernel modified to serve the attacker renders user-mode security programs ineffective. Many so-called “rootkits” are now available to automate this tampering. Rootkits are collections of programs that enable attackers who have gained administrative control of a host to modify the host's software, usually causing it to hide their presence.
Recent advances in defensive technologies, such as external kernel integrity monitors (D. Hollingworth and T. Redmond, Enhancing operating system resistance to information warfare. In MILCOM 2000. 21st Century Military Communications Conference Proceedings, volume 2, pages 1037-1041, Los Angeles, Calif., USA, October 2000; X. Zhang, L. van Doorn, T. Jaeger, R. Perez, and R. Sailer, Secure Coprocessor-based Intrusion Detection. In Proceedings of the Tenth ACM SIGOPS European Workshop, Saint-Emilion, France, September 2002; N. L. Petroni, T. Fraser, J. Molina, and W. A. Arbaugh, Copilot - a Coprocessor-based Kernel Runtime Integrity Monitor. In 13th USENIX Security Symposium, San Diego, Calif., August 2004) and code attestation/execution verification architectures (R. Kennell and L. H. Jamieson, Establishing the Genuinity of Remote Computer Systems. In Proceedings of the 12th USENIX Security Symposium, pages 295-310, Washington, D.C., August 2003; A. Seshadri, A. Perrig, L. van Doorn, and P. Khosla, SWATT: SoftWare-based ATTestation for Embedded Devices. In IEEE Symposium on Security and Privacy, Oakland, Calif., May 2004; A. Seshadri, M. Luk, E. Shi, A. Perrig, L. van Doorn, and P. Khosla, Pioneer: Verifying Code Integrity and Enforcing Untampered Code Execution on Legacy Systems. In Proceedings of the 20th ACM Symposium on Operating Systems Principles (SOSP), Brighton, United Kingdom, October 2005), have demonstrated their ability to detect the kinds of tampering historically performed by rootkits. However, rootkit technology has moved to a more sophisticated level. These defensive technologies have focused on the relatively straightforward task of detecting tampering in static, unchanging regions of kernel text and data structures, which were the typical targets of the previous generation of rootkits. The new rootkit generation, by contrast, has evolved more sophisticated tampering behavior that targets dynamic parts of the kernel.
Seeking to avoid detection and subsequent removal from the system, intruders can hide their processes from legitimate administrators by modifying links in the Linux and Windows XP/2000 kernels' process tables. Because the state of the process table changes continuously during kernel runtime, identifying these modified links is difficult for the current generation of kernel integrity monitoring tools, which focus only on static data. Although this targeting of dynamic data was not entirely unanticipated by researchers (X. Zhang, L. van Doorn, T. Jaeger, R. Perez, and R. Sailer, Secure Coprocessor-based Intrusion Detection. In Proceedings of the Tenth ACM SIGOPS European Workshop, Saint-Emilion, France, September 2002), there has yet to be a general approach for dealing with this threat. To be effective against the latest rootkit technology, defensive mechanisms must consider both static and dynamic kernel data, since changes in either can lead to the compromise of the whole.
The current monitoring tools disadvantageously rely upon the correctness of the monitored kernel in order to detect an intrusion, which is often an invalid assumption. After gaining full administrative control of a system, intruders may modify some of the kernel's dynamic data structures to their advantage. For example, in a GNU/Linux system, an intruder may remove tasks from the Linux kernel's all-tasks list in order to hide them from the system's legitimate administrators.
Alternatively, an intruder may modify an entry in the Linux kernel's SELinux access vector cache to temporarily elevate their privileges and disable auditing without making visible changes to the SELinux policy configuration. Neither of these techniques exposes a flaw in the Linux kernel or its SELinux security module. These examples represent the potential acts of an intruder who has already gained full control of the system, perhaps by exploiting the trust or carelessness of the system's human operators in a manner entirely outside the scope of the system's technological safeguards.
Rootkits have evolved beyond the historical methods of hiding processes, which included modifying the text of the ps program to mislead legitimate administrators or causing the kernel itself to “lie” by replacing the normally-static values of kernel text or function pointers, such as the system call vector or the jump tables in the /proc filesystem, with the addresses of malicious functions. In a “healthy” system, these values should never change. Consequently, even the most sophisticated of these threats became easy to detect by monitors that could compare the modified values against known-proper values.
However, attackers do not need to modify kernel code to hide processes within a running kernel. In fact, they do not have to rely on manipulating the control flow of the kernel in any way. Instead, adversaries have found techniques to hide their processes even from correct, unmodified kernel code. By directly manipulating the underlying data structures used for process accounting, an attacker can quickly and effectively remove any desired process from the view of standard, unmodified administrator tools. While the process remains hidden for accounting purposes, it continues to execute as normal and will remain unaffected from the perspective of the scheduler. To understand how this state is achieved, a brief overview of Linux 2.6 process management is provided in the following paragraphs.
The primary data structure for process management in the Linux kernel is the task_struct structure (R. Love, Linux Kernel Development. Novell Press, Second edition, 2005). All threads are represented by a task_struct instance within the kernel. A single-threaded process will therefore be represented internally by exactly one task_struct. Since scheduling occurs on a per-thread basis, a multi-threaded process is simply a set of task_struct objects that share certain resources such as memory regions and open files, as well as a few other properties including a common process identifier (PID), which is a unique number given to each running process on the system.
In a correctly-running system, all task_struct objects are connected in a complex set of linked lists that represent various groupings relevant to that task at a particular time. For accounting purposes, all tasks are members of a single doubly-linked list, identified by the task_struct.tasks member. This list, which is referred to as the all-tasks list, ensures that any kernel function needing access to all tasks can easily traverse the list and be sure to encounter each task exactly once. The head of the task list is the swapper process (PID 0), identified by the static symbol init_task. In order to support efficient lookup based on PID, the kernel also maintains a hash table that is keyed by PID and whose members are hash-list nodes located in the task_struct.pid structure. Only one thread per matching hash of the PID is a member of the hash table; the rest are linked in a list as part of the task_struct.pid member. Other list memberships include “parent/child” and “sibling” relationships and a set of scheduler-related lists.
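The all-tasks list described above can be illustrated with a small user-space model. The following is a minimal sketch only, not kernel code: the Task class, its next/prev fields, and the helper names list_add_tail and for_each_task are hypothetical stand-ins for task_struct, the task_struct.tasks linkage, and the kernel's list helpers.

```python
class Task:
    """Toy stand-in for task_struct; holds only the fields discussed here."""

    def __init__(self, pid, comm):
        self.pid = pid
        self.comm = comm
        # Model of the 'tasks' linkage: a circular doubly-linked list.
        # A freshly created node points at itself in both directions.
        self.next = self
        self.prev = self


def list_add_tail(new, head):
    """Insert 'new' immediately before 'head', i.e. at the tail of the list."""
    tail = head.prev
    tail.next = new
    new.prev = tail
    new.next = head
    head.prev = new


def for_each_task(head):
    """Yield every task on the circular list exactly once, starting at head."""
    task = head
    while True:
        yield task
        task = task.next
        if task is head:
            break


# The list head models the swapper process (PID 0), i.e. init_task.
init_task = Task(0, "swapper")
for pid, comm in [(1, "init"), (42, "sshd"), (97, "bash")]:
    list_add_tail(Task(pid, comm), init_task)

pids = [t.pid for t in for_each_task(init_task)]
print(pids)
```

Because the list is circular, a traversal that begins at init_task and stops on returning to it visits each linked task once, which is the property the accounting code relies on.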
Scheduling in the Linux kernel is also governed by a set of lists. Each task exists in exactly one state. For example, a task may be actively running on the processor, waiting to be run on the processor, waiting for some other event to occur (such as I/O), or waiting to be cleaned up by a “parent” process. Depending on the state of a task, that task will be a member of at least one scheduling list somewhere in the kernel. At any given time, a typical active task will either be a member of one of the many wait queues spread throughout the kernel or a member of a per-processor run queue. Tasks cannot be on both a wait queue and a run queue at the same time.
Primed with this knowledge of the internals of Linux process management, a trivial technique by which an attacker can gain the ultimate stealth for a running process is described in the following paragraphs.
FIG. 1 depicts the primary step of the attack: removing the process from the doubly-linked all-tasks list (indicated by the solid line 10 between tasks). Since this list is used for all process accounting functions, such as the readdir( ) call in the /proc filesystem, removal from this list provides all of the stealth needed by an adversary. For an attacker who has already gained access to kernel memory, making this modification is as simple as modifying two pointers per hidden process. As a secondary step to the attack, adversaries might also choose to remove their processes from the PID hash table (not shown) in order to prevent the receipt of unwanted signals.
As shown in FIG. 1, a task not present in the all-tasks list can continue to function because the set of lists used for scheduling is disjoint from the set used for accounting. The dashed line 12 shows the relationship between objects relevant to a particular processor's run queue, including tasks that are waiting to be run (or are currently running) on that processor. Even though the second depicted task is no longer present in the all-tasks list, it continues to be scheduled by the kernel. Two simple changes to dynamic data therefore result in perfect stealth for the attacker without the necessity of modification to static data or kernel text.
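The effect of the two-pointer modification can be sketched in a user-space model. This is an illustrative assumption-laden sketch rather than an actual rootkit or kernel code: the Task class, the unlink helper, and the run_queue list are hypothetical, and the run queue is modeled as an ordinary Python list that is independent of the accounting list.

```python
class Task:
    """Toy task with only a PID and the all-tasks list linkage."""

    def __init__(self, pid):
        self.pid = pid
        self.next = self
        self.prev = self


def link_tail(new, head):
    """Append 'new' to the circular all-tasks list headed by 'head'."""
    tail = head.prev
    tail.next, new.prev = new, tail
    new.next, head.prev = head, new


def unlink(task):
    """The two-pointer change: the neighbours are made to skip 'task'."""
    task.prev.next = task.next
    task.next.prev = task.prev


def visible_pids(head):
    """PIDs an accounting tool would see by walking the all-tasks list."""
    pids, task = [head.pid], head.next
    while task is not head:
        pids.append(task.pid)
        task = task.next
    return pids


init_task = Task(0)
tasks = {pid: Task(pid) for pid in (1, 42, 97)}
for t in tasks.values():
    link_tail(t, init_task)

# Scheduler view: a separate structure that is disjoint from the accounting list.
run_queue = [tasks[42], tasks[97]]

before = visible_pids(init_task)
unlink(tasks[42])                      # hide PID 42 from accounting
after = visible_pids(init_task)
still_scheduled = [t.pid for t in run_queue]

# Tasks the scheduler knows about that the accounting list no longer shows.
hidden = sorted(set(still_scheduled) - set(after))
print(before, after, still_scheduled, hidden)
```

In this model the unlinked task disappears from the accounting traversal while its run-queue membership is untouched, mirroring the disjointness of the two sets of lists shown in FIG. 1.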
When most actions occur in the kernel, some form of a capability is used to identify whether or not a principal should be given (or already has been given) access to a resource. These capabilities therefore represent a prime target for attackers wishing to elevate privilege. Changing process user identifiers (UIDs) has long been a favorite technique of attackers. Other examples include file descriptors and sockets (both implemented with the same abstraction in the kernel).
The SELinux access vector cache (AVC) provides a good example of this kind of capability and represents a potential target for an adversary seeking privilege escalation.
SELinux (P. A. Loscocco and S. D. Smalley, Integrating Flexible Support for Security Policies into the Linux Operating System. In Proceedings of the FREENIX Track: 2001 USENIX Annual Technical Conference, Boston, Mass., June 2001) is a security module for Linux kernels that implements a combination of Type Enforcement (W. E. Boebert and R. Y. Kain, A Practical Alternative to Hierarchical Integrity Policies. In Proceedings of the 8th National Computer Security Conference, pages 18-27, Gaithersburg, Md., September 1985) and Role-based (D. Ferraiolo and R. Kuhn, Role-Based Access Controls. In Proceedings of the 15th National Computer Security Conference, pages 554-563, Baltimore, Md., October 1992) mandatory access control, and is now included in some popular GNU/Linux distributions. During runtime, SELinux is responsible for enforcing numerous rules governing the behavior of processes. For example, one rule might state that the DHCP (R. Droms, Dynamic host configuration protocol. Technical report RFC 2131, Bucknell University, March 1997) client daemon can write only to those system configuration files needed to configure the network and the Domain Name Service, and to no others. By enforcing this rule, SELinux can limit the damage that a misbehaving DHCP client daemon might cause to the system's configuration files should it be compromised by an adversary (perhaps due to a buffer overflow or other flaw).
To enforce its rules, SELinux must make numerous decisions during runtime such as “Does the SELinux configuration permit this process to write this file?” or “Does it permit process A to execute program B?” Answering these questions involves some overhead. Thus SELinux includes a component called the access vector cache (AVC) to save these answers. Whenever possible, SELinux rapidly retrieves answers from the AVC, resorting to the slower method of consulting the policy configuration only on AVC misses.
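The caching behavior described above can be sketched as follows. This is a simplified illustration and not the SELinux implementation: the POLICY table, the AccessVectorCache class, its has_perm method, and the SID values are all hypothetical names and data chosen for the example.

```python
# Hypothetical policy configuration:
# (source SID, target SID, class) -> set of permitted actions.
POLICY = {
    (10, 20, "file"): {"read", "write"},
    (10, 30, "file"): {"read"},
}


class AccessVectorCache:
    """Toy cache that remembers answers to earlier access questions."""

    def __init__(self, policy):
        self.policy = policy
        self.entries = {}   # (ssid, tsid, class, action) -> bool
        self.hits = 0
        self.misses = 0

    def has_perm(self, ssid, tsid, tclass, action):
        key = (ssid, tsid, tclass, action)
        if key in self.entries:
            # Fast path: the answer was saved by an earlier query.
            self.hits += 1
        else:
            # Slow path: consult the policy configuration, then cache the answer.
            self.misses += 1
            permitted = self.policy.get((ssid, tsid, tclass), set())
            self.entries[key] = action in permitted
        return self.entries[key]


avc = AccessVectorCache(POLICY)
first = avc.has_perm(10, 20, "file", "write")    # miss, answered from policy
second = avc.has_perm(10, 20, "file", "write")   # hit, answered from cache
third = avc.has_perm(10, 30, "file", "write")    # miss, policy denies
print(first, second, third, avc.hits, avc.misses)
```

Note that in this model, once an answer is cached, the policy table is no longer consulted for that question, which is why the contents of the cache matter as much as the policy itself.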
SELinux divides all resources on a system (such as processes and files) into distinct classes and gives each class a numeric Security Identifier or “SID.” It expresses its mandatory access rules in terms of what processes with a particular SID may and may not do to resources with another SID. Consequently, at a somewhat simplified abstract level, AVC entries take the form of tuples:
<ssid, tsid, class, allowed, decided, audit-allow, audit-deny>
The ssid field is the SID of the process taking action, the tsid field is the SID of the resource the process wishes to act upon, and the class field indicates the kind of resource (file, socket, etc.). The allowed field is a bit vector indicating which actions (read, write, etc.) should be allowed and which should be denied. Only some of the allowed field bits may be valid. As an example, if the previous questions answered by SELinux have involved only the lowest-order bit, then that may be the only bit that contains a meaningful 0 or 1. SELinux may or may not fill in the other allowed field bits until a question concerning those bits is asked. To distinguish a 0 bit indicating “deny” from a 0 bit indicating “invalid,” the decided field contains a bit vector with 1 bits for all valid positions in the allowed field. The audit-allow and audit-deny fields are also bit vectors; they contain 1 bits for operations that should be logged to the system logger when respectively allowed or denied.
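The interplay of the allowed, decided, audit-allow, and audit-deny bit vectors can be illustrated with a short sketch. The bit assignments (READ, WRITE, EXECUTE), the AvcEntry class, and the query and should_audit helpers are assumptions made for this example and do not reflect the actual SELinux definitions.

```python
from dataclasses import dataclass

# Hypothetical bit positions for three actions.
READ, WRITE, EXECUTE = 0x1, 0x2, 0x4


@dataclass
class AvcEntry:
    """Simplified model of the tuple described in the text."""
    ssid: int
    tsid: int
    tclass: str
    allowed: int = 0
    decided: int = 0
    audit_allow: int = 0
    audit_deny: int = 0


def query(entry, perm):
    """Return 'allow', 'deny', or 'undecided' for one permission bit."""
    if not entry.decided & perm:
        # The corresponding bit in 'allowed' carries no meaning yet.
        return "undecided"
    return "allow" if entry.allowed & perm else "deny"


def should_audit(entry, perm):
    """Report whether the decision for 'perm' should be logged."""
    verdict = query(entry, perm)
    if verdict == "allow":
        return bool(entry.audit_allow & perm)
    if verdict == "deny":
        return bool(entry.audit_deny & perm)
    return False


# READ has been decided and allowed; WRITE has been decided and denied,
# with denials logged; EXECUTE has not been asked about yet.
entry = AvcEntry(ssid=10, tsid=20, tclass="file",
                 allowed=READ, decided=READ | WRITE, audit_deny=WRITE)

results = [query(entry, p) for p in (READ, WRITE, EXECUTE)]
print(results, should_audit(entry, WRITE), should_audit(entry, READ))
```

The WRITE and EXECUTE bits of the allowed field are both 0 in this entry; only the decided field distinguishes the former as a denial and the latter as not yet valid.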
It is conceivable that adversaries who have already gained administrative control over a system might wish to modify the SELinux configuration to give their processes elevated privileges. They could accomplish this most directly by modifying the SELinux configuration files, but such modifications would be easily detected by filesystem integrity monitors like Tripwire (G. H. Kim and E. H. Spafford, The Design and Implementation of Tripwire: a File system Integrity Checker. In Proceedings of the 2nd ACM Conference on Computer and Communications Security, pages 18-29, Fairfax, Va., November 1994).
Alternatively, they might modify the in-kernel data structures representing the SELinux configuration, which are the same data structures SELinux consults to service an AVC miss. However, these data structures change infrequently, only when administrators decide to modify their SELinux configuration during runtime. Consequently, any tampering may be discovered by a traditional kernel integrity monitor that performs hashing or makes comparisons with correct, known-good values.
The state of the AVC, on the other hand, is dynamic and difficult to predict at system configuration time. Entries come and go with the changing behavior of processes. An adversary might insert a new AVC entry or modify an old one to effectively add a new rule to the SELinux configuration. Such an entry might add extra allowed and decided field bits to grant additional privileges, or remove existing audit-allow and audit-deny field bits to turn off troublesome logging. Such an entry would override the proper in-memory and on-disk SELinux configuration for as long as it remained in the cache.
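The override described above can be sketched with a toy cache entry. The following is an illustration under stated assumptions, not SELinux code: the bit values, the policy_allowed table, the dictionary layout of the cache entry, and the helper names are hypothetical.

```python
READ, WRITE = 0x1, 0x2   # hypothetical permission bits

# Hypothetical policy: the process with SID 10 may only read files with SID 20.
policy_allowed = {(10, 20, "file"): READ}

# One cache entry, modeled as a dict of bit vectors.
cache = {
    (10, 20, "file"): {
        "allowed": READ,
        "decided": READ,
        "audit_allow": 0,
        "audit_deny": READ | WRITE,
    }
}


def policy_verdict(key, perm):
    """Answer taken directly from the policy configuration."""
    return bool(policy_allowed.get(key, 0) & perm)


def cached_verdict(key, perm):
    """Answer from the cache, filling in the bit from policy if undecided."""
    entry = cache[key]
    if not entry["decided"] & perm:
        entry["decided"] |= perm
        if policy_verdict(key, perm):
            entry["allowed"] |= perm
    return bool(entry["allowed"] & perm)


key = (10, 20, "file")
honest = cached_verdict(key, WRITE)      # policy is consulted: write denied

# Tampering: set the allowed and decided bits, clear the audit bits.
cache[key]["allowed"] |= WRITE
cache[key]["decided"] |= WRITE
cache[key]["audit_allow"] &= ~WRITE
cache[key]["audit_deny"] &= ~WRITE

tampered = cached_verdict(key, WRITE)    # cache now grants write
print(honest, tampered, policy_verdict(key, WRITE))
```

After the modification, the cached answer and the policy answer disagree, yet the policy table itself is unchanged, so a check confined to the policy configuration would observe nothing unusual.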
Current monitoring tools are limited to detecting changes in nominally static kernel data and text and cannot distinguish a valid state change from tampering in these dynamic data structures. Their methods, which are characterized by calculating hashes of static kernel data and text and comparing the results to known-good values, are not applicable to the continuously changing dynamic data structures now being targeted by rootkits.
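The limitation can be illustrated with a short sketch of a hash-and-compare check. The addresses, PID lists, and the digest helper below are hypothetical values and names chosen for the example; SHA-256 is used only as a representative hash function.

```python
import hashlib


def digest(words):
    """Hash a sequence of 64-bit values, as a monitor might hash a memory region."""
    data = b"".join(w.to_bytes(8, "little") for w in words)
    return hashlib.sha256(data).hexdigest()


# Static region: hypothetical system call table addresses.
syscall_table = [0xFFFFFFFF81000100, 0xFFFFFFFF81000200, 0xFFFFFFFF81000300]
known_good = digest(syscall_table)

# An untouched table matches the known-good value.
static_clean = digest(syscall_table) == known_good

# A hooked table (one entry redirected) no longer matches.
hooked = list(syscall_table)
hooked[1] = 0xFFFFFFFFA0001000
static_tamper_detected = digest(hooked) != known_good

# Dynamic region: PIDs reachable from the all-tasks list at some instant.
task_pids = [0, 1, 42]
baseline = digest(task_pids)

after_fork = task_pids + [43]   # legitimate change: a new process was created
after_hide = [0, 1]             # malicious change: PID 42 was unlinked

legit_change_flagged = digest(after_fork) != baseline
hide_flagged = digest(after_hide) != baseline
print(static_clean, static_tamper_detected, legit_change_flagged, hide_flagged)
```

For the static table, a mismatch is a reliable indication of tampering. For the dynamic list, both the legitimate and the malicious change produce a mismatch, so the comparison alone cannot separate a valid state change from an attack.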
There is therefore a growing need in the information security area for an advanced defensive mechanism that is capable of detecting integrity violations of both static and dynamic kernel data and that is independent of the correctness of the monitored host system.