1. Field
Embodiments of the present invention generally relate to the field of application execution environments. More particularly, embodiments of the present invention relate to application execution environments that are highly tuned for a particular class of hardware instruction set architectures and that employ the protective features of those instruction sets to reduce security vulnerabilities.
2. Description and Shortcomings of the Related Art
The approach adopted by modern general-purpose operating systems has been to define and implement multiple levels of abstractions on top of the actual processor hardware. Such abstractions include multiple virtual memories, multiple tasks (a.k.a. processes or threads), files, sockets, interrupt handlers, semaphores, spin locks, time of day clocks, interval timers, etc.
Some of these abstractions are implemented in the kernels of the respective operating systems, which typically exercise complete control over the actual computational resources of a processor. Such kernels execute at the highest privilege level provided by the processor, enabling the programs comprised by the kernel to execute the “privileged instructions” of the processor instruction set. Operating system kernels manage the creation, scheduling, coordination, and destruction of instances of such abstractions. They also provide for appropriate handling of the entire range of synchronous and asynchronous faults, traps, aborts, and interruptions defined by the hardware processor architecture.
Control of integrated or plug-in input/output (I/O) device control adapters is implemented by programs called drivers (a.k.a. I/O drivers or Local Area Network (LAN) drivers or <device> drivers, where <device> is a particular peripheral, bus, or function name). Such drivers also are permitted to execute at the highest privilege level provided by the processor. The amount of code comprised by the drivers usually is larger than the amount of code comprised by the operating system kernels themselves.
Other elements implement abstractions built on top of the operating system kernel and I/O drivers. These include file systems, network stacks, synchronization primitives, signaling mechanisms, sockets interfaces, graphical user interfaces, and various libraries of system services. These elements combine with operating system kernels and I/O drivers to provide an interface to application programs that can be realized on many different hardware platforms.
The primary purpose in defining the multiple levels of abstraction provided by general-purpose operating systems has been to develop Application Programming Interfaces (APIs) that can be implemented across systems employing incompatible processor and platform hardware and firmware architectures. The program of defining and implementing the multiple layers of abstraction found in today's Unix, Linux, and Windows operating systems (ULW systems), which may be referred to herein as “Principal Operating Systems,” is important and has been successful in achieving portability. That result, however, has not been achieved without performance penalties and other negative effects. The two primary such effects are referred to herein as the “lowest common denominator” (LCD) effect and the “semantic mismatch” (SM) effect. The first of these effects has resulted in the inability of ULW operating systems to benefit from powerful capabilities present only on some processors. The second effect manifests either in excessive performance overheads or in system-level functional deficiencies in areas such as scalability and security.
Operating system portability, particularly in ULW systems, has in practice led to two basic categories of consensus. First, there is a broad consensus among the ULW systems as to which abstractions are supported in an API. One cannot find, for example, significant differences among the virtual memory, process-thread-task, file, network, and interruption abstractions of the ULW systems. The uniformity among APIs, of course, enables application portability. Second, there is a consensus as to which subset of hardware capabilities is supported. This subset of capabilities properly can be labeled the architectural LCD.
In the mid-1960s, with the introduction of IBM's System/360, the operating system structure based upon two hardware-enforced levels of privilege was established. The operating system kernel (at the time called the “Nucleus”) and other critical system control code executed at the high hardware privilege level. Other code, including application code, executed at the low hardware privilege level.
Although several important instruction set architectures subsequently have offered four levels of hardware privilege, as well as other advanced protective mechanisms, the ULW operating systems never have supported these features, because an operating system relying upon such features could not also run upon those processors still providing only two levels of hardware privilege. In fact, due to the hardware LCD effect, the ULW operating systems today persist in supporting basically the 1960s privilege model, with a few extensions for read, write, and execute privilege controls. The only truly significant change has been the explosive growth in the amount of code that now executes at the highest level of hardware privilege, a result neither intended nor foreseen by the IBM System/360 architects.
More powerful addressing protection capabilities, such as those offered by PA-RISC® and the Itanium® systems, remain entirely unused by ULW operating systems. And for highly secure systems, in particular, there is a compelling need to use such finer-grained memory protection capabilities, beyond those that are common to every manufacturer's processors. Support for such capabilities simply is unavailable from any of the ULW general-purpose operating systems, thereby making more difficult the construction of operating systems that can be highly secure. In ULW systems, for example, it is known to be unsafe to store cipher keys and cipher keying materials in main memory for long periods of time [1, 2], even though this can be done safely using the protection capabilities provided by the Itanium architecture in the manner described in this Application. A computer architecture that includes at least the explicit instruction level parallelism and protection capabilities of the Itanium 2 processors shall be referred to herein as a “Parallel Protected Architecture” (PPA).
[1] Niels Ferguson & Bruce Schneier, “Practical Cryptography,” Wiley, 2003.
[2] Adi Shamir & Nicko Van Someren, “Playing hide and seek with stored keys,” 22 Sep. 1998.
The first category of consensus, namely the consensus as to the abstractions provided by the ULW operating systems, resembles the hardware LCD consensus in that it also results in a collection of functional shortcomings, which may be referred to herein as the SM effect. While the generally accepted operating system abstractions are suitable for a significant and broad class of applications, they are not ideal in every case. No computing structure can be all things to all applications. But having to map all applications into the generally accepted ULW API abstractions flies in the face of this fact. In important cases, the ULW operating system abstractions prevent full use of underlying hardware performance and protection capabilities.
Some applications simply cannot work within the limitations of ULW constraints. Obvious examples are real-time applications, where the system always must respond within strict time constraints. General-purpose operating systems usually provide parameters for tuning themselves for the best responses they are able to achieve. However, they cannot always meet the requirements of stringent real-time applications. System designers have addressed such problems in various ways. Some have embedded a general-purpose operating system within an underlying real-time kernel. In this structure, the real-time kernel controls the applications that require guaranteed responsiveness, and the general-purpose operating system controls the rest. Other designers have chosen specialized real-time operating systems, and simply abandoned the attempt to use general-purpose operating systems.
Many applications can be made to function within general-purpose operating systems, but only at the cost of overheads that can substantially reduce system performance. The abstractions provided by the principal general-purpose operating systems are realized only through complexity and the expenditure of many hardware cycles. The abstractions also have been found not to be low-overhead constructs, particularly when considering scalability and security. Consequently, if an application's objectives include security, maximum possible throughput, and shortest possible response time, the consensus abstractions of general-purpose operating systems can constitute impediments to meeting these objectives.
For the most part, major ULW operating system developments have resulted in longer schedules than estimated, larger resulting code bases than expected, and slower performance than desired. Catastrophes have been avoided, however, because the concurrent progress of hardware memory sizes and processor speeds has compensated for the size and performance shortfalls of operating system software. At the same time, little attention seems to have been paid to what application performance might be were it able fully to use the hardware advances of a PPA processor without the cumulative software overheads of general-purpose operating systems.