Concise references to prior art are tabulated and set forth in the “References” section below. For ease of readability, the system disclosed herein is hereinafter referred to as “Hard Object”.
Engineers who build machines made of atoms (rather than of software) rely on locality of causality to make machines mostly safe in the presence of failure or attacks: cars have a firewall between the engine and the driver; houses have walls and a lockable door between the inside and the outside. However, computer hardware engineers have worked very hard to eliminate all locality of causality within a computer: that is, on a modern computer, within any given process, any instruction can access any data in the entire address space of the process. Hardware engineers did this because giving the software engineers freedom to use any instruction to access any data makes it very easy to write programs that do what you really want; however having this much freedom also makes it very easy to write programs that do what you really do not want. Although software engineers separate programs into modules (code that exclusively maintains the invariants of its data), they lack appropriate fine-grain hardware primitives with which to efficiently implement enforcement of this separation. This state of affairs contributes to the problem that “machines made of software” (programs) tend to be much less reliable than machines made of atoms.
Software Correctness Generally
The punishing exactitude and overwhelming complexity of computer programs make the task of writing correct software almost impossible. Further, the stakes are high: we need only cite the title of a 2002 NIST study: “Software Errors Cost U.S. Economy $59.5 Billion Annually: NIST Assesses Technical Needs of Industry to Improve Software-Testing.” Due to software bugs (a) organized crime controls millions of computers, (b) large infrastructural projects are delayed or fail, and (c) people even die. The problem is that one can never do enough testing to ensure program correctness—something else is badly wanted.
Programmers follow certain disciplines designed to reduce mistakes, a common one being “modularity”—a software embodiment of locality of causality mentioned above: programs are separated into parts called “modules” where each module has its own data together with code to manage it. Further, to ensure correctness, the module's code is written in such a way as to maintain certain “data invariants”: properties of the module data which are always true. Some modules manage multiple instances of their data's state, each instance sometimes called an “object” and the module the “class” of the object. While this modularity discipline works well, current computer hardware systems do not protect a module within a program from any possibly errant or malicious behavior of other modules that may violate the module's boundaries; see FIGS. 1 and 2 for examples of one module 017 attacking the data of another 012. Therefore all modules are vulnerable to the threat of a single mistake, or a deliberate attack, from any one module: the correctness of the whole program is extremely brittle.
Modern Computers Generally
Modern microprocessors are organized in a fairly standard way. A very readable and thorough reference on this topic is Randal E. Bryant and David R. O'Hallaron “Computer Systems: A Programmer's Perspective” Prentice Hall 2003. At a high level of abstraction, a single-core microprocessor consists of a central processing unit 031, a random access memory 035, and peripherals 187 189 18A 18B.
The “central processing unit” (CPU) performs one of a fixed set of actions one after another according to the instructions of a program, much as a very reliable, tireless, obedient, and utterly unimaginative person might theoretically follow a detailed set of instructions. The CPU has a small amount of scratch space called “registers”; typically there are on the order of 100 or fewer registers to a CPU.
The “random access memory” (RAM) is a passive device which maps (1) an “address” to (2) a “datum” stored in a cell at that address, much as cubbyholes on a wall map each cubbyhole's number to the cubbyhole's contents. The CPU may either (1) write information to or (2) read information from a memory cell at a given address. While RAM size is also fixed, it is typically on the order of a billion (1 Gigabyte) cells.
The computer's CPU/RAM core is also connected to “peripherals”: external devices enabling interaction with the outside world, such as disk drives, displays, keyboards, mice, etc. To allow a program to interact with these devices, the hardware has either (1) special instructions for sending data to or receiving data from them, or (2) “memory-mapped I/O”: special RAM cells repurposed by the hardware such that writing or reading from these cells interacts with the device (rather than storing the data, as RAM cells would usually do).
A computer is typically designed to move several bits around together in a block, often called a “word”. A computer is characterized by the number of bits in its word, its “word size”, much as an engine is characterized by the total volume of its cylinders. Typical modern computers have 32-bit or 64-bit words. For specificity we speak of a 32-bit machine but nothing prevents the same ideas from application to machines of other word sizes.
Software
Information stored in RAM cells can be interpreted as either “data” or as “program”, as follows. There is one special CPU register called the “program counter” (PC) which contains an index into RAM where the next instruction to be followed by the CPU is held. The operation of the computer typically works as follows to “execute” a program:
(1) load the contents of the RAM cell pointed to by the PC,
(2) follow that instruction,
(3) increment the PC (unless the instruction set it to a new value),
(4) repeat.
Instructions are typically of one of the following kinds: (a) a data “access”, which is either a “read” (or “load”) of data from RAM into a CPU register, or a “write” (or “store”) of data from a CPU register into RAM, (b) a logical, fixed-point-arithmetic, or floating-point-arithmetic operation on two registers, or (c) a “branch” which sets the PC to a new value, sometimes only if a certain register has a certain value.
Writing and maintaining programs at the low abstraction level of these very small steps tends to be tedious, error prone, and mind-numbing. Therefore, programs are typically written in higher-level “programming languages” providing more useful constructs with which to construct programs. One of the most useful constructs is the “function”: a re-usable sub-program; a function has an “interface” specifying the format and meaning of data “argument(s)” passed as input and “return value(s)” obtained as output. Programs written in these higher-level languages are translated into executable machine instructions by a special program called a “compiler”.
Multi-Processing and the Kernel
A “multi-processing” computer can run more than one program at once, where each instance of a running program is called a “process”. Special software called the “kernel” runs in a special CPU mode called “kernel mode” which gives it extra powers over normal “user mode”. The kernel uses these powers to manage processes, such as putting them to “sleep” when a resource is requested and “waking” them up again when that resource is available.
Much like a city government, the kernel (mayor) coordinates with further special “software libraries” and “utility programs” (public servants) to: (a) provide commonly-needed but often messy utility services for the processes (citizens), such as interfacing to a particular kind of disk drive, and (b) protect the processes from each other (more on this below). Taken together the kernel and these utility libraries and programs are called the “operating system” (OS) (the city government in our metaphor). Users ask for services using a special hardware instruction called a “system call” or “kernel crossing”.
Whereas the kernel, just like a government, is the only agent with the power to take certain special actions, the kernel can take actions at the request of user processes if it determines that the user is allowed to take the action. That is, the hardware will allow certain operations only when in kernel mode, however these operations may be “wrapped” with a system call to allow the user to request the kernel to do the operation.
Further it is important to note that, just as in real life, asking the government to do something for you is slow; that is, for a user program to do a system call/kernel crossing is much slower than for a user function to simply call another user function. Therefore reducing the number of kernel calls in a program is an important efficiency concern.
Memory Management Generally
A program that needs only a fixed amount of memory during its run can allocate all of that state in one place at the start; such state is called “global” state (it is globally associated with the whole program) and is the first of three separate parts into which a process's memory is organized.
A particular function of a program needs its own local memory, called its “frame”. A “caller” function, ƒ, may invoke a “callee” function, g, to solve a sub-problem; during the execution of g, the execution of ƒ is suspended. The frame of memory for the execution of g is allocated immediately below (typically) that of ƒ, and when g is done, the memory that was g's frame may be re-used by a later call. That is, the frames “push” on and “pop” off, like a stack of plates, and so this second part of memory is called the “stack” 200. Note that since each function call has its own frame, a function ƒ may even call itself and the operation of the two instances of ƒ do not mutually interfere.
Sometimes a program requires “long term” data structures that also do not fit into the fixed-sized global state. The heap is managed by a system “memory allocator” library to which a program make make a request to have a specific amount of contiguous addresses or “space” reserved or “allocated” for a particular use. The library finds some available unused space and returns its initial address called a “pointer to” the space. Once in use for a specific purpose the space is typically called an “object”. When an object is no longer needed it can be “deleted” or “freed” for re-use by making a different call to the same memory allocator library. This third part of memory where such objects are allocated and freed has no simple organizational structure and is called the “heap” 010.
Virtual Memory
A problem arises in that there is sometimes not enough physical memory to store all of the data of all of the running processes. The usual solution is a scheme called “virtual memory”. Quoting [BO-2003, section 10.1 “Physical and Virtual Addressing”] (Note that any and all editing is in square brackets; emphasis of non-square-bracket text is in the original):                [M]odern processors designed for general-purpose computing use a form of addressing known as virtual addressing. (See figure [3a] [which is a copy of [BO-2003, FIG. 10.2]]. With virtual addressing, the CPU accesses main memory by generating a virtual address (VA), which is converted to the appropriate physical address before being sent to the memory. The task of converting a virtual address to a physical one is known as address translation . . . . Dedicated hardware on the CPU chip called the memory management unit (MMU) translates virtual addresses on the fly, using a look-up table stored in main memory whose contents are managed by the operating system.The Memory Hierarchy        
Thus the MMU 033, in cooperation with the operating system, stores some of the data from virtual RAM on physical RAM 035 and the rest on an external disk drive 046. Any process requesting access to data that is actually on disk is paused, the data is brought in (often requiring other data to be sent out), and then the process re-started. To support this feature, memory is grouped into “pages” that are moved in and out as a whole. Pages may be of different sizes, but in current practice 4-kilobytes is typical and for specificity we speak of this as the page size, though other sizes will work. The external device that stores the pages that are not in RAM is called the “swap” device 185.
We can see at this point that there are many kinds of memory, some with fast access and small capacity, some with slow access and large capacity, and combinations in between. These kinds of memory are arranged in “layers”, the fast/small layers used when possible and the slow/large layers used when necessary, as follows. (1) Most CPU instructions use CPU registers, access to which is very fast. (2) When the registers are full, the program resorts to using RAM, which is slower, but much larger. RAM actually has at least two layers: (2.1) small amounts of fast memory where frequently-used RAM address/data pairs are stored called the “cache”, and (2.2) normal RAM. Moving data between the cache and RAM is handled by the hardware. (3) As described above, when RAM is full, the operating system resorts to using a swap disk, which has huge capacity but is far slower still. This whole system is called the “memory hierarchy”.
Page Tables and Page Meta-Data
The MMU and/or OS clearly must track which virtual pages map to which physical pages or disk blocks. That is, for each page of data, further “meta-data” (data about data) is kept. Quoting [BO-2003, section 10.3.2 “Page Tables”]:                FIG. [4] [which is a copy of [BO-2003, FIG. 10.4]] shows the basic organization of a page table. A page table is an array of page table entries (PTEs). Each page in the virtual address space has a PTE at a fixed offset in the page table. For our purposes, we will assume that each PTE consists of a valid bit and an n-bit address field. The valid bit indicates whether the virtual page is currently cached in DRAM. If the valid bit is set, the address field indicates the start of the corresponding physical page in DRAM where the virtual page is cached. If the valid bit is not set, then a null address indicates that the virtual page has not yet been allocated. Otherwise, the address points to the start of the virtual page on disk.        The example in FIG. [4] shows a page table for a system with 8 virtual pages and 4 physical pages. Two virtual pages (VP 1, VP2, VP4, and VP7) are currently cached in DRAM. Two pages (VP 0 and VP 5) have not yet been allocated, and the rest (VP 3 and VP 6) have been allocated but are not currently cached.Process Address Spaces        
Another problem arises in that if all of these application processes use the same RAM it is difficult for them to cooperate in such a way as to not write on each other's data. The virtual-memory solution is for the operating system and hardware to present an illusion (or abstraction) that each process is the only process running on the computer and has all of RAM to itself; this abstracted RAM is the process's “(virtual) address space”. Quoting [BO-2003, section 10.4 “VM as a Tool for Memory Management”]:                To this point, we have assumed a single page table that maps a single virtual address space to the physical address space. In fact, operating systems provide a separate page table, and thus a separate virtual address space, for each process.        
Note however that sometimes multiple “lightweight processes” or “threads” are run in the same address space even on a machine that also runs processes in separate address spaces. One common design is that the kernel/operating system also manages these threads and another design is that user-mode (not kernel) “thread manager” software within a process manages them.
Memory Protection
Virtual memory thus prevents processes from accidentally or deliberately overwriting each other's data or that of the operating system itself. This protection aspect of virtual memory has become quite important. Quoting [BO-2003, section 10.5 “VM as a Tool for Memory Protection”]:                Any modern computer system must provide the means for the operating system to control access to the memory system. A user process should not be allowed to modify its read-only text section [that is, its executable program code]. Nor should it be allowed to read or modify any of the code and data structures in the kernel. It should not be allowed to read or write the private memory of other processes, and it should not be allowed to modify any virtual pages that are shared with other processes, unless all parties explicitly allow it (via calls to explicit interprocess communication system calls).        As we have seen, providing separate virtual address spaces makes it easy to isolate the private memories of different processes. But the address translation mechanism can be extended in a natural way to provide even finer access control. Since the address translation hardware reads a PTE each time the CPU generates an address, it is straightforward to control access to the contents of a virtual page by adding some additional permission bits to the PTE. FIG. [5] [which is a copy of [BO-2003, FIG. 10.11]] shows the general idea.        In this example, we have added three permission bits to each PTE. The SUP bit indicates whether processes must be running in kernel (supervisor) mode to access the page. Processes running in kernel mode can access pages for which SUP is 0. The READ and WRITE bits control read and write access to the page. For example, if process i is running in user mode, then it has permission to read VP 0 and to read or write VP 1. However, it is not allowed to access VP 2.        If an instruction violates these permissions, then the CPU triggers a general protection fault that transfers control to an exception handler in the kernel. Unix shells typically report this exception as a “segmentation fault”.        
As you can see, prior art systems usually partition pages into (a) “text” (or executable program code) and (b) “data”. After the program has been loaded into memory, text pages are marked to be executable and read-only by setting the permissions bits in the page table; similarly data pages are usually marked to be non-executable and read-write, though read-only data is possible.