1. Field of the Invention
This invention relates to computer systems, and more particularly, to a new arrangement for naming input/output devices which is particularly useful in a new architecture which allows substantial increases in the speed and complexity of input/output operations in computer systems.
2. History of the Prior Art
In the 1960s, International Business Machines (IBM) and Control Data Corporation (CDC) produced mainframe computers with architectures in which a central processing unit (CPU) controlled program manipulation and separate input/output processors (called channel processors or peripheral processor units) controlled input/output operations. The input/output processors had instruction sets which allowed them to carry out the somewhat limited functions designated by commands placed in memory by the central processing unit. For example, the input/output processors knew how to access data on disk and place data on an output display. This form of architecture made, and in some cases still makes, a great deal of sense. At that time, central processing units were very expensive; and using the central processing unit to accomplish input/output operations was very wasteful. Neither the CDC nor the IBM input/output processors were as powerful as the central processing unit and thus could be produced relatively inexpensively. These architectures allowed individual computers to be built to emphasize operations by the central processing unit or operations by the input/output devices. By building a faster central processing unit, the main processing functions could be made to go faster; while by building faster input/output processors, the input/output operations could be accelerated.
As an example of this type of operation, in the IBM system, the central processing unit would signal which input/output operation it desired by writing channel commands to main memory and signaling a channel processor that there was something for it to do. The channel processor would read those commands and proceed to execute them without aid from the central processing unit. If an input/output processor was instructed to do something, it would do it. As long as the operation was safe, there was no problem. Unfortunately, if the operation was something prohibited like reformatting the hard disk which contained the basic operating system, the input/output processor would also do that.
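By way of illustration only, the behavior described above may be sketched as a small simulation. The command names (READ, FORMAT, HALT), the class ChannelProcessor, and the disk contents are hypothetical and are not taken from any actual IBM or CDC instruction set; the sketch merely shows a processor that executes whatever commands it finds in main memory without any permission check.

```python
from dataclasses import dataclass, field


@dataclass
class ChannelProcessor:
    # Hypothetical disk contents, used only for this illustration.
    disk: dict = field(default_factory=lambda: {"os_image": "kernel", "data": "payroll"})
    log: list = field(default_factory=list)

    def run(self, main_memory, start):
        """Execute channel commands from main memory until a HALT command."""
        pc = start
        while main_memory[pc][0] != "HALT":
            op, arg = main_memory[pc]
            if op == "READ":
                self.log.append(self.disk.get(arg))
            elif op == "FORMAT":
                # No permission check: the processor does whatever it is told.
                self.disk.clear()
            pc += 1


# The central processing unit places commands in main memory, then signals
# the channel processor (modeled here as a direct call to run()).
main_memory = {0: ("READ", "data"), 1: ("FORMAT", None), 2: ("HALT", None)}
channel = ChannelProcessor()
channel.run(main_memory, 0)
```

In this sketch the FORMAT command destroys the operating system image along with everything else on the disk, which is why access to such processors had to be limited to trusted code.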
These architectures were designed to allow programs to time share (multi-task) the central processing unit. With an operating system which allows multi-tasking, it is necessary to protect the resources allotted to one application program from operations conducted by other application programs so that, for example, one program cannot write to memory over the space utilized by another program. An important part of this protection is accomplished by keeping application programs from writing directly to portions of the system where they might cause harm such as main memory or the input/output devices. Since the input/output processors would do whatever they were instructed in the IBM and CDC systems, it was necessary to limit access to these input/output processors to trusted code, generally operating system code and device drivers, in order to preclude application programs from undertaking operations which would interfere with other application programs or taking other actions commanded by application programs which would wreak havoc with the system. Apart from any other problems, writing directly to the input/output devices creates a security problem in a multi-tasking system because the ability to write to and read from input/output devices such as the frame buffer means an application program may read what other programs have written to the device. For these reasons, both the IBM and CDC architectures kept any but privileged operating system code from writing to operating system memory and to the input/output devices.
In 1971, the Digital Equipment Corporation (DEC) PDP 11 computer appeared. In the original embodiment of this architecture, all of the components of the computer are joined to a system backplane bus. The central processing unit and any other component of the computer (except main memory) addresses each other component as though it were an address in memory. The addresses for the various hardware components including input/output devices simply occupy a special part of the memory address space. Only the address itself indicates that a component is a device such as an input/output device which is other than memory. When the central processing unit wants to accomplish an input/output operation, it simply writes or reads addresses assigned to the particular input/output device in memory address space. This architecture allows all of the operations available to the central processing unit to be utilized in accomplishing input/output operations and is, therefore, quite powerful. Moreover, this allows the input/output operations to be accomplished without the need for special commands or for special resources such as input/output processors. It also allows the use of very simple input/output controllers which typically amount to no more than a few registers.
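The memory-mapped arrangement described above may be sketched, under stated assumptions, as follows. The boundary IO_BASE, the classes Bus and OutputRegister, and the addresses used are hypothetical values chosen for the illustration and are not the actual PDP 11 address assignments. The point shown is that the same read and write operations reach either memory or a device register, and only the address distinguishes the two.

```python
IO_BASE = 0xE000  # assumed start of the device-register region of the address space


class OutputRegister:
    """A trivial controller: one register that records each value written to it."""

    def __init__(self):
        self.sent = []

    def store(self, value):
        self.sent.append(value)

    def load(self):
        # Reading the register reports how many values have been written.
        return len(self.sent)


class Bus:
    def __init__(self):
        self.ram = bytearray(IO_BASE)
        self.devices = {}  # address -> device register object

    def attach(self, address, register):
        self.devices[address] = register

    def write(self, address, value):
        # Only the address decides whether this is memory or a device.
        if address >= IO_BASE:
            self.devices[address].store(value)
        else:
            self.ram[address] = value

    def read(self, address):
        if address >= IO_BASE:
            return self.devices[address].load()
        return self.ram[address]


bus = Bus()
console = OutputRegister()
bus.attach(0xE000, console)
bus.write(0x0100, 65)  # an ordinary store to main memory
bus.write(0xE000, 65)  # the same operation, but it reaches the device
```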
As with the earlier IBM and CDC architectures and for the same reasons, writing to the input/output devices directly by other than trusted code is prohibited by the PDP 11 operating systems. The PDP 11 architecture provides a perfect arrangement for handling this. This architecture, like some of its predecessors, incorporates a memory management unit designed to be used by an operating system to allow the addressing of virtual memory. Virtual memory addressing provides access to much greater amounts of memory than are available in main memory by assigning virtual addresses to data wherever it may be stored and translating those virtual addresses to physical addresses when the data is actually accessed. Since operating systems use memory management units to intercept virtual addresses used by the central processing unit in order to accomplish the virtual-to-physical address translation, operating systems may simply provide no virtual-to-physical translations of any input/output addresses in the memory management unit for application programs. Without a mapping in the memory management unit to the physical addresses of input/output devices, the application program is required to use a trusted intermediary such as a device driver in order to operate on an input/output device in the PDP 11 architecture.
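A minimal sketch of this protection technique follows. The page size, the page numbers, and the names MMU and TranslationFault are assumptions made for the illustration. The application's translation table simply contains no entry for the page holding the device registers, so any attempt by the application to address a device faults, while the table used by trusted code includes the mapping.

```python
PAGE_SIZE = 0x1000  # assumed page size for the illustration
IO_PAGE = 0xE       # assumed page holding the device registers


class TranslationFault(Exception):
    """Raised when no virtual-to-physical translation exists for an address."""


class MMU:
    def __init__(self, page_table):
        self.page_table = page_table  # virtual page number -> physical page number

    def translate(self, virtual_address):
        page, offset = divmod(virtual_address, PAGE_SIZE)
        if page not in self.page_table:
            raise TranslationFault(hex(virtual_address))
        return self.page_table[page] * PAGE_SIZE + offset


# The application is given translations for ordinary memory only.
application_mmu = MMU({0: 5, 1: 9})
# Trusted code additionally receives a translation for the device page.
kernel_mmu = MMU({0: 5, 1: 9, IO_PAGE: IO_PAGE})


def application_can_reach(address):
    try:
        application_mmu.translate(address)
        return True
    except TranslationFault:
        return False
```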
Thus, in a typical computer system based on the PDP 11 architecture, only trusted code running on the central processing unit addresses input/output devices. Although this architecture allows all of the facilities of the central processing unit to be used for input/output, it requires that the operating system running on the central processing unit attend to all of the input/output functions. Requiring a trap into the system software in order to accomplish any input/output operation slows the operation of the computer. Moreover, in contrast to earlier systems, in this architecture, there is no process by which the input/output performance of the system can be increased except by increasing the speed of the central processing unit or the input/output bus. This is an especial problem for programs which make heavy use of input/output devices. Video and game programs which manipulate graphics extensively and make extensive use of sound suffer greatly from the lack of input/output speed.
This problem is especially severe because when only trusted code can access input/output devices, then all accesses must be through this trusted code. That means that each operation involving input/output devices must go through a software process provided by the operating system and the input/output device drivers. The manner in which this is implemented is that when an application program is running on the central processing unit, the addresses it is allowed to access are mapped into the memory management unit by the operating system. None of these addresses may include input/output addresses. When an application program desires to accomplish an input/output operation, it executes a subroutine call into the operating system library code. This subroutine performs an explicit trap into the operating system kernel. As a part of the trap, the operating system changes the memory management unit to create mappings to the device registers. The operating system kernel translates the virtual name used for the input/output device by the application program into the name of a device driver. The operating system kernel does a permission check to ensure that the application is permitted to perform this operation. If the application is permitted to perform the operation, the operating system kernel calls the device driver for the particular input/output resource. The input/output device driver actually writes the command for the operation to the registers of the input/output hardware which are now mapped by the memory management unit. The input/output device responds to the command by conducting the commanded operation and then generates signals which indicate whether the operation has succeeded or failed. The input/output device generates an interrupt to the device driver to announce completion of the operation. The device driver reads the signals in the registers of the input/output device and reports to the operating system the success or failure of the operation. 
Then the operating system returns from the trap with the success or failure indication, restores the mappings for the application and thus removes the mappings for the device registers, and ultimately returns from the subroutine call reporting the success or failure of the operation to the unprivileged code of the application.
This sequence of steps must take place on each operation conducted using input/output resources. The process is inordinately long, and a recitation of the steps involved illustrates why applications using graphics or other input/output devices extensively cannot be run at any real speed on such systems.
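The sequence recited above may be sketched as a simulation that records each step. The classes Kernel and Device, the resource name "display0", and the application name "game" are hypothetical; the sketch uses a single driver, so the translation of the resource name to a driver is nominal. It is intended only to show how many software steps surround a single device operation.

```python
class Device:
    def __init__(self):
        self.command = None
        self.status = None

    def execute(self, command):
        self.command = command
        self.status = "success"


class Kernel:
    def __init__(self):
        self.device = Device()
        self.permissions = {("game", "display0")}  # hypothetical permission table
        self.device_mapped = False
        self.steps = []

    def driver(self, command):
        if not self.device_mapped:
            raise RuntimeError("device registers are not mapped")
        self.steps.append("driver writes command to device registers")
        self.device.execute(command)
        self.steps.append("device interrupts; driver reads status")
        return self.device.status

    def trap(self, app, name, command):
        self.steps.append("trap into kernel")
        self.device_mapped = True
        self.steps.append("map device registers")
        try:
            self.steps.append("translate resource name to driver")
            self.steps.append("permission check")
            if (app, name) not in self.permissions:
                return "denied"
            return self.driver(command)
        finally:
            # The device mappings are removed before returning to the application.
            self.device_mapped = False
            self.steps.append("restore application mappings")


def library_call(kernel, app, name, command):
    kernel.steps.append("library subroutine call")
    return kernel.trap(app, name, command)


kernel = Kernel()
result = library_call(kernel, "game", "display0", "draw")
```

In this sketch one permitted operation passes through eight recorded steps, of which only one actually writes to the device.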
This problem has been made worse by the tendency of hardware manufacturers to bias their systems in favor of write operations to the detriment of read operations. This bias has gradually increased as processors have become faster (the only way to accelerate a system having the PDP 11 architecture) while bus speed has tended to lag, requiring that write operations on the bus be buffered. The interface in this type of architecture (including Intel X86 type systems) between input/output devices and the input/output bus includes a plurality of registers to which the central processing unit may write and which the central processing unit may read. Since write operations are buffered, all write commands in the write buffer queues must be processed through the buffers before any read can proceed. And during a read operation, the central processing unit cannot conduct other operations since it must typically remain on the input/output bus in order to read synchronously the data being transferred. In some systems, some read operations take as much as twenty times as long as write operations.
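The effect of write buffering on reads may be illustrated with the following sketch. The class BufferedBus and the cost of one bus cycle per transfer are assumptions made for the illustration and do not model any particular bus. Writes are queued without consuming bus time from the point of view of the central processing unit, whereas a read must first drain every queued write and then complete its own transfer.

```python
from collections import deque


class BufferedBus:
    def __init__(self):
        self.pending = deque()  # buffered writes not yet placed on the bus
        self.registers = {}     # device registers reachable over the bus
        self.bus_cycles = 0     # cycles during which the processor is stalled

    def write(self, address, value):
        # Buffered write: queued, and the processor continues immediately.
        self.pending.append((address, value))

    def read(self, address):
        # Every buffered write must reach the device before the read proceeds.
        while self.pending:
            pending_address, value = self.pending.popleft()
            self.registers[pending_address] = value
            self.bus_cycles += 1
        # The read itself is synchronous and occupies the bus.
        self.bus_cycles += 1
        return self.registers.get(address, 0)


bus = BufferedBus()
for value in range(4):
    bus.write(0xE000, value)
cycles_before_read = bus.bus_cycles
value_read = bus.read(0xE000)
```

Under these assumptions the four writes cost the processor nothing at the time they are issued, while the single read stalls for five cycles.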
Since the operating system running on the central processing unit must handle all of the reads and writes to input/output devices in this architecture, the central processing unit is further slowed by this hardware bias when dealing with input/output intensive applications. For example, manipulating graphic images typically requires extensive read/modify/write operations. Many application programs which make extensive use of input/output devices, including a great number of games, are unable to function with architectures which require that the operating system read and write to the output devices on behalf of the applications. In order to obtain the speed necessary to display their operations satisfactorily, such programs must read and write to the input/output devices directly. This has always been allowed by the Microsoft DOS operating system but by none of the advanced operating systems such as Unix. Ultimately, with extensive urging by the windows system developers, the operating system designers of workstation operating systems have grudgingly allowed applications to read and write to the graphics circuitry directly by mapping some of the physical addresses which the input/output devices decode to their memory address space. This allows windows system developers to read and write to the graphics hardware directly even though the security and integrity of the system are compromised by so doing. There have also been multitasking systems which have allowed application programs to write directly to the graphics hardware. However, these systems have required that the operation be accomplished using the operating system software to trap input/output accesses and accomplish context switching to assure that application programs do not interfere with one another; consequently, the result is significantly slower than desirable.
For all of these reasons, many games simply avoid multitasking operating systems such as windows systems. In general, games must be operated in single tasking systems such as Microsoft DOS which allows an unlimited form of writing directly to the input/output devices while sacrificing the integrity of the system.
It is very desirable to provide a new architecture which allows input/output operations to proceed at a faster speed so that application programs which make significant use of the input/output components may function in the advanced multi-tasking operating systems without sacrificing system integrity. In accomplishing this, it is desirable to allow application programs to write directly to input/output devices.
One problem which must be solved in dealing with a computer system which allows multitasking of application programs which may write directly to input/output devices is in assigning names for virtual input/output resources. In conventional multitasking operating systems, application programs use names for virtual input/output resources including both higher level resources which have no direct hardware equivalents such as files and lower level resources that do have direct hardware equivalents such as communication lines. Application programs use these names when invoking input/output operations on their input/output resources in order to tell the operating system, which must perform the input/output operation on their behalf, which of the resources is to be operated on.
If the named input/output resource is a high level resource the operating system translates the logical operations on the virtual input/output resource into a set of physical operations on the hardware resources it is using to create the input/output resource. The operating system uses the device addresses of these hardware resources to invoke the hardware operations corresponding to the requested logical input/output operation. In effect, the operating system has translated the virtual input/output resource name into one or more device addresses.
If the named input/output resource is a lower level resource, the operating system translates the virtual input/output resource name to the corresponding hardware input/output device address and uses it to invoke the hardware operations corresponding to the requested logical input/output operations. Again, the operating system has translated the virtual input/output resource name into a device address.
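The two translations just described may be sketched as follows. The device addresses, the resource names "line0" and "report.txt", and the block numbers are hypothetical. A lower level resource translates to a single device address, while a logical operation on a higher level resource such as a file becomes a set of physical operations directed to the device on which the file is stored.

```python
# Hypothetical device addresses used only for this illustration.
DISK_CONTROLLER = 0xE100
SERIAL_LINE_0 = 0xE200

# Lower level resource: a direct hardware equivalent exists.
LOW_LEVEL = {"line0": SERIAL_LINE_0}
# Higher level resource: a file stored as blocks on the disk.
FILE_BLOCKS = {"report.txt": [12, 13]}


def translate(name, operation):
    """Return the (device address, physical operation) pairs for a logical operation."""
    if name in LOW_LEVEL:
        return [(LOW_LEVEL[name], operation)]
    return [(DISK_CONTROLLER, f"{operation} block {block}") for block in FILE_BLOCKS[name]]


line_ops = translate("line0", "write")
file_ops = translate("report.txt", "write")
```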
Since the operating system accomplishes the assignment of names for input/output resources, the application program must wait for the completion of the assignment before it can proceed with any of its operations. This slows the operation of the computer system.
In order to speed the operation of computer systems, it is desirable to eliminate the need for the operating system to assign names for input/output operations.
The names used for virtual input/output resources in conventional multitasking operating systems are typically small integers. Each application has a set of such integers (sometimes called "file descriptors") available. Normally an application program wanting to access a virtual input/output resource calls an operating system function (often called "open()") using a generic name allowed by the operating system. The operating system then assigns and returns a descriptor to the application program. Internally, the operating system uses these small integers to index into tables of mappings to hardware devices kept for each application program. These mappings include data such as the physical address of the hardware input/output devices which are to accomplish the operation. The addresses used for physical input/output devices are normally assigned by the hardware, for example, by jumpers on the board implementing the input/output option. These addresses are interpreted by the hardware and are invisible to the application program. Mappings to the addresses are placed in the memory management unit only while trusted operating system code is running.
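A minimal sketch of such a descriptor table follows. The class OperatingSystem, the generic names, and the device addresses are hypothetical and do not reproduce the interface of any particular operating system. The sketch shows a table kept per application, indexed by the small integer returned from the open operation, with the physical address held only inside the operating system.

```python
class OperatingSystem:
    # Hypothetical generic names and physical device addresses.
    DEVICE_ADDRESSES = {"/dev/tty0": 0xE000, "/dev/disk0": 0xE100}

    def __init__(self):
        self.tables = {}  # application -> list of mappings, indexed by descriptor

    def open(self, app, generic_name):
        """Assign the next small integer for this application and return it."""
        table = self.tables.setdefault(app, [])
        table.append({"name": generic_name,
                      "address": self.DEVICE_ADDRESSES[generic_name]})
        return len(table) - 1

    def write(self, app, descriptor, data):
        # The descriptor indexes the per-application table; the physical
        # address is never revealed to the application itself.
        mapping = self.tables[app][descriptor]
        return (mapping["address"], data)


system = OperatingSystem()
tty = system.open("editor", "/dev/tty0")
disk = system.open("editor", "/dev/disk0")
other = system.open("game", "/dev/disk0")
```

Because each application has its own table, the same small integer may name different resources in different applications.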
Because small integers are used by the operating system to name input/output resources, there are a limited number of resources which may be assigned names (for example, 256). Often, many more input/output resources may be needed to achieve an effect desired by an application program.
It is desirable to increase the number of input/output resources which may be named.
Moreover, this assignment of names using small integers assigned by the operating system requires an application program to keep track of these names so that it may use the names in accomplishing operations using the virtual input/output devices. Typically, the application maintains a data structure which includes various data relating to the virtual input/output devices including a cross reference to those names assigned by the operating system which the application program uses to accomplish its various operations.
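The cross reference which an application must maintain may be sketched as follows. The classes Channel and Application and the stand-in function os_open are assumptions made for the illustration; os_open merely hands out integers starting at 3 in place of a real operating system call. The application refers to its resources by purpose, but every operation requires a lookup of the integer which the operating system assigned.

```python
from dataclasses import dataclass
from itertools import count


@dataclass
class Channel:
    purpose: str     # the name the application itself uses
    descriptor: int  # the name assigned by the operating system


class Application:
    def __init__(self, os_open):
        self.os_open = os_open
        self.channels = {}  # cross reference: purpose -> Channel

    def acquire(self, purpose, generic_name):
        self.channels[purpose] = Channel(purpose, self.os_open(generic_name))

    def descriptor_for(self, purpose):
        # Every operation begins with this lookup.
        return self.channels[purpose].descriptor


# Stand-in for the operating system: hands out integers starting at 3.
_next_descriptor = count(3)


def os_open(generic_name):
    return next(_next_descriptor)


app = Application(os_open)
app.acquire("display", "/dev/fb0")
app.acquire("sound", "/dev/audio0")
```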
It is desirable to eliminate the need for internal cross referencing of resource names by application programs.