1. Field of the Invention
The present invention relates to a central processing unit for use in data processors, and more particularly to an input/output control mechanism for central processing units of the pipelined architecture.
2. Description of Related Art
Heretofore, a central processing unit (called "CPU" hereinafter) has accessed an external input/output device (called "I/O device" hereinafter) by one of the following two methods. In the first method, when the CPU executes an input/output (I/O) instruction, data is transferred between the CPU and a selected I/O device in accordance with the I/O instruction and a read/write control signal. The second method is so-called "memory mapped I/O", in which the addresses of I/O devices are allocated to a portion of the address space prepared for a main memory. When instructions such as read/write instructions and arithmetic and logic instructions are executed, if the operand address is one allocated to an I/O device, data is transferred between the CPU and the indicated I/O device in accordance with a read/write control signal.
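The two access methods can be contrasted with a minimal sketch. The 64 KB address space, the I/O window starting at 0xFF00, and the names `Bus`, `io_write`, and `store` are assumptions made purely for illustration and are not taken from any particular CPU.

```python
# Illustrative model only: sizes, addresses, and names are assumptions.
ADDRESS_SPACE = 0x10000   # assumed 64 KB address space
IO_BASE = 0xFF00          # assumed start of the memory mapped I/O window
IO_SIZE = 0x0100          # assumed size of that window


class Bus:
    def __init__(self):
        self.memory = bytearray(ADDRESS_SPACE)
        self.devices = {}  # device register number -> last value written

    def io_write(self, port, value):
        """First method: a dedicated I/O instruction names the device directly."""
        self.devices[port] = value & 0xFF

    def store(self, address, value):
        """Second method: an ordinary store; the address alone selects memory or I/O."""
        if IO_BASE <= address < IO_BASE + IO_SIZE:
            self.devices[address - IO_BASE] = value & 0xFF
        else:
            self.memory[address] = value & 0xFF
```

In this sketch, `store(0xFF04, v)` reaches device register 4 while `store(0x1234, v)` reaches the main memory, although both are issued by the same kind of instruction.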
In the memory mapped I/O mentioned above, not only a main memory but also an I/O device can be designated by the operands of instructions such as read/write instructions, ordinary arithmetic and logical operation instructions, and data transfer instructions. Therefore, memory mapped I/O is very effective in elevating the performance and function of microprocessors.
However, memory mapped I/O has encountered the following problems, which have recently been caused by the speed-up of processors themselves. In low speed CPUs, it is possible to access both a main memory and I/O devices in similar input/output control manners. Such uniform access is not possible, however, in high speed CPUs which operate at a clock frequency of not less than 8 MHz. This is because high speed access to a main memory can be realized by adopting new architectures such as interleaving, nibble mode access, and page mode access, whereas the I/O device intrinsically cannot adopt such new architectures. As a result, the access to the main memory and the access to the I/O devices differ in timings such as cycle time and recovery time and in the access method, and therefore the same control system cannot be applied to both. Thus, the CPU of the memory mapped I/O system needs additional peripheral circuits in order to overcome the above problems. Namely, the required hardware becomes more complicated.
Specifically, in the case that page mode access is adopted in a CPU which can designate the operand address in units of a byte, an operand often spans a plurality of pages. In such a case, the CPU has to access the I/O device in a manner different from that of the access to the memory.
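Whether a byte-addressed operand spans more than one page can be checked as in the following sketch. The 256-byte page size and the function name `crosses_page` are assumptions for illustration only.

```python
def crosses_page(address: int, size: int, page_size: int = 256) -> bool:
    """Return True if an operand of `size` bytes starting at `address`
    occupies more than one page. The default page size is an assumed value."""
    first_page = address // page_size
    last_page = (address + size - 1) // page_size
    return first_page != last_page
```

With the assumed page size, a 4-byte operand at address 252 stays within one page, while the same operand at address 254 spans two pages.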
Furthermore, the CPU of the memory mapped I/O system cannot internally discriminate between the main memory space and the I/O space. Consequently, the CPU does not have the internal capability of interrupting the instruction execution in the course of access to the I/O device and then executing an internal interruption (this is called "I/O access trap function"). The CPU also cannot allow a user to set a privilege level dedicated to the I/O access (this is called "I/O privileged level designation function"). If the CPU is required to have the above functions, the CPU also needs additional peripheral circuits and special interruption processing functions.
As mentioned above, in the memory mapped I/O system, since a portion of the address space is used as the I/O space, the entire address space cannot be used for the main memory. If a large I/O space is reserved, for example, if the I/O space has the same capacity as the main memory so that the I/O space and the memory space can be distinguished by the most significant bit of the address, only a small-sized external address decoder is needed for distinguishing between the I/O space and the memory space, or the address decoder for such a purpose can even be omitted. In this case, however, the main memory space becomes too small, and therefore cannot have sufficient flexibility or degree of freedom. On the other hand, if only a small I/O space is provided in comparison with the whole of the address space, the utilization efficiency and the degree of freedom in the main memory space and the I/O space are increased, but a large-scale address decoder is required. In addition, in order to enable the I/O space to be allocated to an arbitrary portion of the address space, the address decoder has to be not only large-scale but also very complicated.
As is apparent from the above description, in the memory mapped I/O system, the proportion of the I/O space to the whole of the address space is in a trade-off relation to the scale of the external address decoder. This trade-off is inconvenient in constructing the system.
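The trade-off can be made concrete with a rough sketch. It assumes a 16-bit address and an I/O block whose size is a power of two and which is aligned to its own size; the names `is_io_msb` and `decoder_bits` are hypothetical, and the count of examined address bits is used only as a crude proxy for decoder scale.

```python
ADDRESS_BITS = 16  # assumed address width


def is_io_msb(address: int) -> bool:
    """Half the space is I/O: the most significant address bit alone decides."""
    return (address >> (ADDRESS_BITS - 1)) & 1 == 1


def decoder_bits(io_size: int) -> int:
    """Number of high-order address bits a decoder must examine to select an
    aligned, power-of-two I/O block of `io_size` bytes (assumed layout)."""
    if io_size <= 0 or io_size & (io_size - 1):
        raise ValueError("io_size must be a positive power of two")
    return ADDRESS_BITS - (io_size.bit_length() - 1)
```

Under these assumptions, an I/O space of 32768 bytes needs only 1 address bit to be examined, whereas an I/O space of 256 bytes needs 8 bits. An I/O space placed at an arbitrary, unaligned position would need range comparison rather than a simple bit match.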
Moreover, a so-called pipeline architecture is well known as one means for parallel instruction execution which elevates the performance of the CPU. In this pipelined design, the CPU is ordinarily divided into several units so that the respective steps in executing an instruction are carried out in separate units within the CPU. For example, in an instruction decoding unit, a given instruction is converted into a form which can be directly executed in an instruction execution unit. Thereafter, in an effective address computation unit, an effective address (or virtual address) is computed from a displacement value, an index value, a base value, etc., and the effective address is translated into a real address in an address translation unit, while the second instruction is converted in the instruction decoding unit. Then, a read/write operation to the main memory or the I/O device is controlled in a main memory control unit, while the operation is executed by the instruction execution unit on the basis of the result obtained from the instruction decoding unit. At this time, the third instruction is converted in the instruction decoding unit, and the second instruction is subjected to the address translation.
As seen from the above, the pipelined architecture can be compared to an automobile assembly line. Specifically, an operand reading instruction is executed through the instruction decoding, the effective address computation and address translation, the operand reading, and the instruction execution in the named order. On the other hand, an operand writing instruction is executed through the instruction decoding, the effective address computation and address translation, the instruction execution and the operand writing in the named order. The above mentioned processing steps are actually executed in parallel as shown in FIG. 1 in the pipelined operation. Therefore, the total time of operation is shortened, with the result that the execution time of each instruction is apparently minimized. Accordingly, the CPU can operate at a high speed.
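The shortening of the total operation time can be estimated with a simplified model. The sketch below assumes four stages (instruction decoding, effective address computation and address translation, operand access, and instruction execution), one cycle per stage, and no stalls; these figures are assumptions and are not read from FIG. 1.

```python
STAGES = 4  # assumed stage count, one cycle per stage, no stalls


def sequential_cycles(n_instructions: int) -> int:
    """Each instruction finishes all stages before the next one starts."""
    return n_instructions * STAGES


def pipelined_cycles(n_instructions: int) -> int:
    """A new instruction enters the pipeline every cycle once it is filled."""
    if n_instructions == 0:
        return 0
    return STAGES + (n_instructions - 1)
```

Under these assumptions, 10 instructions take 40 cycles when executed one after another and 13 cycles when pipelined.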
However, as seen from FIG. 1, the operand reading for a reading instruction is often executed prior to the operand writing for a writing instruction, although the writing instruction is given ahead of the reading instruction. When a write is performed to an I/O device, the status and control of the I/O device are changed, and so the result of a succeeding read from the I/O device is also changed. Consequently, if the sequence of actual accesses to the I/O device is reversed in comparison with the sequence of the I/O instructions to be executed, the control of the I/O device will get out of order.
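The reversal of access order can be reproduced with a small timing sketch. The stage orders follow the description above, but the latencies (in particular the two-cycle execution stage) and all names are assumptions chosen to illustrate the effect, not values taken from FIG. 1.

```python
# Stage orders as described in the text; latencies are assumed values.
READ_STAGES = ["decode", "address", "operand_read", "execute"]
WRITE_STAGES = ["decode", "address", "execute", "operand_write"]
LATENCY = {"decode": 1, "address": 1, "operand_read": 1,
           "execute": 2, "operand_write": 1}


def access_cycle(stages, issue_cycle):
    """Return the cycle in which the instruction's operand access begins."""
    cycle = issue_cycle
    for stage in stages:
        if stage in ("operand_read", "operand_write"):
            return cycle
        cycle += LATENCY[stage]
    raise ValueError("instruction has no operand access stage")


# The writing instruction is issued first, the reading instruction one cycle later.
write_access = access_cycle(WRITE_STAGES, issue_cycle=0)
read_access = access_cycle(READ_STAGES, issue_cycle=1)
```

With these assumed latencies, the write reaches the device in cycle 4 while the later-issued read reaches it in cycle 3, so the device sees the read before the write.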
In this circumstance, there has recently been proposed a CPU provided with a virtual memory management mechanism, which has removed some of the problems in memory management. Namely, an arbitrary portion of a virtual memory space can be allocated to I/O devices and the remaining portion can be used as a main memory. In addition, it is possible to control the access right to the I/O devices, similarly to the case for the main memory, and also to protect the I/O device from illegal access.
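Access right control through virtual memory management can be sketched as follows. The 4 KB page size, the convention that level 0 is the most privileged, and the names `PageEntry`, `AccessViolation`, and `translate` are assumptions for illustration and do not describe any specific CPU.

```python
from dataclasses import dataclass

PAGE_SIZE = 0x1000  # assumed 4 KB pages


class AccessViolation(Exception):
    """Raised when a page is unmapped or the access right is insufficient."""


@dataclass
class PageEntry:
    frame: int           # real page frame number (may lie in the I/O region)
    required_level: int  # least privileged level allowed; 0 = most privileged


def translate(page_table, virtual_address, current_level):
    """Translate a virtual address to a real address, checking the access right."""
    page, offset = divmod(virtual_address, PAGE_SIZE)
    entry = page_table.get(page)
    if entry is None:
        raise AccessViolation("page not mapped")
    if current_level > entry.required_level:
        raise AccessViolation("insufficient privilege")
    return entry.frame * PAGE_SIZE + offset
```

If virtual page 1 is mapped to a real frame occupied by an I/O device with `required_level=0`, a program running at level 3 is refused by the same check that protects ordinary memory pages.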
However, if the real address space follows the memory mapped I/O system, the problems attributable to the memory mapped I/O system still remain.
For reference, if the virtual memory management is performed in a CPU which does not have the memory mapped I/O system, the CPU can internally distinguish between the main memory access and the I/O access, and therefore, it is possible to obtain the I/O access trap function and the I/O privileged level designation function. However, since the real address space does not include I/O addresses, one cannot designate I/O devices by instruction operands. In addition, the access right to the I/O devices cannot be controlled by the virtual memory management mechanism, and so, flexible protection for I/O devices is difficult to realize.