1. Field of the Invention
The present invention relates to an information processing system and, more particularly, to an information processing system having a function either to restore, in the execution order, the mnemonics of instructions having been executed in a processor such as a microprocessor integrated on a semiconductor chip, from the information appearing at (e.g., inputted to or outputted from) chip terminals such as the address bus terminal and data bus terminal of the processor, or, in case an executed instruction has an operand outside of the processor, to restore (i.e., to trace the instruction) both the address actually accessed as the operand by the processor in accordance with the instruction and the data at that address. The information processing system is effective for debugging software developed for the processor.
2. Description of Related Art
A first description to be made is directed to the internal structure of a microprocessor having a precedence control function of an instruction, especially to the pre-fetch control of an instruction code, a queue status and a bus status. A next description will be made on how the microprocessor of the prior art uses the queue status and the bus status to execute an instruction trace. A further description will be made on the internal structure of a multi-function microprocessor having a multi-stage pipeline structure developed from the aforementioned microprocessor of the prior art and on the drawbacks in case the aforementioned instruction trace is executed by that multifunction microprocessor.
In the microprocessor of the prior art in which all the basic operation sequences of the microprocessor: instruction code fetch --instruction decoding --operand access --instruction execution --operand access are sequentially conducted, the instruction trace can be performed merely by sequentially following the informations appearing in the address bus terminal and the data terminal while the microprocessor is executing a series of instructions. On the other hand, in a microprocessor of the type in which the instruction code fetch of the aforementioned basic operation sequences of the microprocessor is assigned to an independent unit so that it may be controlled in precedence, the informations appearing during the processing of a series of instructions of the microprocessor at the address bus terminal, the data bus terminal and the output terminal of a status signal indicating the internal status of the instruction code fetching unit are once latched as time-series data in the trace buffer memory. Both the instruction code fetched in the processor by the instruction code fetch controlled in precedence and the subsequent execution result of that instruction code are edited correspondingly by the status signal so that the instruction trace can be conducted. Thus, according to the instruction tracing method of the microprocessor, the informations appearing at the terminals of the microprocessor are traced. As a result, if the internal structure of the microprocessor is changed so as to conduct the precedence control of the aforementioned instruction code fetch, it accordingly becomes necessary to output the minimum necessary information indicating the internal status for the instruction trace from the minimum number of terminals and to change the instruction tracing method itself.
FIG. 4 is a block diagram showing the internal structure of a processor according to the prior art. Reference numeral 401 designates a processor chip, and numeral 402 designates a bus terminal through which addresses and data are inputted and outputted. Numeral 403 designates an internal bus; numeral 404 a data register; numeral 405 an address generator; and numeral 406 a 6-byte instruction queue and its control. The queue status information of the processor 401 is generated by the instruction queue control 406 and is outputted to the outside of the processor 401. An instruction code issued from the instruction queue 406 is transferred to and latched by an instruction latch 411 through a signal line 408 and an internal data bus 409. Numeral 412 designates an instruction code decoder which receives the output of the instruction latch 411. Numeral 410 designates an instruction execution control. Numerals 413, 414 and 415 designate various registers, an arithmetic and logical unit (i.e., ALU), and an operand register for the ALU, respectively. Numeral 416 designates a bus control for controlling the bus cycles of the processor 401, the generation of a bus status signal, and so on. Numeral 417 designates a signal line for outputting therethrough the bus status signal generated by the bus control 416 to the outside of the processor 401.
The instruction code pre-fetching, decoding and executing operations of the processor 401 will be described with reference to FIG. 4.
In the instruction code pre-fetching operation of the processor 401, a pre-fetching address generated by the address generator 405 is outputted through the bus terminal 402 to a not-shown system address bus, and an instruction code outputted from a corresponding area in a not-shown memory to a not-shown system data bus is fetched from the bus terminal 402 and is latched through the internal bus 403 at the tail of the instruction queue 406 having the FIFO (i.e., First In First Out) structure. Here, if the data bus of the processor 401 has a width of 16 bits, a 2-byte (i.e., 1 byte=8 bits) instruction code can be stored in the instruction queue 406 by a single instruction code pre-fetching operation.
The processor 401 latches the instruction code, 1 byte at a time, sequentially from the head of the instruction queue 406 in the instruction latch 411 through the signal line 408 and the internal bus 409 and uses the instruction decoder 412 to decode that instruction code. The control information generated as the decoding result is transferred to the instruction execution control 410 or the like so that the operation described in the instruction code decoded by the decoder 412 is instantly executed. When the execution of this operation is completed, the processor 401 repeats the sequence of fetching the subsequent instruction code from the instruction queue 406 into the instruction latch 411 and decoding and executing it.
For example, in the case where decoding by the decoder 412 of the instruction code fetched from the instruction queue 406 and latched in the instruction latch 411 shows that a read operand in the memory is required, the operand is read out from the memory only after the instruction code has been decoded. The instruction execution control 410 gives the address generator 405 the information necessary for generating the address of the read operand in the memory and then commands the start of the memory read bus cycle. In response to this command from the instruction execution control 410, the bus control 416 uses the address generated by the address generator 405 to start the memory read bus cycle, stores the data obtained from the memory in the data register 404, and informs the instruction execution control 410 that the memory read has been completed. When the instruction execution control 410 receives from the bus control 416 the information of the memory read completion responding to the memory read bus cycle starting command issued previously, it uses the content of the data register 404 to perform the necessary processing. However, if the system data bus and the system address bus are occupied by another bus master or the like, the bus control 416 cannot start the memory read bus cycle even though it has received the memory read bus cycle starting command from the instruction execution control 410, so that the instruction execution control 410 cannot execute the instruction. This instruction execution is deferred until the bus control 416 acquires the right to use the system data bus and the system address bus, starts the memory read bus cycle, fetches the data from a predetermined area in the memory, and writes it in the data register 404.
In other words, the actual bus cycle of this processor 401 for the memory or I/O access of an instruction requiring this access is restricted to the period during which the instruction is being executed.
Software to be developed by using the processor described above has to be debugged by tracing the instructions. The instruction tracing is one of the useful techniques for software development and for obtaining information necessary for examining: what result is obtained when a certain instruction is executed by a CPU; what influence is exerted as a result of execution of one instruction upon subsequent instructions; what influence is exerted upon an instruction program being executed in case an external interruption is made while one instruction or routine is being executed; and so on.
Since the processor is enabled to pre-fetch the instruction code, as described above, the timing at which the instruction fetched through the data I/O terminal of the processor in the processor is to be executed depends upon the internal status of the processor. This processor is enabled to output queue status signals QS0 and QS1 representing the statuses of the instruction queue and bus status signals S0 to S2 representing the kinds of the bus cycle. As a result, the statuses of the instruction queue are understood from the queue status signals, as listed in Table 1, and the kinds of the bus cycle are understood from the bus status signals, as listed in Table 2, as follows:
TABLE 1
______________________________________________________________
Q1   Q0   Queue Status                              Symbol
______________________________________________________________
0    0    Queue without Fluctuation                 N
0    1    Queue at 1st Byte of Instruction code     F
1    0    Queue Empty                               E
1    1    Queue at or after 2nd Byte                S
______________________________________________________________
TABLE 2
______________________________________________________________
S2   S1   S0   Bus Cycle                            Symbol
______________________________________________________________
0    0    0    Interrupt Acknowledge                IA
0    0    1    Read I/O Port                        I
0    1    0    Write I/O Port                       O
0    1    1    Halt                                 HLT
1    0    0    Code Access                          F
1    0    1    Read Memory                          RM
1    1    0    Write Memory                         WM
1    1    1    Passive                              P
______________________________________________________________
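The two tables can be held as simple lookup tables in the software of an instruction tracer. The following is only a minimal sketch in Python; the names QUEUE_STATUS, BUS_STATUS, decode_queue_status and decode_bus_status are hypothetical and do not appear in the description above.

```python
# Queue status symbols of Table 1, keyed by the bit pair (Q1, Q0).
QUEUE_STATUS = {
    (0, 0): "N",  # queue without fluctuation
    (0, 1): "F",  # queue at 1st byte of instruction code
    (1, 0): "E",  # queue empty
    (1, 1): "S",  # queue at or after 2nd byte
}

# Bus status symbols of Table 2, keyed by the bit triple (S2, S1, S0).
BUS_STATUS = {
    (0, 0, 0): "IA",   # interrupt acknowledge
    (0, 0, 1): "I",    # read I/O port
    (0, 1, 0): "O",    # write I/O port
    (0, 1, 1): "HLT",  # halt
    (1, 0, 0): "F",    # code access
    (1, 0, 1): "RM",   # read memory
    (1, 1, 0): "WM",   # write memory
    (1, 1, 1): "P",    # passive
}


def decode_queue_status(q1, q0):
    """Return the Table 1 symbol for one sample of the queue status bits."""
    return QUEUE_STATUS[(q1, q0)]


def decode_bus_status(s2, s1, s0):
    """Return the Table 2 symbol for one sample of the bus status bits."""
    return BUS_STATUS[(s2, s1, s0)]
```

For instance, decode_bus_status(1, 0, 0) yields the symbol F (code access), and decode_queue_status(1, 0) yields the symbol E.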
The information of a trace buffer memory in an instruction tracer is edited by realizing a pseudo instruction queue having the same action as that of the instruction queue in the processor by means of the software of the instruction tracer and by inversely assembling an instruction code in the trace buffer memory in the actual processing order of the processor with reference to the status of the pseudo instruction queue to establish an instruction mnemonic while simulating the instruction queue in the processor in accordance with the advance of the processing on the basis of the queue status signal and bus status signal of each address of time-series data in the trace buffer memory. In case, moreover, the instruction executed is one accompanied by a memory access or an I/O access, it is necessary to inversely assemble the instruction code to establish the instruction mnemonic and also to determine what address and data are actually used to conduct the memory access or the I/O access.
The instruction tracer of the aforementioned processor has the following major functions: a trace buffer memory function to store the address bus information and data bus information appearing while the processor is executing, and the control signal status information; a break point setting function; a queue status emulating function; and a function to edit the information in the trace buffer memory. While the processor is executing the program, the instruction tracer sequentially fetches in the trace buffer memory predetermined ones of the data and signals which appear at the individual buses and input/output signal terminals as the execution progresses. The trace buffer memory being implemented at present has one word of about 64 bits and a capacity of about 64 words to 2 kilo-words. The user can check the developed software by interrupting the execution of the program at a break point set by the break point setting function and then examining the result of editing the information in the trace buffer memory.
Here, since the processor has the pre-fetching function, the instruction inputted through the data bus terminal is not directly decoded/executed but is once fetched in the instruction queue and caused to wait. This wait is not constant because it depends upon how full the instruction queue is, the kind of the instruction being executed, and so on. In the case of the processor being described, the status of the instruction queue and the kind of the bus cycle at present can be outputted as the queue status information QS0 and QS1 and the bus status information S0 to S2, respectively, from the terminals to the outside. As a result, the instruction tracer fetches the queue status and bus status signals together with the other information in the trace buffer memory and, after it interrupts the program at the break point, refers to them to edit the information stored as the time-series data in the trace buffer memory and to determine the instruction code of what address has been decoded and executed and to what bus access recorded in the trace buffer memory the memory or I/O access established in that procedure corresponds.
The principle of this edition will be briefly described in the following. In case one instruction is fetched from the instruction queue and is being decoded and executed, the execution of a subsequent instruction is deferred until the instruction being executed is completed. As a result, even if the instruction accompanied by the memory or the I/O access is two succeeding ones, for example, their order of the memory or the I/O access is identical to that of the instructions written in the program. Moreover, the difference between the time at which the instruction accompanied by the memory or the I/O access is fetched from the memory and the time of the memory access at which it is fetched from the instruction queue and is decoded and executed depends upon the status of the instruction queue, but the time of the memory or the I/O access coincides with that at which the instruction is executed. With these prerequisites, the trace buffer memory will be described in the following.
FIGS. 5A and 5B present one example of the data which are written in the trace buffer memory. Each frame corresponds to one address of the trace buffer memory, at which one word of information fetched at a time is written. The frames are numbered in the order in which they have been fetched in the trace buffer memory. Letters BHE* designate a byte high enable signal, and the asterisk * designates an active low status. The BHE* signal indicates that data are outputted to the higher half of the data bus. Letters STS designate the bus status signals S0 to S2 in the symbols listed in Table 2. Letters QSTS designate the queue status signals QS1 and QS0 in the symbols listed in Table 1. Letters QDEPTH designate the number of the bytes of the instruction codes stored in the instruction queue. Letters DMUX indicate for what kind of information the frame has been fetched in the trace buffer memory. Letters A, D and Q designate an address, data and a status, respectively. A bus cycle similar to that of the processor is assigned to the instruction tracer so that the tracer monitors the trace data such as the address, the data or the status for each system clock of the processor. Since the address changes in a bus cycle T1, the trace data of one word is fetched at the cycle T1 in the trace buffer memory. At this time, the address A is written in the DMUX. Since the data at the address outputted at T1 is fixed at T4 of the same bus cycle, one frame is fetched in the trace buffer memory at T4. At this time, the data D is written in the DMUX. The bus status is effective for a time period of bus cycles from T2 to T4. One frame of the bus status is fetched in the trace buffer memory only in case its content has changed from the bus status fetched in the trace buffer memory immediately before that period. The queue status basically does not depend on the bus cycle, but one word of it is fetched only in case its content has changed from the queue status of the trace data fetched in the trace buffer memory immediately before.
When the status is fetched, the status Q is written in the DMUX. In this processor, all of the address, data and status are received in a time-series manner at the common terminal and are multiplexed. This makes it necessary to provide an index such as the DMUX so that the address (A), the data (D) or the status (Q) may be discriminated. The QDEPTH is not information obtained directly from the terminal of the processor but is calculated from the queue status (QSTS) and the bus status (STS) in the following manner.
Incidentally, there are several methods of framing the pseudo instruction queue, one of which will be described with reference to FIG. 6. Reference numeral 601 designates a memory space having a width of 1 byte and a size sufficient for retaining the instruction trace. In the processor, the instruction code is pre-fetched every 2 bytes, and the pre-fetch is detected when the bus status represents F and the DMUX represents D. In order to simulate the instruction code pre-fetching operation, therefore, the pre-fetched instruction code is written in the memory address indicated by a pseudo instruction queue write pointer (WP), and the content of the pointer WP is incremented by 2 each time. In order to simulate that the instruction code is fetched from the instruction queue, on the other hand, the instruction code is read from the memory address indicated by a pseudo instruction queue read pointer (RP), and the content of this pointer RP is incremented by 1, because that fetch can be detected from the queue status.
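The pointer handling described for FIG. 6 may be sketched in Python as follows. This is an illustration under stated assumptions rather than the tracer's actual software: the class name PseudoInstructionQueue and its methods are hypothetical, the memory size of 1024 bytes is arbitrary, and the lower byte of each 2-byte pre-fetch is assumed to come first in the queue.

```python
class PseudoInstructionQueue:
    """Byte-wide memory space with a write pointer WP and a read pointer RP."""

    def __init__(self, size=1024):
        self.memory = bytearray(size)  # corresponds to the memory space 601
        self.wp = 0  # pseudo instruction queue write pointer (WP)
        self.rp = 0  # pseudo instruction queue read pointer (RP)

    def prefetch(self, first_byte, second_byte):
        """Simulate one 2-byte code pre-fetch (STS = F, DMUX = D)."""
        self.memory[self.wp] = first_byte
        self.memory[self.wp + 1] = second_byte
        self.wp += 2  # WP advances by 2 per pre-fetch

    def take_byte(self):
        """Simulate one byte leaving the queue (QSTS = F or S)."""
        value = self.memory[self.rp]
        self.rp += 1  # RP advances by 1 per byte taken
        return value

    def depth(self):
        """Number of bytes pre-fetched but not yet taken."""
        return self.wp - self.rp
```

As a usage illustration with made-up pre-fetch contents, calling prefetch(0xBA, 0xEA) and prefetch(0xFF, 0xEC) and then take_byte() three times returns the bytes BA, EA and FF in that order and leaves one byte in the pseudo queue.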
The value of the QDEPTH is sequentially calculated for the actual operation starting from the reset time of the processor by a hardware circuit 700 disposed in the instruction tracer, as shown in FIG. 7. The calculated result is contained in the information of one frame to be fetched in the buffer memory by the instruction tracer. In FIG. 7, reference numeral 701 designates a signal line for transmitting therethrough the queue status signal, and numeral 702 designates a decoder for decoding the queue status signal on the signal line 701. From the decoder 702, there are outputted two signals: the signal on a signal line 703 takes a high level when the queue status signal indicates the symbol F or S of Table 1; and the signal on a signal line 707 takes the high level when the queue status signal indicates the symbol E of Table 1. Numeral 704 designates a modulo-6 up counter, which has its value incremented by 1 at each detection of a rising edge at which the signal on the line 703 changes from the low level to the high level. Numeral 705 designates the output signal line of the counter 704, which has a width of 3 bits. Numeral 708 designates a signal line for transmitting therethrough the reset signal of the system including the processor and the instruction tracer, and the reset signal is active high. Numeral 709 designates an AND gate which receives the signal on the line 707 and the reset signal on the line 708. Numeral 710 designates the output signal line of the AND gate 709. Numeral 711 designates a signal line for transmitting the bus status signal from the processor to a decoder 712. Numeral 713 designates the output signal line of the decoder 712, and the signal on the line 713 takes the high level when the bus status signal indicates the symbol F (i.e., code fetch) of Table 2. Numeral 714 designates a modulo-6 up counter which receives the signal on the line 713.
The counter 714 has its value incremented by 2 each time it detects one rising edge of the signal on the line 713. Numeral 715 designates the output signal line of the counter 714, which has a length of 3 bits. Numeral 706 designates a subtracter. This subtracter 706 is made receptive of the respective output signals of the counters 704 and 714 through the signal lines 705 and 715, respectively, to subtract the value of the output signal of the counter 704 from the value of the output signal of the counter 714 and to output the resultant difference to a signal line 716.
The operations of the QDEPTH calculating hardware 700 of FIG. 7 will be briefly described in the following. The queue status signal and bus status signal of the processor are sequentially transmitted through the signal lines 701 and 711, respectively, to the inside of the QDEPTH calculating hardware 700. When the reset signal 708 of the system takes the high level, the output signal of the AND gate 709 also takes the high level and is inputted through the signal line 710 to the counters 704 and 714. These counters 704 and 714 have their outputs reset to zero when the signal on the line 710 takes the high level. When the processor starts its operation to pre-fetch the instruction code in response to a reset signal, the bus status represents the symbol F (i.e., code fetch) of Table 2 so that the counter 714 takes an output at 2. Since the processor latches the instruction code of 2 bytes in the instruction queue by a single pre-fetch operation, the output of the counter 714 is incremented by 2 each time the decoder 712 detects the symbol F of Table 2 in the bus status signal. The fact that the processor fetches the instruction code of 1 byte from the instruction queue is grasped from the fact that the symbol F (First) or S (Subsequent) of Table 1 is detected in the queue status signal on the signal line 701, and the output of the counter 704 is incremented by 1 each time the decoder 702 detects the symbol F or S. The result of subtracting at an instant by the subtracter 706 the value of the output of the counter 704 indicating the fetch of the instruction code from the instruction queue, from the value of the output of the counter 714 indicating the instruction code pre-fetch indicates the queue depth QDEPTH at that instant. When the symbol E (Queue Flush) of Table 1 is detected as a result of the queue status signal being decoded by the decoder 702, moreover, the signal on one output signal line 707 of the decoder 702 takes the high level.
As a result, the signal on the output signal line 710 of the AND gate 709 takes the high level so that the counters 704 and 714 are reset. These resets of the counters 704 and 714 by the signal on the signal line 707 correspond to the processor purging the content of the instruction queue.
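The counting scheme of the hardware circuit 700 can be imitated in software. The sketch below is a simplified Python model under several assumptions: the function name track_qdepth and the event format are hypothetical, each event stands for one newly detected pair of bus status and queue status symbols (None where no new bus status is detected), the modulo-6 wrap-around of the counters 704 and 714 is not modelled, and a reset by the symbol E is applied before the other symbols of the same event.

```python
def track_qdepth(events):
    """Return the queue depth after each (bus_status, queue_status) event."""
    prefetched = 0  # plays the role of the counter 714 (+2 per code fetch)
    taken = 0       # plays the role of the counter 704 (+1 per F or S)
    depths = []
    for bus_status, queue_status in events:
        if queue_status == "E":
            # Queue purged: both counters are reset to zero.
            prefetched = 0
            taken = 0
        if bus_status == "F":
            # One pre-fetch stores 2 bytes in the instruction queue.
            prefetched += 2
        if queue_status in ("F", "S"):
            # One byte of instruction code leaves the instruction queue.
            taken += 1
        # The subtracter 706 outputs the difference as QDEPTH.
        depths.append(prefetched - taken)
    return depths
```

With the made-up event sequence [(None, "E"), ("F", "N"), (None, "F"), (None, "S"), ("F", "N"), (None, "S"), (None, "F")], the function returns the depths 0, 2, 1, 0, 2, 1, 0.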
Thus, the increment and decrement of the instruction code to be inputted to or outputted from the instruction queue are simulated by the hardware circuit 700 for QDEPTH calculation of FIG. 7. The simulated result is sequentially fetched as a part of the information of one frame, which is to latch the signals on the output signal line 716 altogether in the buffer memory of the instruction tracer, together with the informations of the data bus, address bus, queue status and bus status signals for a time period from the instant when the processor is reset to the instant when a series of operations are interrupted at the break point address.
Next, the method of inversely assembling the instructions, which are actually executed by the processor on the basis of the data in the trace buffer memory of FIG. 5 and accompanied by the I/O or memory access, to correspond to the I/O or memory access of the processor accompanying the execution of the instructions will be described in the following. In the precedence control, the time period from the latch to the execution of the instructions of the processor is not constant. However, the following description will be made with a prerequisite that the order of execution of the instructions is identical to that of fetch in the instruction queue, so long as the instructions do not contain one such as a branch instruction having a possibility of changing the sequence of the program.
When the instruction having been executed is to be traced, the status of an empty instruction queue is searched for and used as a reference for the trace. It is found from the queue status E when the instruction queue is emptied. In FIG. 5, for example, the QSTS represents E at frames 0002 and 0022. The instruction code fetched in the instruction queue by the instruction fetch immediately after the queue status E is placed at the head of the instruction queue so that it is decoded and executed first. When the instruction code has a variable byte length (e.g., an instruction of 1 to 6 bytes), the number of bytes of one instruction is obtained by summing the one F of the queue status and the number of the S's appearing from that F to the subsequent F. As shown in FIG. 5, after the QSTS represents E at the frame 0002, an address is outputted (as A in DMUX) at a frame 0003, and the data is fetched (as D in DMUX) at a frame 0004. Since the STS represents F, the data fetched at the frame 0004 is an instruction code, and this fetch is the instruction fetch immediately after the QSTS represents E at the frame 0002, i.e., while the instruction queue is empty. As a result, the instruction code fetched at the frame 0004 is at the head of the instruction queue at the frame 0004. Since the QSTS represents F at a frame 0006, it is found that the instruction code fetched at the frame 0004 has been fetched by the execution part. Since the QSTS represents S at frames 0007 and 0010 until the QSTS represents F at a frame 0011 after the frame 0006, it is found that the instruction having been fetched at the frame 0004 and begun to be executed from the frame 0006 has 3 bytes. Since, moreover, the address is issued at the frame 0005 and the instruction code is pre-fetched at a frame 0008, the instruction having begun to be executed from the frame 0006 has instruction codes BA, EA and FF (all in a hexadecimal notation), and provides the mnemonic MOV DX, FFEA if inversely assembled.
Subsequently, the method of determining the byte number of the instruction code with reference to the empty status of the instruction queue at the frame 0002 to obtain the code of one instruction is repeated.
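The byte counting rule described above, one F plus the S's that follow it up to the next F, may be sketched in Python as follows. The function name instruction_lengths is hypothetical, and the sketch assumes that only the symbols F and S take part in the count, so that N and E are simply skipped. An instruction is reported only when the F of the next instruction has been seen.

```python
def instruction_lengths(queue_statuses):
    """Return the byte length of each completed instruction.

    queue_statuses is the time-ordered sequence of Table 1 symbols.
    """
    lengths = []
    current = 0  # bytes counted for the instruction in progress
    for status in queue_statuses:
        if status == "F":
            if current:
                # The next F closes the instruction in progress.
                lengths.append(current)
            current = 1  # the F itself counts as the 1st byte
        elif status == "S" and current:
            current += 1  # each S is one further byte
    return lengths
```

Taking only the queue status symbols named in the text for the frames 0002, 0006, 0007, 0010, 0011 and 0015, i.e., E, F, S, S, F and F, the function returns the lengths 3 and 1, which agrees with the 3-byte instruction of the frames 0006 to 0010 and the 1-byte instruction that follows it.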
The instruction of the frames 0006 to 0010 is followed by that of frames 0011 to 0014. The queue status represents F at the frame 0011 but does not represent S until it subsequently represents F at a frame 0015. As a result, the instruction having been executed at the frames 0011 to 0014 has 1 byte, and its code is EC (in hexadecimal notation), i.e., that 1 byte of the 2-byte instruction code fetched at the frame 0008 which does not belong to the instruction of the frames 0006 to 0010. If the EC (in hexadecimal notation) is inversely assembled, the mnemonic IN AL, DX is obtained. At the frames 0013 and 0014, the bus status STS represents I, which implies that the data indicated at the frame 0014 has been fetched from the I/O port corresponding to the address outputted at the frame 0013. In other words, the I/O access established in the executing procedure of the instruction IN AL, DX is that at the frames 0013 and 0014.
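The inverse assembly of the two instructions discussed above can be illustrated with a table-driven sketch in Python. It is deliberately limited to the two codes given in the text (BA EA FF and EC); the names OPCODES and inverse_assemble are hypothetical, and it is an assumption of this sketch that the two bytes following BA form a 16-bit value with the lower byte first, which is consistent with the mnemonic FFEA stated above.

```python
# First code byte -> (instruction length in bytes, mnemonic formatter).
# Only the two instructions named in the text are covered.
OPCODES = {
    0xBA: (3, lambda code: "MOV DX, {:02X}{:02X}".format(code[2], code[1])),
    0xEC: (1, lambda code: "IN AL, DX"),
}


def inverse_assemble(code_bytes):
    """Turn a byte sequence taken from the pseudo queue into mnemonics."""
    mnemonics = []
    position = 0
    while position < len(code_bytes):
        # A first byte outside the table raises KeyError in this sketch.
        length, formatter = OPCODES[code_bytes[position]]
        mnemonics.append(formatter(code_bytes[position:position + length]))
        position += length
    return mnemonics
```

Applied to the byte sequence BA, EA, FF, EC, the function returns the two mnemonics MOV DX, FFEA and IN AL, DX in that order.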
Thus, with reference to the frames at which the queue status signal represents the symbol E of Table 1, the code of one instruction is selected from the queue status signal, the bus status signal, the DMUX index, the data on the data bus, and the address on the address bus and is inversely assembled to obtain the instruction mnemonic, which is made to correspond to the memory or I/O access necessary for the instruction execution so that the instructions executed actually by the processor can be restored. If the trace buffer memory has one frame for the reference, the instructions on and after that frame can be restored sequentially one by one on the basis of the queue status signal, the bus status signal and the data on the data bus when in the instruction code pre-fetch. On the other hand, the instructions on and before the reference frame can be restored by tracing reversely of the time lapse from the reference frame with the use of the queue status signal, the bus status signal, the data on the data bus when in the instruction code pre-fetch, and the information of the queue depth QDEPTH.
The results of the instruction restoration (or trace) on the basis of the data in the trace buffer memory, as shown in FIG. 5, are presented in FIG. 8.
Next, the processor having the multi-stage pipeline structure will be described in the following (which processor will be temporarily called the "A processor").
FIG. 9 is a block diagram showing the internal structure of the A processor. Reference numeral 901 designates an A processor chip; numeral 902 an address bus signal line of the A processor; numeral 903 a data bus signal line of the A processor; numeral 904 a data bus in the A processor; and numeral 905 an instruction pre-fetch composed of an instruction queue having the FIFO structure and an instruction queue control. Numeral 906 designates a signal line for transferring therethrough the head of the instruction queue in the instruction pre-fetch 905 to an instruction decoding unit 907, which in turn is composed of an instruction decoder and a sequencer for controlling the operation of the instruction decoding unit 907. Numeral 908 designates a decoded information queue having the FIFO structure for latching that part of the information decoded by the instruction decoding unit 907 which instructs the operation of an instruction execution unit 910. Numeral 909 designates a signal line for transmitting therethrough the information from the decoded information queue 908 to the instruction execution unit 910. This instruction execution unit 910 is constructed of hardware composed of an arithmetic and logical unit and a variety of registers. Numeral 911 designates an address generator for conducting address calculations in response to information or a command from the instruction decoding unit 907. Numeral 912 designates a signal line for transmitting therethrough the address generated by the address generator 911 to an address bus interface of a bus control 913. This bus control 913 controls the bus of the A processor and can be divided functionally and roughly into three parts: the address bus interface part; a data bus interface part; and a bus cycle control part.
The address bus interface part has a function to output the address determined with respect to the system address bus of the system including the A processor from the address bus 902 of the A processor and is constructed of a hardware composed of a group of address registers 919 for latching a variety of addresses. The data bus interface part is constructed of a hardware composed of temporary registers 916 for fetching and temporarily latching data from the system data bus through the data bus 903 of the A processor, and a temporary register 917 for temporarily latching data generated in the A processor before the data is outputted to the system data bus. The bus cycle control part of the bus control 913 controls the bus cycle of the A processor, the bus status output and the input/output of the signal from the external terminal of the A processor. Numeral 915 designates a signal line for bidirectionally transmitting the information instructing the operation of the bus control 913 from the instruction decoding unit 907 and the information indicating the internal status of the bus control 913 to the instruction decoding unit 907, respectively. Numeral 914 is a signal line for both sending the information necessary for the address generation and the information for instructing the operation of the address generator 911 from the instruction decoding unit 907 to the address generator 911 and transmitting the information indicating the internal status of the address generator 911 from the address generator 911 to the instruction decoding unit 907. Numeral 918 designates a signal line for transferring the data generated in the instruction execution unit 910 to the temporary register 917 disposed in the data bus interface part of the bus control 913 and from the temporary registers 916 disposed in the data bus interface part of the bus control 913 to the instruction execution part 910. 
Numeral 920 designates a bus status signal line of the A processor, and numeral 921 designates a signal line for informations such as the memory read signal or the I/O write signal to be outputted to the outside of the A processor. Numeral 922 designates a queue status signal line of the A processor.
The A processor is constructed of the instruction pre-fetch 905, the instruction decoding unit 907, the address generator 911, the bus control 913, and the instruction execution unit 910. The instruction pre-fetch 905 has hardware composed mainly of an instruction queue having the FIFO structure for latching a pre-fetched instruction code. The instruction decoding unit 907 has hardware composed of a decoder for an instruction code of several bytes fetched from the instruction queue, a sequencer for administering the operation of the instruction decoding unit, a program counter, and the decoded information queue 908 having the FIFO structure for latching the decoded information of an instruction. The address generator 911 has hardware for addressing. The bus control 913 is composed mainly of the address bus interface part, the data bus interface part, and the bus cycle control part. The address bus interface part is composed of a group of registers 919 such as a pre-fetching address register, a read operand address register and a write operand address register. The data bus interface part is composed of: the temporary registers 916 for temporarily latching the data obtained by the read access of the A processor to the outside; and the temporary register 917 for temporarily latching data for the write access of the A processor to the outside. The bus cycle control part is composed mainly of a sequencer. The instruction execution unit 910 is composed of an arithmetic and logical unit, a register file, a shifter and a control circuit for controlling the instruction executions.
The hardware of the A processor to be used for describing the present invention is determined as follows. The instruction queue of the instruction pre-fetch 905 has a width of 1 byte and a capacity of 8 bytes at the maximum. The instruction decoding unit 907 fetches an instruction code of 1 byte from the instruction queue in response to one clock. The decoded information queue 908 has a capacity for latching the decoded information of two instructions. Two temporary registers 916 of the bus control 913 are provided, each for one word of data, and are designated at RDR1 and RDR2, respectively. One temporary register 917 is provided for one word of data and is designated at WDR. One pre-fetching address register of the address registers 919 is provided for one address information and is designated at PAR. Two read operand address registers of the address registers 919 are provided, each for one address information, and are designated at RAR1 and RAR2, respectively. The address register RAR1 corresponds to the temporary register RDR1, and the address register RAR2 corresponds to the temporary register RDR2. One write operand address register of the address registers 919 is provided for one address information and is designated at WAR. This address register WAR corresponds to the temporary register WDR.
FIG. 10 is a block diagram showing the major components of the address bus interface part of the bus control 913 of the A processor. The signal line 912 is identical to that of FIG. 9 and transfers the address generated by the address generator 911. Reference numerals 1002, 1005, 1008 and 1011 designate a group of address registers, which are divided according to the kinds of addresses to be stored into a pre-fetching address register (PAR) 1002, a read operand address register 1 (RAR1) 1005, a read operand address register 2 (RAR2) 1008, and a write operand address register (WAR) 1011. Numerals 1003, 1006, 1009 and 1012 designate the respective latch signals of the registers 919 (i.e., 1002, 1005, 1008 and 1011). Numerals 1004, 1007, 1010 and 1013 designate the output signal lines of the registers 1002, 1005, 1008 and 1011, respectively. Numeral 1014 designates a multiplexor which is made receptive of the signals on the signal lines 1004, 1007, 1010 and 1013 for selecting only one of the four inputs without fail in response to a selection signal. The output of this multiplexor 1014 is fed to the address bus 902. The four latch signals 1003, 1006, 1009 and 1012 and the selection signal of the multiplexor 1014 are generated in the instruction decoding unit 907 and are transferred through a signal line 915. The register 1002 has an automatic address updating (or incrementing) function that none of the other three registers 1005, 1008 and 1011 has. Once the address on the signal line 912 is latched in the register 1002 in response to the latch signal 1003, the register 1002 has its content incremented by a constant quantity each time the signal on the output signal line 1004 of the register 1002 is selected as the output of the multiplexor 1014 and the bus control 913 has its instruction code pre-fetching memory read bus cycle completed.
Thanks to said automatic updating function of the register 1002, the instruction decoding unit 907 has its load lightened because the content of the register 1002 need not be updated in each bus cycle for the instruction code pre-fetch after it has once set the register 1002 with the instruction code pre-fetching address, until it subsequently has to set the register 1002 with a new instruction code pre-fetching address in response to a branch instruction.
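The behavior of the four address registers and the multiplexor 1014 described above may be illustrated by the following minimal sketch. It is not the circuit of FIG. 10; the class name, the method names and the increment of 1 per pre-fetch cycle are assumptions introduced only for illustration (the description states merely that the register PAR is incremented "by a constant quantity").

```python
class AddressBusInterface:
    """Toy model of the registers PAR, RAR1, RAR2, WAR and the 4-to-1 multiplexor."""

    REGISTERS = ("PAR", "RAR1", "RAR2", "WAR")

    def __init__(self, prefetch_step=1):
        # ASSUMPTION: the "constant quantity" is taken as 1 address unit.
        self.regs = {name: None for name in self.REGISTERS}
        self.prefetch_step = prefetch_step
        self.selected = "PAR"

    def latch(self, name, address):
        """Latch the address arriving on signal line 912 into one register."""
        if name not in self.regs:
            raise ValueError(f"unknown register {name!r}")
        self.regs[name] = address

    def select(self, name):
        """Apply the selection signal; exactly one register output is chosen."""
        if name not in self.regs:
            raise ValueError(f"unknown register {name!r}")
        self.selected = name

    def address_bus(self):
        """Value the multiplexor feeds to the address bus 902."""
        return self.regs[self.selected]

    def complete_prefetch_cycle(self):
        """Only PAR is updated automatically, after a pre-fetch read cycle."""
        self.regs["PAR"] += self.prefetch_step


abi = AddressBusInterface()
abi.latch("PAR", 0x100)      # set once by command of the decoding unit
abi.select("PAR")
fetched = []
for _ in range(3):           # three consecutive pre-fetch bus cycles
    fetched.append(abi.address_bus())
    abi.complete_prefetch_cycle()

abi.latch("RAR1", 0x2000)    # a read operand address does not disturb PAR
abi.select("RAR1")
```

In this sketch the decoding unit writes PAR only once, and the three pre-fetch addresses follow without further commands, which is the load reduction referred to above.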
Next, the operations of the A processor will be described in summary.
The instruction code latched in the memory is fetched into the A processor by the instruction code pre-fetching function of the A processor. This pre-fetching action is shared among the instruction decoding unit 907, the address generator 911, the bus control 913 and the instruction pre-fetch 905. In this pre-fetching action, once the pre-fetching address is written in the pre-fetching address register PAR of the bus control 913 from the address generator 911 in response to the command from the instruction decoding unit 907, the updating of the address register PAR is conducted by the bus control 913 until the content of the register PAR is subsequently rewritten in response to the command of the instruction decoding unit 907. The bus control 913 performs the pre-fetching action if the system address bus and system data bus coupling the A processor and the system are free. The bus control 913 outputs not only the content of the address register PAR to the system address bus but also the code of the instruction code fetch to the bus status signal BST2-0, and sets the MMIO signal to indicate the memory access. The bus control 913 fetches therein the content (i.e., the instruction code) of the address in the memory corresponding to the pre-fetching address outputted to the system address bus and transfers it to the instruction queue of the instruction pre-fetch 905 to update the content of the register PAR. Then, the instruction pre-fetch 905 latches the instruction codes transferred from the bus control 913 sequentially in the tail of the instruction queue. This instruction pre-fetch 905 has a major function to control the action of the instruction queue and the output of a queue status QST3-0. The major function of the bus control 913 is the memory access, the I/O access and the control of the input/output of most of the control signals. The major function of the address generator 911 is address calculations for various addressings.
The major function of the instruction decoding unit 907 is to decode the instruction code and to control the actions of the individual parts of the A processor. The major function of the instruction execution unit 910 is to actually process the data, for example by arithmetic and logical calculations.
The statuses of the A processor will be described in the following. The A processor has two kinds of queues: the instruction queue of the instruction pre-fetch 905 and the decoded information queue of the instruction decoding unit 907. In order to exhibit the statuses of these two kinds of queues, the A processor outputs the queue status signal QST3-0 of 4 bits to the outside. The QST3 signal takes the high level for one clock period, when the decoded information of one instruction at the head of the decoded information queue is fetched to the instruction execution unit 910, and indicates that a new instruction begins to be executed by the instruction execution unit 910. A signal QST2-0 exhibits the statuses of the two kinds of queues of the A processor, as encoded in Table 4. The queue status signals are generated in the instruction decoding unit 907. The bus statuses are encoded by a bus status signal BST2-0, the signal MMIO indicating the memory access or the I/O access, and a signal RDWR indicating the read access or the write access, as listed in Table 5. The MMIO signal represents the memory access when at the high level, the I/O access when at the low level, and no access when in the high impedance state. The RDWR signal represents the read access when at the high level, the write access when at the low level, and no access when in the high impedance state. The bus status signal BST2-0, the MMIO signal and RDWR signal are generated in the bus control 913.
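The three-state meaning of the MMIO and RDWR signals described above may be sketched as follows. The function name and the single-letter level encoding are assumptions made for illustration, and the treatment of a mixed case (only one of the two signals in the high impedance state) as "no access" is likewise an assumption, since the description does not mention that case.

```python
HIGH, LOW, HI_Z = "H", "L", "Z"   # assumed encoding of the three signal states


def decode_access(mmio, rdwr):
    """Return a description of the bus access, or None when there is no access."""
    for level in (mmio, rdwr):
        if level not in (HIGH, LOW, HI_Z):
            raise ValueError(f"invalid signal level {level!r}")
    # ASSUMPTION: a high impedance state on either signal means no access.
    if mmio == HI_Z or rdwr == HI_Z:
        return None
    target = "memory" if mmio == HIGH else "I/O"       # MMIO: high = memory
    direction = "read" if rdwr == HIGH else "write"    # RDWR: high = read
    return f"{target} {direction}"
```

A trace tool sampling the chip terminals could apply such a function to each latched sample in order to separate memory reads, memory writes and I/O accesses before the BST2-0 code is consulted.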
The instruction decoding unit 907 fetches the instruction codes of one instruction 1 byte at a time, sequentially from the instruction code at the head of the instruction queue, and latches them in the instruction code register. The instruction decoder receives the output of the instruction code register and decodes the instruction code. As a result of the decoding at the instruction decoding unit 907, there are generated: an information for instructing the action of the instruction execution unit 910, an information necessary for the action of the instruction execution unit 910, an information for instructing the action of the address generator 911, an information necessary for the action of the address generator 911, an information for instructing the action of the bus control 913, and an information necessary for the action of the bus control 913. The information relating to the instruction execution unit 910 is latched in the decoded information queue 908. The informations relating to the address generator 911 and the bus control 913 are not especially latched in the instruction decoding unit 907. As a result, when the content of the instruction code register is updated, the informations relating to the address generator 911 and the bus control 913 will change. The informations sent from the instruction decoding unit 907 and relating to the address generator 911 and the bus control 913 are latched in the address generator 911 and the bus control 913, respectively. In case, however, the address generator 911 or the bus control 913 is unable, because of its internal status, to receive the information sent from the instruction decoding unit 907, an information (i.e., a busy signal) indicating that the information given at present cannot be received is sent back to the instruction decoding unit 907.
This instruction decoding unit 907 finishes its decoding action of the instruction code of one instruction by latching the information relating to the action of the instruction execution unit 910 in the decoded information queue 908. If the decoded information queue 908 is occupied or if the busy signal is received from the address generator 911 or the bus control 913, the instruction decoding unit 907 interrupts its decoding action. Even in this case, the instruction pre-fetch 905 is controlled.
The instruction decoding unit 907 not only monitors the clogging of the instruction queue of the instruction pre-fetch 905 at all times but also controls the pre-fetch demand of the instruction code. For example, in case the instruction queue is to be emptied, namely, in case an unconditional branch instruction is found as a result of decoding the instruction code of one instruction of several bytes fetched from the instruction queue at the instruction decoding unit 907, this unit 907 commands the instruction pre-fetch 905 to instantly purge all the content of the instruction queue; commands the address generator 911 to generate and transfer an address to be branched to the bus control 913; and commands the bus control 913 to rewrite the content of the pre-fetching address register PAR by a new pre-fetch address information transferred via the address generator 911 and to start the pre-fetching action from a new branched address. The pre-fetching action from the branched address immediately after the decoding of the branch instruction is given the highest priority of all the events in the A processor that need the use of the external bus. More specifically, the pre-fetching actions from the branched address immediately after the branch instruction decoding are sequentially continued until an instruction code of a predetermined number of bytes is latched in the instruction queue, and the external bus for the memory access necessary for the instruction execution cannot be used during the period of the pre-fetching action. In accordance with the commands from the instruction decoding unit 907 thus far described, the pre-fetching action of the instruction code by the A processor will proceed.
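The purge of the instruction queue and the rewriting of the register PAR at an unconditional branch may be illustrated by the following sketch. The class and method names and the numeric addresses are hypothetical; only the queue capacity of 8 bytes, the purge, and the restart of the pre-fetch from the branched address are taken from the description.

```python
from collections import deque


class PrefetchUnit:
    """Toy model of the instruction queue together with the register PAR."""

    def __init__(self, memory, start, capacity=8):
        self.memory = memory        # dict: address -> 1-byte instruction code
        self.par = start            # pre-fetching address register PAR
        self.queue = deque()        # FIFO instruction queue
        self.capacity = capacity    # 8 bytes at the maximum

    def prefetch(self):
        """One pre-fetch bus cycle: append one code at the tail, update PAR."""
        if len(self.queue) < self.capacity and self.par in self.memory:
            self.queue.append(self.memory[self.par])
            self.par += 1           # ASSUMPTION: PAR advances by 1 per byte
            return True
        return False

    def branch(self, target):
        """Purge the whole queue and restart the pre-fetch from the target."""
        self.queue.clear()
        self.par = target


# Hypothetical numeric addresses standing in for ADR40.., ADR50.., ADR60..
memory = {0x40: "OPCD40", 0x41: "OPCD41", 0x42: "OPCD50", 0x43: "OPCD51",
          0x60: "OPCD60", 0x61: "OPCD61"}
unit = PrefetchUnit(memory, start=0x40)
for _ in range(4):
    unit.prefetch()                 # the JUMP and the following MOV are queued
jump_codes = [unit.queue.popleft(), unit.queue.popleft()]  # decoder takes JUMP
unit.branch(0x60)                   # OPCD50 and OPCD51 are discarded unexecuted
for _ in range(2):
    unit.prefetch()
```

The two discarded bytes in this sketch correspond to instruction codes that appear on the data bus terminal although the corresponding instruction is never executed.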
The processing sequences of the instruction decoding unit 907 and the instruction execution unit 910 in response to a series of instructions are completely identical. Let the case be considered as a simple example, in which instructions (1) and (2) other than the branch ones are stored in the recited order in areas of a memory having continuing addresses. By the pre-fetching function of the A processor, the instruction codes each having several bytes are fetched in the order of the instructions (1) and (2) in the instruction queue of the instruction pre-fetch 905, and the decoding of the instruction (2) is deferred until the instruction decoding unit 907 finishes the decoding of the instruction (1) to store the necessary decoding information in the decoded information queue 908. As a result, the instruction execution unit 910 also fetches the decoded information of the instruction (2) at the head of the decoded information queue 908 to execute the instruction (2) after it has finished the execution of the instruction (1) by using the decoded information of the instruction (1) fetched from the decoded information queue 908.
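That the FIFO structure of the decoded information queue 908 keeps the decoding order and the execution order identical may be checked with the following clock-by-clock sketch. The function name, the one-instruction-per-clock decode rate and the execution latencies are assumptions for illustration; only the queue capacity of two instructions is taken from the description.

```python
from collections import deque


def run_pipeline(program, exec_latency, queue_capacity=2):
    """Return (decode_order, execute_order) for a branch-free program."""
    pending = deque(program)        # instructions waiting in the instruction queue
    decoded_queue = deque()         # decoded information queue 908
    decode_order, execute_order = [], []
    current, remaining = None, 0
    while pending or decoded_queue or current is not None:
        # Execution unit: count down the instruction in progress.
        if current is not None:
            remaining -= 1
            if remaining <= 0:
                execute_order.append(current)
                current = None
        # Execution unit: take the next decoded information from the head.
        if current is None and decoded_queue:
            current = decoded_queue.popleft()
            remaining = exec_latency[current]
        # Decoding unit: decode only when the decoded information queue has room.
        if pending and len(decoded_queue) < queue_capacity:
            instruction = pending.popleft()
            decode_order.append(instruction)
            decoded_queue.append(instruction)
    return decode_order, execute_order


# ASSUMPTION: latencies in clocks are invented for the example.
decode_order, execute_order = run_pipeline(
    ["(1)", "(2)", "(3)"], {"(1)": 3, "(2)": 1, "(3)": 2})
```

Whatever latencies are chosen, the two returned lists are equal, because an instruction can reach the execution unit only through the head of the queue.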
In case the instruction decoded in the instruction decoding unit 907 is one requiring a read operand in the memory or the I/O, the read operand is read in precedence. The read operand is read in advance prior to the time at which the instruction requiring the read operand is executed in the instruction execution unit 910, but this preceding read cannot always be conducted in advance, depending upon the statuses and conditions of the inside and outside of the A processor. The advanced read of the read operand will be described in the following. When the instruction code of the instruction requiring the read operand is latched in the instruction code register of the instruction decoding unit 907 and is inputted to and decoded by the decoder, the advanced reading action of the read operand is designated as the information relating to the actions of the address generator 911 and the bus control 913 by the instruction decoding unit 907. In the actions for the advanced read of the read operand assigned to the address generator 911 by the instruction decoding unit 907, the information necessary for the address generation of the read operand is issued from the instruction decoding unit 907 to the address generator 911 to generate the advanced reading address of the read operand, and this advanced reading address generated is transferred to the bus control 913. In the actions for the advanced read of the read operand assigned to the bus control 913 by the instruction decoding unit 907, the advanced reading address of the read operand transferred via the address generator 911 to the bus control 913 is latched in one of the two read operand address registers RAR1 and RAR2 in the address bus interface part of the bus control 913 by designating that particular one, and the read bus cycle is started to read and latch the read operand in the read operand data register corresponding to that read operand address register.
In the actions for the advanced read of the read operand assigned to the bus control 913 by the instruction decoding unit 907, no timing is designated. The bus control 913 starts, when it receives the information for the aforementioned advanced read of the read operand from the instruction decoding unit 907, the read bus cycle for the advanced read of the read operand in accordance with the statuses and conditions of the external system data bus and system address bus of the A processor to latch the read operand in the read temporary register designated. The starting timing of the read bus cycle for the advanced read of the read operand is controlled not by the instruction execution unit 910 and the instruction decoding unit 907 but by the bus control 913 independently of the remaining units.
The advanced read of the read operand is detected at the decoding stage of the instruction decoding unit 907 of the instruction code requiring the read operand. The instruction decoding unit 907 assigns the aforementioned actions to the address generator 911 and the bus control 913 and allocates the read operand register RDR1 or RDR2 to be read-accessed by the instruction execution unit 910, as an information for designating the action of the instruction execution unit 910 in the decoded information queue 908 after the bus control 913 has read the read operand. For example, in case the instruction decoded in the instruction decoding unit 907 is one requiring two read operands in the memory, the bus control 913 is instructed to latch the address of the first operand in the register RAR2 and accordingly the data of the first operand in the register RDR2 and the address of the second operand in the register RAR1 and accordingly the data of the second operand in the register RDR1. In this case, the information is so latched in the decoded information queue 908 to finish the decoding of said instruction that the instruction execution unit 910 may execute the instruction by using the content of the register RDR2 for the first operand data and the content of the register RDR1 for the second operand data. In case both the two read operands in the memory have been transferred in advance from the memory to the registers RDR1 and RDR2 when the decoded information of said instruction is to be fetched from the head of the decoded information queue 908 and executed by the instruction execution unit 910, this unit 910 can obtain the data of the two operands existing intrinsically in the memory merely by making accesses to the registers in the A processor. 
Although this memory access requires several clocks, the instruction execution unit 910 appears to have accessed one read operand in one clock, in case the data of the read operands in the memory could be transferred in advance to the registers (RDR1 and RDR2 in this case) in the bus control 913 in the A processor before the instruction execution unit 910 requires them, so that the instruction execution efficiency and, better still, the throughput of the A processor can be improved.
In case, however, the external system data bus and system address bus of the A processor are occupied for a long time by another bus master, the start of the read bus cycle for the advanced read of the read operand is deferred at the earliest until the occupation of the system bus is returned from the bus master effective at present to the A processor even if the bus control 913 receives the information concerning the advanced read of the read operand furnished from the instruction decoding unit 907 during the decoding period of the instruction code of the instruction requiring the read operand at the instruction decoding unit 907. Even if the advanced read of the read operand is thus deferred, the instruction execution unit 910 executes the instruction requiring actually the read operand, no matter whether the read operand might have been read in advance or not, to read out the content of the read operand register of the bus control 913 designated by the instruction decoding unit 907. When the read operand is not read in advance in the read operand register designated by the instruction decoding unit 907, the instruction execution unit 910 interrupts its action and waits until the bus control 913 starts the read bus cycle to read the data from the read operand and latch it in the read operand register designated by the instruction decoding unit 907 so that the instruction execution unit 910 is allowed to access to said read operand register. When the action of the instruction execution unit 910 is interrupted, the decoded information queue 908 is also halted so that the decoding action at the instruction decoding unit 907 is interrupted, thus halting all the major operations of the A processor. 
When the bus control 913 starts the read bus cycle to latch the read operand in the read temporary register designated by the instruction decoding unit 907, the action of the instruction execution unit 910 is reopened together with the decoding action of the instruction decoding unit 907, thus reopening the operations of the A processor. In this example, simultaneously as the instruction requiring the read operand in the memory or I/O is decoded in the instruction decoding unit 907, the advanced reading action of the read operand is assigned from the instruction decoding unit 907 to the address generator 911 and the bus control 913 to latch the decoded information of said instruction in the decoded information queue 908. After this, the advanced read of the read operand is not performed until the instruction processing at the instruction execution unit 910 proceeds to execute said instruction from the head of the decoded information queue 908. In this case, the execution efficiency obtainable is similar to that of the aforementioned processor.
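The effect of the advanced read on the waiting time of the instruction execution unit 910 may be estimated with the following simplified sketch. The function name, the clock numbers and the read latency of three clocks are assumptions; the description states only that a memory access requires several clocks and that the read bus cycle cannot start while the system bus is occupied.

```python
def operand_wait(decode_clock, execute_clock, bus_free_from, read_latency=3):
    """Return (stall_clocks, clock_at_which_the_operand_is_usable).

    decode_clock   clock at which the advanced read is assigned to the bus control
    execute_clock  clock at which the execution unit first needs the operand
    bus_free_from  first clock at which the system bus is available
    read_latency   ASSUMED duration of the read bus cycle in clocks
    """
    start = max(decode_clock, bus_free_from)   # the bus control decides the timing
    ready = start + read_latency               # operand latched in RDR1 or RDR2
    stall = max(0, ready - execute_clock)      # execution unit waits if not ready
    return stall, max(ready, execute_clock)


# Bus free at once: the operand is already latched when execution begins.
no_stall = operand_wait(decode_clock=0, execute_clock=10, bus_free_from=0)
# Bus held by another bus master until clock 12: the execution unit must wait.
stalled = operand_wait(decode_clock=0, execute_clock=10, bus_free_from=12)
```

In the first case the execution unit sees the operand as a register access; in the second the whole pipeline waits for the deferred read bus cycle, as described above.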
The order of a series of instructions to be decoded in the instruction decoding unit 907 and the order of read operands to be read in advance in the bus control 913 are completely identical. For example, let the case be considered in which two instructions (3) and (4) both requiring read operands in a memory are latched in the recited order in an instruction queue. The instruction decoding unit 907 first fetches the instruction code (3) of several bytes of the instruction (3) from the instruction queue to decode it, and assigns a memory read access for the advanced read of the read operand (3) of the instruction (3) to the bus control 913 to latch the decoded information concerning the instruction (3) in the decoded information queue 908. Next, the instruction decoding unit 907 fetches the instruction code (4) of several bytes of the instruction (4) to decode it, and assigns the memory read access for the advanced read of the read operand (4) of the instruction (4) to the bus control 913. As a result, the bus control 913 receives the memory read access designation for the advanced read of the read operand (3) and then the memory read access designation for the advanced read of the read operand (4) of the instruction (4) from the instruction decoding unit 907 to read the read operands in advance in the order of their receptions so that the order of the flow of the series of instructions and the order of the advanced read of the read operands at the bus control 913 are identical.
In case the instruction decoded in the instruction decoding unit 907 requires a write operand in the memory or I/O, the write operand is post-written. This post-write of the write operand will be described only briefly because it is analogous to the advanced read of the read operand. When the instruction code of an instruction requiring the write operand is latched in the instruction code register of the instruction decoding unit 907 so that it is inputted to and decoded by the decoder, the post-write of the write operand is designated as the information relating to the actions of the address generator 911 and the bus control 913 by the instruction decoding unit 907. In the actions of post-writing the write operand assigned to the address generator 911 by the instruction decoding unit 907, the address generator 911 is furnished from the instruction decoding unit 907 with the information necessary for generating the address of the write operand, and this address thus generated for post-writing the write operand is transferred to the bus control 913. In the actions for post-writing the write operand assigned to the bus control 913 by the instruction decoding unit 907, the write operand post-writing address transferred via the address generator 911 to the bus control 913 is latched in the write operand address register WAR in the address bus interface part of the bus control 913, and the instruction requiring the write operand is executed in the instruction execution unit 910. After the instruction execution unit 910 has latched the execution result of said instruction in the write temporary register WDR 917, the write bus cycle is started to write the content of the register WDR out to the address latched in the address register WAR. In the actions for post-writing the write operand assigned to the bus control 913 by the instruction decoding unit 907, the timing for starting the write bus cycle is not designated.
In the bus control 913, the information for post-writing the write operand is obtained from the instruction decoding unit 907, and the instruction execution result from the instruction execution unit 910 is latched in the temporary register WDR. After this, the write operand is written out when the memory access or the I/O access is possible, depending upon the statuses and conditions of the external system data bus and system address bus of the A processor. The timing for starting the write bus cycle for post-writing the write operand is controlled not by the instruction execution unit 910 but by the bus control 913 independently of the remaining units.
Moreover, just as the order of reading the read operands in advance is identical to the flow of the series of instructions, the order of post-writing the write operands is identical to the flow of the series of instructions.
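The post-write mechanism described above may be sketched as follows. The class and method names are hypothetical, and the memory is modeled as a plain dictionary; the sketch only illustrates that the write bus cycle starts after both the address (WAR) and the result (WDR) are latched and the bus is available.

```python
class PostWriteBuffer:
    """Toy model of the register pair WAR / WDR in the bus control."""

    def __init__(self):
        self.war = None   # write operand address register
        self.wdr = None   # write temporary (data) register

    def latch_address(self, address):
        """Address arrives at decode time via the address generator."""
        self.war = address

    def latch_result(self, data):
        """Execution result arrives later from the instruction execution unit."""
        self.wdr = data

    def bus_cycle(self, memory, bus_free):
        """Try to start the write bus cycle; return True if the write was done."""
        if bus_free and self.war is not None and self.wdr is not None:
            memory[self.war] = self.wdr
            self.war = self.wdr = None
            return True
        return False


memory = {}
buf = PostWriteBuffer()
buf.latch_address(0x70)                      # decode stage of the instruction
early = buf.bus_cycle(memory, bus_free=True)     # no result yet: nothing written
buf.latch_result(0xAB)                       # instruction execution finished
blocked = buf.bus_cycle(memory, bus_free=False)  # bus occupied: write deferred
done = buf.bus_cycle(memory, bus_free=True)      # write operand finally written
```

The execution unit in this sketch is released as soon as `latch_result` returns, while the clock at which the write appears on the external bus depends only on the bus control.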
In the A processor adopting the high-grade pipeline structure, as has been described hereinbefore, the advanced read of the read operands and the post-write of the write operands are carried out by the read temporary registers, write temporary register and control hardware of the bus control 913, and the instruction execution unit 910 and the bus control 913 having different instruction processing rates smoothly process together the instructions and improve the instruction processing efficiency. As has been described hereinbefore, moreover, the flow of the series instructions and the processing order of the instructions at the instruction execution unit 910 are identical, and the order of the advanced read of the read operands and the order of the post-write of the write operands are also identical to the flow of the series instructions. However, the bus control 913 for the memory access and I/O access for the advanced read of the read operands and for the post-write of the write operands is restricted by the internal and external statuses of the A processor so that the timing for the memory access or I/O access of the bus control 913 is not controlled by, but independent from, the instruction decoding unit 907 or the instruction execution unit 910 in the A processor. As a result, it is apparent that the read bus cycle for the advanced read of the read operands is conducted during a time period after the instruction code of the instruction requiring the read operand has been decoded in the instruction decoding unit 907 and before the execution of said instruction is finished, and that the write bus cycle for the post-write of the write operands is conducted after the execution of the instruction requiring the write operand has been finished. It is, however, impossible to predict when the bus cycle for reading a certain read operand in advance or writing a certain write operand later is started.
For the following description, two instructions of the assembly language of the A processor will be defined in connection with their mnemonics and operations.
Letters MOV d,s designate a transfer instruction, in which the letter d designates a destination operand whereas letter s designates a source operand so that the content of the operand designated at s is transferred to the operand designated at d. The destination operand d and the source operand s may be either register resources in the A processor or in the memory or I/O port. On the other hand, the destination operand d and the source operand s can be addressed either directly or indirectly by using a register. As a result, if an address in the memory is assigned as the source operand s, the transfer instruction MOV d,s is one requiring the read operand in the memory. If, on the other hand, an address in the memory is assigned as the destination operand d, the instruction MOV d,s is one requiring the write operand in memory.
Letters JUMP adr designate an unconditional branch instruction, in which the letters adr designate the branched address of said branch instruction. If the instruction code of the branch instruction JUMP adr is decoded by the instruction decoding unit 907 in the A processor, all the contents of the instruction queue of the instruction pre-fetch 905 are deleted in response to the command of the instruction decoding unit 907, and the value corresponding to the address adr is latched in the pre-fetching address register PAR of the bus control 913 so that this bus control 913 pre-fetches a new instruction code from the memory address indicated by the content of the register PAR.
FIG. 11 presents a portion of the program which is written by using the transfer instruction MOV d,s and the unconditional branch instruction JUMP adr. Letters L1, L2, . . . , and L10 designate line numbers, and operands M10, M11, M21, . . . , M100, and M101 are all in the memory. On the other hand, operands R2, R3, R6 and R7 are assigned to the registers in the instruction execution unit 910. The operands M10, M11, . . . , M100, and M101 may be addressed directly or indirectly.
Table 3 lists correspondences between the instruction mnemonics and codes used in FIG. 11.
TABLE 3
______________________________________
Instruction Mnemonics    Instruction Codes
______________________________________
MOV M10, M11             OPCD10 OPCD11
MOV R2, M21              OPCD20 OPCD21
MOV R3, M31              OPCD30 OPCD31
JUMP ADR60               OPCD40 OPCD41
MOV M50, M51             OPCD50 OPCD51
MOV R6, M61              OPCD60 OPCD61
MOV M70, R7              OPCD70 OPCD71
______________________________________
In Table 3, for example, the instruction of the mnemonics MOV M10, M11 has the instruction codes OPCD10 and OPCD11. Here, each of these instruction codes OPCD10 and OPCD11 has a binary pattern having a length of 1 byte. The transfer instructions MOV d,s and the unconditional branch instruction JUMP adr are assumed to have instruction codes of 2 bytes.
FIG. 12 is a diagram showing the location in the memory when the object codes obtained by assembling the program of FIG. 11 are stored in the memory. Letters ADR10, ADR11, . . . , ADR100, and ADR101 appearing in FIG. 12 designate the addresses of the memory. Each address of this memory holds 1 byte. For example, it is found that the instruction code OPCD10 is latched in an area having the address ADR10 of the memory. The addresses ADR10 to ADR51 and the addresses ADR60 and later are continuous, but the addresses ADR51 and ADR60 are not always continuous.
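The correspondence of Table 3 and the byte-wise memory location may be illustrated by the following sketch. It is not the assembler of the A processor. The function names are hypothetical, and the rule that a code OPCDnn is located at the address ADRnn is an assumption generalized from the single example given (OPCD10 at ADR10); FIG. 12 itself remains authoritative.

```python
# Mnemonic -> the two 1-byte instruction codes, as listed in Table 3.
TABLE_3 = {
    "MOV M10, M11": ("OPCD10", "OPCD11"),
    "MOV R2, M21": ("OPCD20", "OPCD21"),
    "MOV R3, M31": ("OPCD30", "OPCD31"),
    "JUMP ADR60": ("OPCD40", "OPCD41"),
    "MOV M50, M51": ("OPCD50", "OPCD51"),
    "MOV R6, M61": ("OPCD60", "OPCD61"),
    "MOV M70, R7": ("OPCD70", "OPCD71"),
}


def assemble(mnemonics):
    """Translate a list of mnemonics into a flat list of 1-byte codes."""
    codes = []
    for mnemonic in mnemonics:
        codes.extend(TABLE_3[mnemonic])
    return codes


def layout(codes):
    """Pair each code with a symbolic address.

    ASSUMPTION: code OPCDnn is stored at address ADRnn, as in the one
    example stated in the text (OPCD10 at ADR10).
    """
    return [(code.replace("OPCD", "ADR"), code) for code in codes]


image = layout(assemble(list(TABLE_3)))
```

With this assumption the branched address ADR60 of the instruction JUMP ADR60 is the address holding the first code OPCD60 of the instruction MOV R6, M61.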
FIG. 13 is a time chart summarizing the operation timings of the individual units inside of the A processor. The details of the timing relationships among the gate delays, addresses and data are omitted from FIG. 13. There are presented in FIG. 13: the time instants t1 to t17 at which actions are conducted at individual parts inside of the A processor; the order of decoding the instructions in the instruction decoding unit 907 and the outputs of the queue status 922, as indicated by the abbreviated symbols of Table 4; the changes in the contents of the two read operand address registers RAR1 and RAR2 and the write operand address register WAR in the address bus interface part of the bus control 913; the informations on the system address bus and system data bus outside of the A processor; and the changes in the contents of the two read operand data registers RDR1 and RDR2 and the write operand data register WDR of the data bus interface part inside of the A processor. As to the queue statuses, however, only the symbols F and E1 of Table 4 are presented, whereas the other symbols are omitted.
TABLE 4
______________________________________________________________________
QST      Status of Instruction Queue
2 1 0    Status of Decoded Information Queue                     Symbol
______________________________________________________________________
0 0 0    No Fluctuation of Instruction Queue and                 N
         Decoded Information Queue
0 0 1    Head of Instruction Queue Located at 1st Byte of        F
         Instruction Code
         No Fluctuation of Decoded Information
0 1 0    Head of Instruction Queue Located at 2nd and after      S
         Instruction Code
         No Fluctuation of Decoded Information
0 1 1    Content of Instruction Queue Cleared                    E1
         No Fluctuation of Decoded Information
1 0 0    Content of Instruction Queue Cleared                    E2
         Content of Decoded Information Queue Cleared
______________________________________________________________________
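The encoding of Table 4 can be illustrated by a small lookup. The following Python sketch is only an illustration, assuming that the three queue status bits are read as the tuple (QST2, QST1, QST0); the names QUEUE_STATUS and decode_queue_status are hypothetical and do not appear in the specification.

```python
# Hypothetical lookup for the queue status codes of Table 4.
# Keys are (QST2, QST1, QST0); values are the abbreviated symbols.
QUEUE_STATUS = {
    (0, 0, 0): "N",   # no fluctuation of either queue
    (0, 0, 1): "F",   # first byte of an instruction code taken from the queue
    (0, 1, 0): "S",   # second or later byte taken from the queue
    (0, 1, 1): "E1",  # instruction queue cleared, decoded information kept
    (1, 0, 0): "E2",  # instruction queue and decoded information queue cleared
}


def decode_queue_status(qst2, qst1, qst0):
    """Return the Table 4 symbol, or None for a code not listed in Table 4."""
    return QUEUE_STATUS.get((qst2, qst1, qst0))
```

A code that is not listed in Table 4 yields None in this sketch, since the table does not state how such codes are to be interpreted.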
In the columns of the instruction decoding unit 907 and the instruction execution unit 910 of FIG. 13, I1, I2, I3, JUMP, I6 and I7 are symbols for discriminating the instructions presented in FIG. 11. For example, the symbol I1 represents the transfer instruction of the mnemonic MOV M10, M11.
In the instruction decoding unit 907, at the time t1, (which will be likewise abbreviated to t1, t2, . . . , and so on), the instruction code of 2 bytes of the instruction I1 (which will be likewise abbreviated to I1, I2, . . . , and so on) is fetched sequentially from the head of the instruction queue and is decoded. Since the instruction I1 requires the read operand M11 in the memory, the instruction decoding unit 907 commands the address generator 911 to generate and transfer the address RA1 of the read operand M11 to the bus control 913, and commands the bus control 913 to latch the address RA1 transferred from the address generator 911 in the read operand address register RAR1 of the address bus interface part and to read in advance the data stored in the area of the address RA1 in the memory and to latch it in the read operand data register RDR1 of the data bus interface part. Since, on the other hand, the instruction I1 requires the write operand M10 in the memory, the instruction decoding unit 907 commands the address generator 911 to generate and transfer an address WA1 of the write operand M10 to the bus control 913, and commands the bus control 913 to latch the address transferred from the address generator 911 in the write operand address register WAR of the address bus interface part and to latch the data, which is transferred to and latched in the write operand data register WDR of the data bus interface part of the bus control 913 during the execution of the instruction I1 by the instruction execution unit 910, in the area of the address WA1 in the memory after the execution of the instruction I1 by the instruction execution unit 910. The decoding of the instruction I1 is finished by latching the information necessary for the processing of the instruction I1 at the instruction execution unit 910 in the tail of the decoded information queue 908. 
From and after the time t2, an instruction code of 2 bytes of the instruction I2 is fetched from the head of the instruction queue and is decoded. Since the instruction I2 requires the read operand M21 in the memory, like the instruction I1, the address RA2 of the read operand M21 is generated in the address generator 911 and latched in the address register RAR2 of the bus control 913, and it is commanded that the data be read in advance from the area of the address RA2 in the memory. The decoding of the instruction I2 terminates at the time t6. On and after the time t6, the instruction I3 is decoded. More specifically, the instruction code of 2 bytes of the instruction I3 is fetched from the head of the instruction queue to start its decoding. Since the instruction I3 also requires the read operand M31 in the memory, like the instructions I1 and I2, the address RA3 of the read operand M31 is generated in the address generator 911 and is latched in that one of the two read operand address registers in the bus control 913 whose content has become unnecessary because the advanced read of its read operand has already been conducted. In the case of this example, the address RA3 is latched in the address register RAR1. As will be described hereinafter, at the time t3, the bus control 913 uses the read operand address RA1 of the instruction I1, which has been latched in the register RAR1 at t1, to start the memory read bus cycle for the advanced read of the read operand of the instruction I1 and to latch the data RD1 of the address RA1 of the memory in the register RDR1. At t6, therefore, the content RA1 of the register RAR1 has already become unnecessary. Likewise, at t5, the content RA2 of the register RAR2 has been used to start the memory read bus cycle for the advanced read of the read operand of the instruction I2 so that the data RD2 of the address RA2 in the memory has been latched in the register RDR2.
As a result, at t6, the content RA2 of the register RAR2 has already become unnecessary, and the register RAR2 itself has become substantially empty. At t9, the decoding of the instruction I3 is finished. On and after t9, the decoding of the instruction JUMP is started. More specifically, the instruction code of 2 bytes of the instruction JUMP is fetched sequentially from the head of the instruction queue and is decoded. This instruction JUMP is an unconditional branch instruction so that it makes all the contents of the instruction queue ineffective when its instruction code is decoded at the instruction decoding unit 907. Since the instruction code of the instruction JUMP contains a branched address, the instruction decoding unit 907 decodes said branched address and commands the address generator 911 to generate said branched address from the decoded information and to transfer it to the bus control 913. The instruction decoding unit 907 further commands the bus control 913 to latch said branched address transferred from the address generator 911 in the pre-fetch address register PAR and then to conduct the pre-fetch of the instruction code from the address indicated by the content of the register PAR in precedence to the advanced read of the read operand and the post-write of the write operand. The instruction JUMP requires no execution at the instruction execution unit 910, but the information of "No Execution" is latched as the decoded information of the instruction JUMP in the tail of the decoded information queue 908, thus finishing the decoding of the unconditional branch instruction JUMP at t12. From t12 to t13, the instruction decoding unit 907 interrupts its action and is held in a waiting status until the instruction code of one or more instructions is stored, by the pre-fetch of new instruction codes from the branch of the instruction JUMP, in the instruction queue emptied on the decoding of the instruction JUMP.
On and after t13, the instruction I6 is decoded. The instruction code of the instruction I6 of 2 bytes is fetched from the head of the instruction queue and decoded, and the address RA6 for the read operand M61 in the memory necessary for the instruction I6 is generated in the address generator 911 and latched in the register RAR2 of the bus control 913. Then, the advanced read of the read operand in the memory for the instruction I6 is commanded by the use of the content RA6 of the register RAR2. At t15, the decoding of the instruction I6 is finished by latching the information necessary for the execution of the instruction I6 by the instruction execution unit 910 in the tail of the decoded information queue 908. On and after t15, the decoding of the instruction I7 is started. This decoding of the instruction I7 is conducted by fetching the instruction code of the instruction I7 of 2 bytes from the head of the instruction queue. Since this instruction I7 requires the write operand M70 in the memory, the instruction decoding unit 907 commands the address generator 911 to generate and transfer an address WA7 for the write operand M70 to the bus control 913. This bus control 913 is also commanded to latch the address WA7 transferred from the address generator 911 in the write operand address register WAR and, after the processing of the instruction I7 at the instruction execution unit 910, to write the data transferred to and latched in the register WDR in the memory by the use of the address WA7 held in the address register WAR. Thus, in the instruction decoding unit 907, the decodings of the instructions I1, I2, I3, JUMP, I6 and I7 are started from t1, t2, t6, t9, t13 and t15, respectively.
The queue status 922 outputs a 3-bit code represented by the symbol F of Table 4 to the outside of the A processor, when the first 1 byte of the instruction code of one instruction is fetched from the instruction queue to the instruction decoding unit 907, so that it represents the symbol F immediately after t1, t2, t6, t9, t13 and t15. If, on the other hand, the decoding of the unconditional branch instruction JUMP is started from t9 at the instruction decoding unit 907, all the contents of the instruction queue are made ineffective at t10 so that the 3-bit code corresponding to the symbol E1 of Table 4 is outputted as the queue status at t10 to the outside of the A processor.
At t4, the instruction execution unit 910 fetches the decoded information of the instruction I1, which has reached the head of the decoded information queue 908 at t4, to execute the instruction I1. This instruction I1 is one for transferring the read operand M11 in the memory to the write operand M10 in the memory. The A processor has the advanced reading function of the read operands. In the example of FIG. 13, the memory read bus cycle for reading the read operand M11 of the A processor is started at t3 prior to the time t4 at which the execution of the instruction I1 is started at the instruction execution unit 910, and the data RD1 corresponding to the read operand M11 obtained during the period of said memory read bus cycle is latched in the register RDR1. As a result, the read operand M11 of the instruction I1 already exists in the A processor when the execution of the instruction I1 is started at t4. Therefore, the execution of the instruction I1 need not access the outside of the A processor to obtain the read operand M11; this operand M11 can be obtained by accessing the register RDR1 in the A processor, which shortens the time period for executing the instruction I1. On the other hand, this instruction I1 requires the write operand M10 in the memory. Since, however, the A processor has the write operand post-writing function, the data WD1 of the instruction I1 corresponding to the write operand M10 outside of the A processor has been latched in the data register WDR of the A processor when the execution of the instruction I1 at the instruction execution unit 910 is finished at t7 in the example of FIG. 13.
The write of the write operand M10 of the instruction I1 in the memory is performed as follows: the memory write bus cycle for writing out the write operand M10 of the instruction I1 to the outside of the A processor is started at t8, after the time t7 at which the execution of the instruction I1 is finished at the instruction execution unit 910, and the data WD1 of the instruction I1 corresponding to the write operand M10, which is held in the data register WDR of the A processor at t8, is written in a predetermined area of the memory. As a result, the period for executing the instruction I1 at the instruction execution unit 910 does not contain the bus cycle periods for reading out the read operand from the memory and writing out the write operand to the memory; instead, the registers RDR1 and WDR in the A processor may be accessed if either the read operand or the write operand is required, so that the time period for executing the instruction I1 can be drastically shortened.
The execution of the instruction I2 at the instruction execution unit 910 is started at t7 and finished at t11. The read operand M21 in the memory, which is required by the instruction I2, is read at t5 prior to the time t7, at which the execution of the instruction I2 is started at the instruction execution unit 910, and is latched in the register RDR2 in the A processor. At t10, during the period of executing the instruction I2 at the instruction execution unit 910, there arises a situation in which the instruction decoded by the instruction decoding unit 907 is the unconditional branch instruction JUMP so that all the contents of the instruction queue are made ineffective to start a new instruction code pre-fetching action from the branch designated in the instruction code of the instruction JUMP. That situation, however, exerts no influence upon the execution of the instruction I2 at the instruction execution unit 910.
The execution of the instruction I3 at the instruction execution unit 910 is started at t11 and is finished at t17. Since the pre-fetching action of new instruction codes from the branch, which is started at t10 and caused by the decoding of the instruction JUMP, is allowed to continue to t13 with the highest precedence for occupying the external bus, the advanced read of the read operand M31 in the memory required by the instruction I3 cannot be conducted before the time t11 at which the execution of the instruction I3 is started at the instruction execution unit 910. Therefore, the execution of the instruction I3 at the instruction execution unit 910 is deferred until the advanced read of the read operand of the instruction I3 acquires the bus occupation at t14, starts the memory read bus cycle for reading the read operand M31, and latches the data RD3 corresponding to the operand M31 from the memory in the register RDR1 in the A processor so that the instruction execution unit 910 can access the register RDR1. After this, the data RD3 of the instruction I3 corresponding to the read operand M31 is fetched from the register RDR1 to the instruction execution unit 910 to execute the instruction I3, thus finishing the execution of the instruction I3 at t17.
The execution of the instruction JUMP at the instruction execution unit 910 is started at t17. This execution of the instruction JUMP requires no substantial action. At t16, the memory read bus cycle for reading in advance the read operand M61 in the memory, which is required by the instruction I6, is started so that the data RD6 corresponding to the read operand M61 is latched in the register RDR2 of the A processor.
When the instruction decoding is conducted at the instruction decoding unit 907 in the order of the instructions I1, I2, I3, . . . , and so on, the instruction execution is also conducted at the instruction execution unit 910 in the order of the instructions I1, I2, I3, . . . , and so on. However, the time period from the time t1, for example, at which the decoding of the instruction I1 is started at the instruction decoding unit 907, to the time t4, at which the execution of the instruction I1 is started at the instruction execution unit 910, is determined depending upon the several internal and external conditions of the A processor and is generally difficult to predict. If, on the other hand, the instructions requiring operands outside of the A processor, such as in a memory, are decoded in the order of I1, I2 and I3 at the instruction decoding unit 907, the bus control 913 accesses the outside of the A processor in the order of the accesses to the operands of the instructions I1, I2 and I3. For example, the instructions I1, I2 and I3 all require read operands in the memory at the addresses RA1, RA2 and RA3, respectively, and the memory read bus cycles are started at the bus control 913 in the order of the addresses RA1, RA2 and RA3. As will be understood from the foregoing description, in which the memory read bus cycle for the read operand of the instruction I3 is started at t14 after the time t11 at which the execution of the instruction I3 is started at the instruction execution unit 910, the time at which the bus cycle for accessing the read operand of an instruction requiring a read operand outside of the A processor is started falls between the time of starting the decoding of said instruction at the instruction decoding unit 907 and the time of finishing the execution of said instruction, but is influenced by the several internal and external conditions of the A processor so that it is generally difficult to predict.
This difficulty likewise applies to the case of the instruction requiring the write operand outside of the A processor. Specifically, the time, at which the bus cycle for the access of the write operand of said instruction is started, i.e., the time t8, at which the bus cycle for the access of the write operand M10 of the instruction I1 is started, comes after the time of finishing the execution of said instruction at the instruction execution unit 910, but is influenced by the several internal and external conditions of the A processor so that it is also generally difficult to predict.
The A processor implementing the high-grade pipeline structure has been described hereinbefore in connection with the summary of the hardware of the A processor and the summary of the actions of the individual units inside of the A processor, especially, the advanced instruction reading function of the A processor, the advanced reading function of the read operands, and the post-writing function of the write operands.
Next, let it be considered to develop software by the use of the A processor. When software was to be developed by the use of the aforementioned processor, the software being developed was debugged by the instruction tracing method. The software to be developed by the A processor is also debugged, as with the aforementioned processor, by using the instruction tracing method and the instruction tracer to improve its developing efficiency. The instruction tracing method of the A processor is similar to that of the aforementioned processor. The instruction tracer for the A processor is required to have a trace buffer memory for sequentially storing, as time-series data, the individual frames which are formed by sampling the information on the address bus 902, the information on the data bus 903, the necessary control signal and the status signal while the A processor is executing a series of instructions, a break point setting function, and a function to edit the information in the trace buffer memory. The break point is set by the use of the instruction tracer for the A processor, and each frame is fetched when new information is outputted to the address bus 902 or when the status signal changes, for example. After the A processor has its operations interrupted at the break point, the content of the trace buffer memory is edited to establish the mnemonics of the instructions in the order of their actual execution. For the instructions having been subjected to the memory or I/O access, on the other hand, the actually accessed addresses and data have to be presented to the user in a manner to correspond to the instruction mnemonics.
Assuming that the program shown in FIG. 11 is assembled and stored in the memory, as shown in FIG. 12, the description to be made is directed to the case in which the instruction tracing is to be conducted by the use of the instruction tracer for the A processor when the A processor executes a series of instructions in the vicinity of the memory address ADR10 appearing in FIG. 12. Table 5 presents a dump list of a portion of the content of the trace buffer memory of the instruction tracer for the A processor.
TABLE 5
__________________________________________________________________________
FRAME  ADDRESS  DATA     BUS-STS  QSTS4  QSTS3-0  FIG. 13
__________________________________________________________________________
001                                      F        t2
002                                      S
003    RA1               MR                       t3
004             RD1      MR
005                               H               t4
006    ADR41             F
007             OPCD41   F
008    ADR50             F
009             OPCD50   F
010    RA2               MR                       t5
011             RD2      MR
012                                      F        t6
013                                      S
014                               H               t7
015    ADR51             F
016             OPCD51   F
017    WA1               MW                       t8
018             WD1      MW
019                                      F        t9
020                                      S
021                                      E1       t10
022                               H               t11
023    ADR60             F
024             OPCD60   F
025    ADR61             F
026             OPCD61   F
027    ADR70             F
028             OPCD70   F
029    ADR71             F
030             OPCD71   F
031                                      F        t13
032                                      S
033    RA3               MR                       t14
034             RD3      MR
035                                      F        t15
036                                      S
037    RA6               MR                       t16
038                               H               t17
039             RD6      MR
__________________________________________________________________________
TABLE 6
__________________________________________________________________________
MMIO  RDWR  BST2  BST1  BST0  Bus Cycle               Symbol
__________________________________________________________________________
0     0     0     0     0     I/O Write Access        IOW
0     1     0     0     0     I/O Read Access         IOR
1     0     0     0     0     Memory Write Access     MW
1     1     0     0     0     Memory Read Access      MR
1     1     0     0     1     Inst. Code Fetch        F
0     0     0     1     0     Hold Acknowledge        HLTA
0     0     0     1     1     Interrupt Acknowledge   INTA
0     2     1     1     1     Idle State              IDL
__________________________________________________________________________
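The bus status encoding of Table 6 can likewise be sketched as a lookup. The Python fragment below is a minimal illustration, assuming that the five signals are read as the tuple (MMIO, RDWR, BST2, BST1, BST0) and that the symbol of the I/O write access is IOW by analogy with IOR. The Idle State row is left out because its code is not legible in the table as reproduced here. The names BUS_STATUS, decode_bus_status and is_operand_access are hypothetical.

```python
# Hypothetical lookup for the bus status codes of Table 6.
# Keys are (MMIO, RDWR, BST2, BST1, BST0); values are the abbreviated symbols.
BUS_STATUS = {
    (0, 0, 0, 0, 0): "IOW",   # I/O write access (symbol assumed)
    (0, 1, 0, 0, 0): "IOR",   # I/O read access
    (1, 0, 0, 0, 0): "MW",    # memory write access
    (1, 1, 0, 0, 0): "MR",    # memory read access
    (1, 1, 0, 0, 1): "F",     # instruction code fetch
    (0, 0, 0, 1, 0): "HLTA",  # hold acknowledge
    (0, 0, 0, 1, 1): "INTA",  # interrupt acknowledge
}


def decode_bus_status(mmio, rdwr, bst2, bst1, bst0):
    """Return the Table 6 symbol, or None for a code not covered here."""
    return BUS_STATUS.get((mmio, rdwr, bst2, bst1, bst0))


def is_operand_access(symbol):
    """True for bus cycles that read or write an operand in the memory or I/O."""
    return symbol in {"MR", "MW", "IOR", "IOW"}
```

Separating operand accesses from instruction code fetches in this way corresponds to the distinction that the BUS-STS column of Table 5 draws between the symbols MR and MW on the one hand and F on the other.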
In Table 5, addresses RA1, ADR41, . . . , and so on appearing in the ADDRESS column are those outputted onto the address bus 902 by the A processor, and data RD1, OPCD41, . . . , and so on of the DATA column are those appearing on the data bus 903 for the input/output of the A processor. The BUS-STS column represents the bus status code, the MMIO signal, and the RD and WR signals, which are outputted to the bus status terminal by the A processor, simply in the symbols F, MR and MW, as presented in Table 6. Letters H and L appearing in the QSTS4 column represent whether the signal outputted to the QSTS4 terminal by the A processor was at the high (H) or low (L) level. The QSTS3-0 column represents the queue status signals, which are outputted to the queue status terminal by the A processor, simply in the symbols F, S and E1, as presented in Table 4. The righthand end column is provided to present not the content of the trace buffer memory but the correspondence between FIG. 13 and Table 5, in which the symbol t2 corresponds to the time t2 of FIG. 13. First of all, the method of establishing the instruction mnemonics in the order of the actual execution of the instructions by the A processor will be briefly described with reference to the dump list of the content of the trace buffer memory, as listed in Table 5.
As in the instruction tracing method of the aforementioned processor, the instant at which the queue status indicates that all the contents of the instruction queue are purged is selected as the reference of the present instruction tracing. Since the QSTS3-0 column presents E1 (i.e., the purge of the content of the instruction queue) in the frame 021, as seen from Table 5, the frame 021 is selected as the instruction tracing reference from the trace buffer memory of Table 5. It is at the frame 031 that the QSTS3-0 column first represents F after it exhibited E1 at the frame 021, and it is at the frame 035 that the QSTS3-0 column represents F subsequent to the frame 031. Since it is only at the frame 032 that the QSTS3-0 column represents S between the frames 031 and 035, the instruction having the first byte of its instruction code fetched from the instruction queue and transferred to the instruction decoding unit 907 at the frame 031 is one having an instruction code of a 2-byte length. The instruction having the first byte of its instruction code fetched from the instruction queue at the frame 031 is denoted at I. Next, the instruction code of this instruction I is selected. After all the contents of the instruction queue have been purged at the frame 021, the first instruction code fetch takes place at the frames 023 and 024. The instruction code outputted from the area in the memory corresponding to the pre-fetching address outputted at the frame 023 is OPCD60 of the DATA column at the frame 024. From the frames 023 and 024, it is found that the instruction code of the first byte of the instruction I is OPCD60. The instruction code fetch subsequent to the frames 023 and 024 is conducted at the frames 025 and 026, and the instruction code OPCD61 is fetched in the A processor. As a result, the instruction code having the length of 2 bytes of the instruction I is OPCD60 and OPCD61.
Since the mnemonics MOV M60 and M61 are obtained by inversely assembling the instruction codes OPCD60 and OPCD61 of the 2-byte length, the instruction mnemonics of the instruction I are MOV M60 and M61. From now on, operations similar to those described above are repeated. The instruction having the first and second bytes of its instruction code fetched from the instruction queue at the frames 035 and 036, respectively, is one having an instruction code of the 2-byte length, and its instruction codes are those resulting from the two pre-fetches subsequent to the pre-fetches of the instruction I, i.e., at the frames 027 and 028, and 029 and 030, so that they are OPCD70 and OPCD71, which can be inversely assembled to establish MOV M70 and M71.
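The editing steps described above can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions, not the tracer itself: the frames 019 to 039 of Table 5 are transcribed by hand into tuples, empty cells are represented by None, and the names FRAMES and restore_instruction_codes are hypothetical. The sketch stops at the recovered instruction codes and leaves the inverse assembling aside.

```python
# Frames 019 to 039 of Table 5 as
# (frame, address, data, bus_sts, qsts4, qsts3_0); None marks an empty cell.
FRAMES = [
    (19, None, None, None, None, "F"),
    (20, None, None, None, None, "S"),
    (21, None, None, None, None, "E1"),
    (22, None, None, None, "H", None),
    (23, "ADR60", None, "F", None, None),
    (24, None, "OPCD60", "F", None, None),
    (25, "ADR61", None, "F", None, None),
    (26, None, "OPCD61", "F", None, None),
    (27, "ADR70", None, "F", None, None),
    (28, None, "OPCD70", "F", None, None),
    (29, "ADR71", None, "F", None, None),
    (30, None, "OPCD71", "F", None, None),
    (31, None, None, None, None, "F"),
    (32, None, None, None, None, "S"),
    (33, "RA3", None, "MR", None, None),
    (34, None, "RD3", "MR", None, None),
    (35, None, None, None, None, "F"),
    (36, None, None, None, None, "S"),
    (37, "RA6", None, "MR", None, None),
    (38, None, None, None, "H", None),
    (39, None, "RD6", "MR", None, None),
]


def restore_instruction_codes(frames):
    """Return the instruction codes decoded after the queue purge, in order."""
    # Step 1: the frame whose queue status is E1 is the tracing reference.
    ref = next(i for i, f in enumerate(frames) if f[5] == "E1")
    after = frames[ref + 1:]
    # Step 2: bytes delivered by instruction code fetches after the purge.
    fetched = [f[2] for f in after if f[3] == "F" and f[2] is not None]
    # Step 3: each queue status F starts an instruction, each S adds one byte.
    lengths = []
    for f in after:
        if f[5] == "F":
            lengths.append(1)
        elif f[5] == "S" and lengths:
            lengths[-1] += 1
    # Step 4: hand the fetched bytes to the instructions in queue order.
    codes, pos = [], 0
    for n in lengths:
        codes.append(tuple(fetched[pos:pos + n]))
        pos += n
    return codes
```

Applied to the frames above, the sketch pairs the queue status events at the frames 031 and 032 with OPCD60 and OPCD61, and those at the frames 035 and 036 with OPCD70 and OPCD71, which matches the result derived in the text.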
Next, in case the instruction executed by the A processor has an operand in the memory or I/O outside of the A processor, the real value of said operand, i.e., what data is accessed and at which address the access is conducted, has to be traced. This will be described in the following by taking up the instruction I as an example. The instruction I is a transfer instruction requiring the read operand M61 and the write operand M60 in the memory. The read operand of the instruction I is read out from the memory by the advanced reading function of the read operand of the A processor after the instruction I has been decoded at the instruction decoding unit 907 and before the end of the execution of the instruction I at the instruction execution unit 910. If, however, the address of the read operand of MOV M60 and M61 of the instruction I is obtained by an indirect addressing, its value will not appear explicitly in the instruction code of the instruction I. This makes it impossible to identify which of the several frames exhibiting the memory read (as indicated by MR in the BUS-STS column) in the dump list of Table 5 is conducting the memory read for reading the read operand of the instruction I. Table 5 has four memory reads: at the frames 003 and 004, at the frames 010 and 011, at the frames 033 and 034, and at the frames 037 and 039. Since the memory reads at the frames 003 and 004 and at the frames 010 and 011 fall prior to the frame 031 at which the instruction I is to be decoded, it is apparent that the memory reads at the frames 003 and 004 and at the frames 010 and 011 do not belong to the memory read bus cycle for reading the read operand of the instruction I from the memory.
Both the memory reads at the frames 033 and 034 and at the frames 037 and 039, which are subsequent to the frame 031 for decoding the instruction I, and the memory reads which will appear on and after the frame 039, although not presented in Table 5, may possibly belong to the memory read bus cycle for reading out the read operand of the instruction I from the memory. It is not possible, however, to identify which of these frames belongs to the memory read bus cycle for the read operand of the instruction I. As a result, although the mnemonics of the instruction I can be obtained, the operations of the instruction I cannot be completely restored because there is no information indicating what value the instruction I obtained as its read operand and from which address the instruction I took that value.
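The ambiguity described above can be made concrete with a short Python sketch. It assumes that the four memory read bus cycles of Table 5 are transcribed by hand as (address frame, data frame, address, data) tuples; the names MEMORY_READS and candidate_reads are hypothetical.

```python
# Memory read bus cycles visible in Table 5:
# (address frame, data frame, address, data).
MEMORY_READS = [
    (3, 4, "RA1", "RD1"),
    (10, 11, "RA2", "RD2"),
    (33, 34, "RA3", "RD3"),
    (37, 39, "RA6", "RD6"),
]


def candidate_reads(decode_frame, reads=MEMORY_READS):
    """Memory reads whose address frame comes after the decoding frame."""
    return [r for r in reads if r[0] > decode_frame]


# The instruction I starts decoding at the frame 031.
candidates = candidate_reads(31)
# More than one candidate remains, so the trace alone cannot decide
# which bus cycle carried the read operand of the instruction I.
ambiguous = len(candidates) > 1
```

Filtering by frame order removes only the reads at RA1 and RA2; the reads at RA3 and RA6 both remain, and further reads beyond the frame 039 could be added to the candidates as well, so nothing in the traced terminals singles out one of them.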
There has been described hereinbefore the instruction tracing method of the prior art, in which it is necessary and sufficient to output both the information indicating the status of the instruction code pre-fetch queue inside of the information processing system having the instruction code pre-fetching function only and the information indicating the kinds of the bus cycles from the information processing system to the outside of said system so that the instruction execution may be traced by the means disposed outside of said system. The instruction tracing method thus far described, however, is accompanied by a drawback that it is ineffective for a high-performance information processing system which has the multi-stage pipeline structure and has the instruction code pre-fetching function, the read operand advanced-reading function, and the write operand post-writing function.