1. Field of the Invention
The present invention relates to a bus system and, particularly, to a technique of controlling a wait operation of a bus master.
2. Description of Related Art
Various methods for improving the performance of computers have been introduced. Pipelining is a technique that improves performance by processing a plurality of instructions in parallel. Specifically, the processing of one instruction is broken into two or more stages so that different stages of a plurality of instructions can be processed in parallel.
In a bus system using the pipeline, an access from a bus master such as a processor or a DMAC (Direct Memory Access Controller) to a resource of a bus slave is also composed of a plurality of stages. FIG. 6 shows an example of the stages of processing in which a bus master reads (loads) from a register of a bus slave. A bus master is described as a CPU by way of illustration.
As shown in the example of FIG. 6, the processing by which a CPU executes an instruction to load data from a bus slave is composed of six stages: IF, ID, EX, DF, CM and WB. These stages are, respectively: a stage to fetch an instruction into the CPU, a stage to decode the instruction, a stage to execute the instruction (in this example, to send a data loading request as an access request to a bus slave), a stage to obtain an operation result (in this example, to receive a response from a bus slave), a stage to complete the execution, and a stage to write back to, or update, a resource such as a register of the CPU.
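The six stages can be sketched in Python as follows. This is only an illustration, not part of the described bus system: the names `Stage` and `ideal_schedule` are hypothetical, and the schedule assumes one instruction is started per cycle and no wait response occurs.

```python
from enum import Enum


class Stage(Enum):
    """The six pipeline stages of a load instruction, in execution order."""
    IF = "fetch the instruction into the CPU"
    ID = "decode the instruction"
    EX = "execute (here: send the access request to the bus slave)"
    DF = "obtain the result (here: receive the bus slave's response)"
    CM = "complete the execution"
    WB = "write back to a CPU resource such as a register"


def ideal_schedule(num_instructions):
    """Map each instruction number to {cycle: stage name}.

    Assumption (not from the source): instruction i starts in cycle i
    and every stage takes exactly one cycle.
    """
    stages = list(Stage)
    return {
        i: {i + k: stage.name for k, stage in enumerate(stages)}
        for i in range(1, num_instructions + 1)
    }
```

Under these assumptions, `ideal_schedule(2)[1][3]` is `"EX"`: the first instruction reaches its EX stage in the third cycle, while the second instruction is still in ID.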
A bus slave is not always able to perform the processing in response to an access request from a bus master. Thus, in the DF stage to obtain a response from a bus slave, the CPU can receive a ready response indicating that a bus slave is ready to process a request or a wait response indicating that it is not ready. If the CPU receives the wait response, it suspends the execution of an instruction until the state of the bus slave becomes ready.
In addition to the above two responses, a bus slave can send a response indicating the occurrence of an exception. If the exception is detected in the CM stage to complete the execution, a CPU interrupts the processing prior to the CM stage and performs the exception handling. The processing is not interrupted in the WB stage to perform write-back because the execution of an instruction is fixed in this stage.
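The three kinds of response can be sketched as follows. The sketch is a simplification under stated assumptions: the names `Response` and `df_outcome` are hypothetical, and the DF stage is modeled as a list holding one response per cycle.

```python
from enum import Enum


class Response(Enum):
    WAIT = "WAIT"  # bus slave is not ready
    RDY = "RDY"    # bus slave is ready to process the request
    EXP = "EXP"    # an exception occurred


def df_outcome(responses):
    """Walk the per-cycle responses seen in the DF stage.

    Returns (wait_cycles, writes_back): the number of cycles the
    instruction was suspended, and whether the WB stage is executed.
    """
    wait_cycles = 0
    for response in responses:
        if response is Response.WAIT:
            wait_cycles += 1           # execution stays suspended
        elif response is Response.RDY:
            return wait_cycles, True   # normal completion, write-back follows
        else:
            return wait_cycles, False  # exception handled in CM, no write-back
    raise ValueError("response sequence ended without RDY or EXP")
```

For example, two wait responses followed by a ready response give `(2, True)`, whereas two wait responses followed by an exception response give `(2, False)`.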
The CPU operation after receiving the wait response until the state of the bus slave becomes ready is either a non-blocking wait operation or a blocking wait operation. These two operations are described hereinafter in detail with reference to FIGS. 7 and 8.
In the non-blocking wait operation, the execution of an instruction for which a wait response is made is suspended until the wait is released, and other instructions subsequent to this instruction are executed without being suspended. FIG. 7 shows a case where the wait response is received in the DF stage of the instructions 1 and 4 out of the instructions 1 to 5, and the CPU performs the non-blocking wait operation in response thereto. For simplification of the description, the non-blocking wait operation and the blocking wait operation are also referred to hereinafter simply as non-blocking operation and blocking operation, respectively.
The charts A, B, C and D in FIG. 7 show the timing for a CPU to execute each stage of the instructions 1 to 5, the timing of signals related to an access request output from the CPU in the EX stage of the instructions 1 and 4, the timing for the CPU to retrieve data when a ready response is sent from a bus slave or an exception (exp) occurs in the DF stage of the instructions 1 and 4, and the timing of a response from a bus slave in the DF stage of the instructions 1 and 4, respectively. In these charts, the first stage (IF) in the instruction 1 is shown as the first cycle. Each timing in the charts B, C and D corresponds to each cycle shown in the chart A.
As shown in the chart A of FIG. 7, the CPU executes the EX stage of the instruction 1 in the third cycle and sends an access request to a bus slave. As shown in the chart B of FIG. 7, the access request contains a signal REQ to request processing from the bus slave, an address ADS to be accessed, and a content CMD of the processing requested of the bus slave, which is loading for a load instruction. In the boxes of REQ in the chart B, the High-level line indicates that a REQ signal is active.
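The content of the access request can be pictured as a small record. The following is a minimal sketch; the field types, the `"load"` string and the helper `load_request` are assumptions made for illustration, not taken from the charts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessRequest:
    req: bool  # REQ: True while the request signal is active (High level)
    ads: int   # ADS: address to be accessed
    cmd: str   # CMD: content of the requested processing, e.g. "load"


def load_request(address):
    """Build the request a CPU would send in the EX stage of a load instruction."""
    return AccessRequest(req=True, ads=address, cmd="load")
```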
The bus slave responds to the access request in the EX stage of the instruction 1 where the REQ signal in the chart B is active. In this example, the bus slave sends a wait response (WAIT) in the fourth and fifth cycles and then sends a ready response (RDY) in the sixth cycle (cf. the chart D). The CPU suspends the execution of the instruction and then retrieves data D0 from the bus slave in the sixth cycle in response to the ready response (cf. the chart C). After that, the CPU carries out the CM stage to complete the execution and the WB stage to write-back, thereby ending the instruction 1.
The execution of the instructions 2 and 3 is started with a delay of one cycle each from the start of the execution of the instruction 1 (the first cycle). As shown in FIG. 7, while the wait response is made in the DF stage of the instruction 1 (the fourth and fifth cycles), the EX stage and the DF stage of the instruction 2 and the ID stage and the EX stage of the instruction 3 are executed. Because the CPU performs the non-blocking operation when the wait response is made in the DF stage of the instruction 1, the subsequent instructions 2 and 3 are executed.
This is the same for the instructions 4 and 5, and the IF stage and the ID stage of the instruction 4 and the IF stage of the instruction 5 are executed in the fourth and fifth cycles.
For the instruction 4, the CPU sends an access request (cf. the chart B) to a bus slave in the EX stage of the sixth cycle and receives a wait response (cf. the chart D) in the DF stage of the seventh and eighth cycles. Then, an exception response (cf. EXP in the chart D) is sent from the bus slave in the ninth cycle. Although the CPU retrieves the data D1 from the bus slave at the same time as receiving the exception response, it performs the exception handling for “abnormal completion due to an exception” in the CM stage of the tenth cycle. In such a case, the WB stage of the instruction 4 is not executed as shown in the chart A, so the data D1 retrieved in the ninth cycle is not written back in the eleventh cycle.
In the DF stage of the instruction 4, because the CPU performs the non-blocking operation, the stages of the subsequent instruction 5 are executed in the seventh and eighth cycles where a wait response is made.
As described above, if the CPU performs the non-blocking operation, the subsequent instruction is executed, which contributes to higher processing performance of the entire processing system.
However, if exception handling occurs due to the execution of the instruction 4 as shown in FIG. 7, the execution of the subsequent instruction 5 has already been completed when the exception handling is completed in the tenth cycle. Accordingly, if the instruction 4 is executed again, the instruction 5 is executed twice. Thus, if the CPU performs the non-blocking operation, the reexecution of the instruction 4 where an exception occurs is difficult.
On the other hand, in the blocking wait operation, the execution of the instruction for which a wait response is made is suspended until the wait is released, and other instructions subsequent to that instruction are also suspended. FIG. 8 shows a case where the wait response is made in the DF stage of the instructions 1 and 4 out of the instructions 1 to 5, and the CPU performs the blocking operation in response thereto.
The charts A, B, C and D in FIG. 8 are timing charts showing, respectively, the execution by the CPU of each stage of the instructions 1 to 5, the contents of a request which is output from the CPU in the EX stage of the instructions 1 and 4, data that is retrieved by the CPU when the bus slave becomes ready or an exception (exp) occurs in the DF stage of the instructions 1 and 4, and a response from the bus slave in the DF stage of the instructions 1 and 4.
A bus slave responds to the access request in the EX stage of the instruction 1. In this example, like the example in FIG. 7, the bus slave sends a wait response in the fourth and fifth cycles and then sends a ready response (RDY in the chart D) in the sixth cycle. The CPU suspends the execution of the instruction and then retrieves data D0 from the bus slave in the sixth cycle in response to the ready response (cf. the chart C). After that, the CPU carries out the CM stage for completion and the WB stage for write-back in the seventh and eighth cycles, respectively, thereby ending the instruction 1.
The execution of the instructions 2 and 3 is started with a delay of one cycle each from the start of the execution of the instruction 1 (the first cycle). As shown in FIG. 8, while the wait response is made in the DF stage of the instruction 1 (the fourth and fifth cycles), the EX stage of the instruction 2 and the ID stage of the instruction 3 are also suspended without being executed. Because the CPU performs the blocking operation in the fourth and fifth cycles until the wait is released in the DF stage of the instruction 1, the subsequent instructions 2 and 3 are also suspended.
This is the same for the instruction 4, and the IF stage is suspended in the fourth and fifth cycles.
For the instruction 4, the CPU sends an access request (cf. the chart B) to a bus slave in the EX stage of the eighth cycle and receives a wait response (cf. the chart D) in the DF stage of the ninth and tenth cycles. Then, an exception response (cf. the chart D) is sent from the bus slave in the eleventh cycle. Although the CPU retrieves the data D1 at the same time as receiving the exception response, it performs the exception handling for “abnormal completion due to an exception” in the CM stage of the twelfth cycle. In such a case, the WB stage of the instruction 4 is not executed as shown in the chart A.
In the DF stage of the instruction 4, because the CPU performs the blocking operation, the EX stage of the subsequent instruction 5 is suspended in the ninth and tenth cycles, where a wait response is made, until the wait is released.
As described above, if the CPU performs the blocking operation, the subsequent instruction is not executed, which results in lower processing performance of the entire processing system compared with the case where the CPU performs the non-blocking operation.
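The timing difference between the two wait operations can be reproduced with a small cycle-by-cycle model. This is a sketch under stated assumptions rather than a description of any actual CPU: the function `simulate` and its parameters are hypothetical, instructions are issued in order at most one per cycle, only instructions younger than a waiting instruction are frozen in the blocking case, and an instruction that receives an exception skips its WB stage.

```python
STAGES = ["IF", "ID", "EX", "DF", "CM", "WB"]
DF = STAGES.index("DF")
DONE = len(STAGES)


def simulate(waits, blocking, exceptions=()):
    """Return one {cycle: label} dict per instruction.

    waits[i]   -- number of WAIT cycles instruction i+1 sees in DF
    blocking   -- True for the blocking wait operation
    exceptions -- 0-based indices of instructions whose DF ends in EXP
    """
    n = len(waits)
    pos = [0] * n        # index of the stage each instruction runs next
    left = list(waits)   # WAIT cycles still to come per instruction
    timeline = [dict() for _ in range(n)]
    cycle = 0
    while any(p < DONE for p in pos):
        cycle += 1
        start = list(pos)  # state at the start of this cycle
        stalled = False    # set once an older instruction waits (blocking only)
        for i in range(n):
            if start[i] >= DONE:
                continue   # already finished
            if i > 0 and start[i] == 0 and start[i - 1] == 0:
                break      # predecessor not fetched yet: in-order issue
            if stalled:
                continue   # frozen behind a waiting instruction
            if start[i] == DF and left[i] > 0:
                timeline[i][cycle] = "WAIT"
                left[i] -= 1
                stalled = blocking
                continue
            timeline[i][cycle] = STAGES[start[i]]
            skip_wb = STAGES[start[i]] == "CM" and i in exceptions
            pos[i] = DONE if skip_wb else start[i] + 1
    return timeline
```

With `waits=[2, 0, 0, 2, 0]` and `exceptions={3}`, corresponding to the instructions 1 to 5, the model places the CM stage of the instruction 4 in the tenth cycle for the non-blocking operation and in the twelfth cycle for the blocking operation. In this model the instruction 5 reaches WB in cycle 10 in the non-blocking case but only in cycle 14 in the blocking case; the latter figure comes from the model's assumptions, not from the charts.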
However, if exception handling occurs as in the instruction 4 of the example shown in FIG. 8, the execution of the subsequent instruction 5 is not yet completed when the exception handling is completed in the twelfth cycle. Therefore, even if the instruction 4 is reexecuted, the instruction 5 is not executed twice. Thus, if the CPU performs the blocking operation, it is possible to reexecute the instruction where an exception occurs.
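The reexecution problem can be stated as simple bookkeeping. The sketch below uses a hypothetical helper, `completions_after_retry`, and assumes that reexecution restarts at the instruction where the exception occurred and runs through the last instruction.

```python
from collections import Counter


def completions_after_retry(completed_before_retry, faulting, last):
    """Count how many times each instruction completes in total.

    completed_before_retry -- instructions already completed when the
                              exception handling finishes
    faulting               -- instruction where the exception occurred
    last                   -- last instruction of the sequence
    """
    counts = Counter(completed_before_retry)
    counts.update(range(faulting, last + 1))  # reexecute faulting..last
    return counts
```

In the non-blocking case the instructions 1, 2, 3 and 5 are complete before the retry, so `completions_after_retry([1, 2, 3, 5], 4, 5)[5]` is 2: the instruction 5 is executed twice. In the blocking case only the instructions 1 to 3 are complete, and the same call with `[1, 2, 3]` counts the instruction 5 once.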
As described above, there are both advantages and disadvantages when a CPU performs the blocking operation or the non-blocking operation upon receiving a wait response.
A known technique to determine whether a CPU performs the blocking operation or the non-blocking operation, which is referred to hereinafter as a first technique, is as follows.
As shown in FIG. 9, according to this technique, an address space of a bus master is divided into an address space for blocking operation (which is referred to hereinafter as a blocking operation space) and an address space for non-blocking operation (hereinafter as a non-blocking operation space). The bus master performs the blocking operation for an access to a bus slave which is connected with the blocking operation space and performs the non-blocking operation for an access to a bus slave which is connected with the non-blocking operation space. Because the bus slave illustrated in FIG. 9 is connected with the blocking operation space of the bus master, the bus master performs the blocking operation when accessing a resource, which is a register in the example of FIG. 9, of the bus slave.
Another technique, which is referred to hereinafter as a second technique, is disclosed in Japanese Unexamined Patent Application Publication No. 04-372018. According to this technique, a flag indicating whether or not to accept a wait signal is set in an instruction structure of a processor. Upon receiving a wait signal during the pipeline operation, the flag in the instruction structure corresponding to the wait signal is referred to. If the instruction is one which accepts a wait signal, the next cycle is executed without waiting for the completion of the current cycle. If the instruction is one which does not accept a wait signal, the next cycle is executed after the completion of the current cycle.
We have now discovered that one bus slave does not always have only one resource. Further, not all of the plurality of resources of one bus slave necessarily require the blocking operation. For example, if only the register D out of the four registers A to D of the bus slave shown in FIG. 9 requires the blocking operation, the first technique described above still requires the bus slave to be connected to the blocking operation space of the bus master, even though the other three registers do not require the blocking operation. As a result, the bus master also carries out the blocking operation when accessing the other three registers, which decreases the performance of the system as a whole.
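The drawback of the first technique can be illustrated with a short sketch. All addresses and names below are hypothetical and chosen only for illustration; the source does not give an address map.

```python
# Hypothetical address map: the whole bus slave sits in the blocking space.
BLOCKING_SPACE = range(0x8000, 0x9000)
REGISTERS = {"A": 0x8000, "B": 0x8004, "C": 0x8008, "D": 0x800C}
NEEDS_BLOCKING = {"D"}  # only the register D requires the blocking operation


def wait_mode(address):
    """First technique: the wait operation is decided by the address space alone."""
    return "blocking" if address in BLOCKING_SPACE else "non-blocking"


def needlessly_blocked():
    """Registers accessed with the blocking operation although they do not need it."""
    return sorted(
        name
        for name, address in REGISTERS.items()
        if wait_mode(address) == "blocking" and name not in NEEDS_BLOCKING
    )
```

With this map, `needlessly_blocked()` yields the registers A, B and C: they inherit the blocking operation only because they share the bus slave with the register D.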
We have also discovered that the second technique, in essence, divides the instructions of the bus master into blocking operation instructions and non-blocking operation instructions. The bus master refers to a flag which is set in an instruction to indicate whether the instruction is a blocking instruction or a non-blocking instruction, and performs the blocking operation when executing a blocking instruction and the non-blocking operation when executing a non-blocking instruction.
Because this technique determines whether the bus master performs the blocking operation or the non-blocking operation from a flag set in the instruction structure executed by the bus master, the program code must be rewritten in order to switch between the two operations. This incurs the cost of finding the switching points at which to rewrite the program code, and the cost of checking that the change to the program code has not caused degradation. This need to modify the program code hinders dynamic switching between the blocking operation and the non-blocking operation of a bus master.
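The second technique, and the reason it resists dynamic switching, can be sketched as follows. The `Instruction` record, its fields and the helper `rewrite` are hypothetical simplifications; the cited publication's actual instruction structure is not reproduced here.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Instruction:
    opcode: str
    address: int
    blocking: bool  # flag carried inside the instruction itself


def wait_mode(instruction):
    """Second technique: the wait operation is read from the instruction's flag."""
    return "blocking" if instruction.blocking else "non-blocking"


def rewrite(program, index, blocking):
    """Switching the operation means producing a modified copy of the program."""
    return tuple(
        replace(ins, blocking=blocking) if i == index else ins
        for i, ins in enumerate(program)
    )


# Hypothetical two-instruction program.
PROGRAM = (
    Instruction("load", 0x100, blocking=False),
    Instruction("load", 0x104, blocking=True),
)
```

Changing the first load to the blocking operation requires `rewrite(PROGRAM, 0, True)`, which yields a program different from `PROGRAM`. The switch cannot be made without touching the code, which is the cost described above.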