The present invention relates to a system having read-modify-write instructions installed therein.
In particular, this invention relates to a system having a read-modify-write unit for performing read-modify-write operations, in addition to a central processing unit (abbreviated as CPU hereinafter).
Moreover, the present invention relates to a system having a read-modify-write unit and also a digital signal processing (abbreviated as DSP hereinafter) unit, applicable not only to a CPU but also to a digital signal processor, etc., for performing a series of read-modify-write operations as a CPU does.
Microprocessors usually have a CPU incorporating an operation unit and a controller, mounted on a silicon chip with LSI production technology. A computer system has such a microprocessor and memories connected thereto. Current microcomputers also have memories mounted on the microprocessor chip, in which access is made between the microprocessor and the memories via a bus interface unit (abbreviated as BIU hereinafter).
Shown in FIG. 30 is a known microprocessor architecture with a CPU 1, a memory 2 and a BIU 3 interposed therebetween.
The CPU 1 executes a read bus cycle, a write bus cycle or a dummy bus cycle (neither read nor write) to the memory 2 via the BIU 3. In detail, the BIU 3 receives from the CPU 1 a memory address, a read or write request and, in writing only, write data (data to be written), and passes read data (data read from the memory 2) to the CPU 1.
FIG. 30 illustrates direct access to the memory 2 by the BIU 3. Alternatively, a memory controller (depending on the type of the memory 2) may be provided between the BIU 3 and the memory 2, so that read and write operations to the memory 2 are made via the memory controller.
A read-modify-write instruction is now explained in detail. This instruction is a single instruction with which a CPU executes a series of operations: reading data from a memory, modifying some or all bits of the read data and then overwriting the original data in the memory with the modified data.
The read-modify-write instruction is used, for example, for bit manipulation on 1-bit data, such as bit set, bit clear, bit inversion and bit logical operations (logical OR, AND, NOR, etc.), and also for bit field manipulation on data of 2 bits or more, such as arithmetic operations, logical operations, shift/rotation, insertion/replacement and clear/set.
Explained next are the bit manipulation and the bit field manipulation on an 8-bit memory.
In the following explanation, the most-significant bit and the least-significant bit are defined as bit 7 and bit 0, respectively, for binary data “10101010” stored in the 8-bit memory at a given address.
The bit manipulation will be explained first for four cases.
Bit set on bit 2 replaces the value “0” of bit 2 in “10101010” with “1”, so that “10101110” is written in the 8-bit memory.
Bit clear on bit 7 clears the value “1” of bit 7 in “10101010” to the value “0”, so that “00101010” is written in the 8-bit memory.
Bit inversion on bit 2 inverts the value “0” of bit 2 in “10101010” to “1”, so that “10101110” is written in the 8-bit memory.
A bit logical OR operation on bit 2 applies a logical OR between the value “1” and the value “0” of bit 2 in “10101010” to give the result “1”, so that “10101110” is written in the 8-bit memory.
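The four bit-manipulation cases above can be reproduced with a minimal sketch such as the following. The function names are hypothetical and chosen only for illustration; the sketch models the modify step alone, on an 8-bit value with bit 7 as the most-significant bit, and does not model the memory read or write.

```python
# Illustrative helpers only; the names are not taken from the specification.
# Bit 7 is the most-significant bit and bit 0 the least-significant bit.

def bit_set(value: int, n: int) -> int:
    """Return value with bit n forced to 1."""
    return (value | (1 << n)) & 0xFF

def bit_clear(value: int, n: int) -> int:
    """Return value with bit n forced to 0."""
    return value & ~(1 << n) & 0xFF

def bit_invert(value: int, n: int) -> int:
    """Return value with bit n inverted."""
    return (value ^ (1 << n)) & 0xFF

def bit_or(value: int, n: int, operand_bit: int) -> int:
    """Return value with bit n replaced by (bit n OR operand_bit)."""
    return (value | ((operand_bit & 1) << n)) & 0xFF

data = 0b10101010
print(format(bit_set(data, 2), "08b"))     # 10101110
print(format(bit_clear(data, 7), "08b"))   # 00101010
print(format(bit_invert(data, 2), "08b"))  # 10101110
print(format(bit_or(data, 2, 1), "08b"))   # 10101110
```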
Next, the bit field manipulation is explained for seven cases.
An add operation on the 3-bit data “010” of bits 6 to 4 in “10101010” with the value “110” gives “010+110=1000”, whose lower three bits “000” are taken as the result (sum), so that “10001010” is written in the 8-bit memory.
A subtract (or decrement) operation that subtracts the value “1” from the 3-bit data “010” of bits 6 to 4 in “10101010”, that is, “010−1=001”, results in “10011010”, which is written in the 8-bit memory.
An EOR operation (exclusive logical OR) with the value “110” on the 3-bit data “010” of bits 6 to 4 in “10101010” gives the value “100”, so that “11001010” is written in the 8-bit memory.
An arithmetic 1-bit right shift on the 3-bit data “010” of bits 6 to 4 in “10101010” gives the value “001”, so that “10011010” is written in the 8-bit memory.
A 1-bit right rotation on the 4-bit data “0101” of bits 6 to 3 in “10101010” rotates the value “0101” to the value “1010”, so that “11010010” is written in the 8-bit memory.
Bit field insertion of the value “1101” into the 4-bit data “0101” of bits 6 to 3 in “10101010” replaces the value “0101” with “1101”, so that “11101010” is written in the 8-bit memory.
Bit field clear on the 4-bit data “0101” of bits 6 to 3 in “10101010” results in “10000010”, which is written in the 8-bit memory.
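The seven bit-field cases above follow one pattern: extract the field, operate on it, truncate the result to the field width and put it back. The following is a minimal sketch of that pattern. All function names are hypothetical. The right shift is modelled as sign-preserving, which is an assumption; for the value “010” a logical shift gives the same result.

```python
# Illustrative sketch only; names are hypothetical, not from the specification.

def get_field(value: int, hi: int, lo: int) -> int:
    """Extract bits hi..lo (inclusive) of an 8-bit value."""
    width = hi - lo + 1
    return (value >> lo) & ((1 << width) - 1)

def put_field(value: int, hi: int, lo: int, field: int) -> int:
    """Replace bits hi..lo of an 8-bit value with the low bits of field."""
    width = hi - lo + 1
    mask = ((1 << width) - 1) << lo
    return (value & ~mask & 0xFF) | ((field << lo) & mask)

def modify_field(value: int, hi: int, lo: int, op) -> int:
    """Apply op(field, width) to bits hi..lo and write the result back."""
    width = hi - lo + 1
    return put_field(value, hi, lo, op(get_field(value, hi, lo), width))

def shift_right_keep_sign(field: int, width: int) -> int:
    """1-bit right shift that keeps the field's top bit (assumed behaviour)."""
    return (field >> 1) | (field & (1 << (width - 1)))

def rotate_right(field: int, width: int) -> int:
    """1-bit right rotation within the field."""
    return (field >> 1) | ((field & 1) << (width - 1))

data = 0b10101010
cases = [
    ("add 110",     modify_field(data, 6, 4, lambda f, w: f + 0b110)),
    ("subtract 1",  modify_field(data, 6, 4, lambda f, w: f - 1)),
    ("EOR 110",     modify_field(data, 6, 4, lambda f, w: f ^ 0b110)),
    ("right shift", modify_field(data, 6, 4, shift_right_keep_sign)),
    ("rotate",      modify_field(data, 6, 3, rotate_right)),
    ("insert 1101", put_field(data, 6, 3, 0b1101)),
    ("clear",       put_field(data, 6, 3, 0)),
]
for name, result in cases:
    print(f"{name:12s} {result:08b}")
```

Truncation to the field width happens in `put_field`, which is why the carry out of “010+110=1000” is dropped without touching bit 7.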
If data read from a memory, data modify on the read data and data rewrite of the modified data to the memory are executed with different instructions, another task could interrupt between these instructions, that is, before the data modify or the data rewrite is completed. Such interruption could cause adverse consequences to system operations, such as data look-up by the other task before the data is rewritten.
To avoid such adverse consequences, a single read-modify-write instruction that executes data read, data modify and data write is required.
The adverse consequences discussed above are, for example, as follows:
Suppose that data for discriminating between a processing mode and a waiting mode are stored in a memory at certain addresses for two apparatus A and B, in which data “10” indicates that the apparatus A is in the processing mode, data “11” indicates that the apparatus B is in the processing mode and data “00” indicates that both of the apparatus A and B are in the waiting mode.
The discriminating data is then read from the memory when the apparatus A is to execute specific processing. If the data is “00”, it is rewritten as “10” and stored back in the memory, and the apparatus A starts the processing. On the contrary, if the data is not “00”, the apparatus A waits until the data changes to “00”. When the apparatus A completes the processing, the data “00” is written in the memory.
Like the apparatus A, the apparatus B waits until the data changes to “00”. The data “11” is then written so that the apparatus B can start a specific processing. The data “00” is also written when the apparatus B completes the processing.
Read-modify-write processing that will not be interrupted is thus required for such a system, in which the apparatus A and B are not allowed to start processing simultaneously.
Suppose that the read-modify-write processing is broken into, that is, an interruption occurs after the apparatus A has read the data “00” from the memory but before the data “10” is written for the apparatus A to start processing. The interruption causes the apparatus B to read the data from the memory; the data “00” is read and hence the data “11” is written, so that the apparatus B starts processing ahead of the apparatus A.
When the interruption completes, the data “10” is written and the apparatus A starts processing even though the apparatus B is still in the processing mode.
As discussed above, the interruption could disrupt the switching of processing between the apparatus A and B.
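The interleaving described above can be replayed as a minimal, deterministic sketch. It is a simplified model under stated assumptions: the shared memory is a dictionary, the interruption is written out as an explicit ordering of steps, and the names `IDLE`, `A_BUSY`, `B_BUSY`, `try_start`, `run_split` and `run_single` are hypothetical.

```python
# Simplified model only; all names are hypothetical, not from the specification.
IDLE, A_BUSY, B_BUSY = "00", "10", "11"

def run_split():
    """Read and write are separate steps; B interrupts between A's read and write."""
    memory = {"mode": IDLE}
    started = []
    seen_by_a = memory["mode"]        # A reads "00"
    # Interruption: B reads and writes before A has written.
    seen_by_b = memory["mode"]        # B also reads "00"
    if seen_by_b == IDLE:
        memory["mode"] = B_BUSY
        started.append("B")
    # Interruption completes; A acts on the stale value it read earlier.
    if seen_by_a == IDLE:
        memory["mode"] = A_BUSY
        started.append("A")
    return started, memory["mode"]

def try_start(memory, busy_code):
    """Models one uninterruptible read-modify-write on the mode data."""
    if memory["mode"] == IDLE:
        memory["mode"] = busy_code
        return True
    return False

def run_single():
    """Each apparatus uses the single read-modify-write step."""
    memory = {"mode": IDLE}
    started = []
    if try_start(memory, A_BUSY):
        started.append("A")
    if try_start(memory, B_BUSY):     # B sees "10" and must wait
        started.append("B")
    return started, memory["mode"]

print(run_split())   # (['B', 'A'], '10')  both apparatus started
print(run_single())  # (['A'], '10')       only A started
```

In the split case both apparatus end up in the processing mode while the memory records only “10”, which is the adverse consequence described above.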
The read-modify-write instruction includes, as discussed above, bit manipulation, such as bit set, bit clear, bit insertion, bit logical operations and bit inversion; bit field manipulation, such as bit field insertion and bit field replacement; shift operations, such as arithmetic shift and logical shift; and add/subtract operations, such as increment/decrement.
Installation of the read-modify-write instruction in microprocessors, microcomputers and DSPs, etc., requires processing time of at least one machine cycle for each of instruction fetch, instruction decode, memory read, data modify and memory write.
Illustrated in FIG. 26 is instruction execution in a first known CISC (Complex Instruction Set Computer) processor with no pipelined processing.
A read-modify-write instruction (INSTRUCTION 2 in FIG. 26) requires an instruction-execution time of at least 5 machine cycles for instruction fetch “F”, instruction decode “D”, memory read “rd”, data modify “mo” and memory write “wr”.
Illustrated in FIG. 27 is instruction execution in a second known CISC processor with pipelined processing.
Some stages of a read-modify-write instruction can be executed in parallel with the preceding and succeeding instructions, although the instruction still requires an instruction-execution time of 5 machine cycles for instruction fetch “F”, instruction decode “D”, memory read “rd”, data modify “mo” and memory write “wr”.
In detail, the F- and D-stages are executed while the “E” and “W” stages of the preceding instruction (INSTRUCTION 1) are being executed, and the mo- and wr-stages are executed while the “F” and “D” stages of the succeeding instruction (INSTRUCTION 3) are being executed. Therefore, the read-modify-write instruction appears to run for 3 machine cycles.
FIG. 28 illustrates instruction execution in a third known RISC (Reduced Instruction Set Computer) processor with 5-stage pipelined processing.
The pipeline has 5 stages: instruction fetch “F”, instruction decode “D”, computation execute “E”, memory access “M” and register write “W”.
Two types of processing are performed at the memory-access “M” stage: memory read at the initial M stage and memory write at the next M stage. Modify processing is performed when the pipeline processing returns to the instruction-decode “D” and computation-execute “E” stages.
The pipeline processing is performed in the order of instruction fetch “F”, instruction decode “D”, computation execute “E”, memory access “M” with memory read, instruction decode “D”, computation execute “E” with data modify, memory access “M” with memory write and register write “W”.
While the pipeline processing is being returned, an instruction following the read-modify-write instruction is stalled before it enters the instruction-decode stage. The read-modify-write instruction illustrated in FIG. 28 thus appears to take 4 machine cycles. The third known processor consumes 2 stages for data modify in returning the pipeline processing.
FIG. 29 illustrates read-modify-write processing for a fourth known processor that corresponds to the third known processor (FIG. 28) but uses a memory with relatively slow read-write processing (such as a memory requiring 2 machine cycles for each of read and write).
The read-modify-write instruction illustrated in FIG. 29 requires 5 machine cycles because a memory-access “M” stage requires at least 2 machine cycles.
The slower the read-write processing of the memory used, the larger the number of machine cycles needed for execution of the read-modify-write instruction, that is, machine cycles for the read-modify-write instruction = machine cycles for memory read + machine cycles for data modify + machine cycles for memory write.
As discussed above, the read-modify-write instructions in the first to fourth known processors are relatively slow instructions requiring at least 3 to 5 machine cycles.
The read-modify-write instruction requires longer execution time for slower processing-speed memories.
Read-modify-write instructions, such as bit manipulation, usually occupy 10% to 15% of programs installed in electrical household appliances, such as air conditioners and digital camcorders, and AV (Audio-Visual) equipment, such as CD players, DVD players, TVs and VCRs. Instructions that are slow in execution but often used will degrade processor performance.
A read-modify-write-controlled system disclosed in Japanese Unexamined Patent Publication No. 11-184761 has read-modify-write functions. The read-modify-write processing is performed simultaneously, or in parallel, on several memory banks. This is different from the present invention, in which CPU instructions are executed in parallel with the preceding read-modify-write processing for higher throughput.
Recent program-implemented equipment has become complex in processing. Moreover, there are demands for higher processing speed and/or lower power consumption. Higher system performance, that is, a smaller number of clocks per instruction (abbreviated as CPI hereinafter) for each CPU instruction, is strongly desired.
Pipeline processing has been advanced to achieve a smaller CPI and meet these demands; it is, however, obstructed by read-modify-write instructions such as bit manipulation, which take 3 machine cycles or more in CPI.
Moreover, slow access-time memories affect CPI in read-modify-write operations. For example, a memory with a 2-machine-cycle access time requires 5 (=2+2+1) machine cycles in CPI for read-modify-write instructions.
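The arithmetic in the example above can be written out as a minimal sketch. It assumes that the three terms in “2+2+1” are the machine cycles for memory read, memory write and data modify, respectively, and that data modify takes one machine cycle; the function name `rmw_cycles` is hypothetical.

```python
# Illustrative arithmetic only; the name and the one-cycle modify step
# are assumptions, not taken from the specification.

def rmw_cycles(read_cycles: int, write_cycles: int, modify_cycles: int = 1) -> int:
    """Apparent machine cycles of one read-modify-write instruction."""
    return read_cycles + modify_cycles + write_cycles

for access in (1, 2, 3):
    # The same access time is assumed for both read and write.
    print(access, rmw_cycles(access, access))
# 1 3
# 2 5
# 3 7
```

Under these assumptions each extra machine cycle of memory access time adds two machine cycles to the instruction, one for the read and one for the write.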
As discussed above, the known processors have a long apparent execution time for read-modify-write instructions. Program-implemented equipment using many read-modify-write instructions thus has a longer apparent execution time, which adversely affects performance, particularly that of CPU-embedded systems.