FIG. 1 (Prior Art) is a simplified diagram of a portion of a commercially available 8-bit microcontroller integrated circuit. The microcontroller integrated circuit includes an amount of program memory 1, a processor portion 2, and an amount of data memory 3. Processor portion 2 includes a program counter 4, a fetch unit 5, an instruction buffer 6, an execution unit 7, and an 8-bit arithmetic logic unit (ALU) 8. ALU 8 receives two 8-bit operand values, performs an arithmetic or logic operation on the two 8-bit operand values, and outputs an 8-bit result. The first operand may be called the destination operand. The second operand may be called the source operand. An example of such an 8-bit operation is the addition of two 8-bit values.
In addition to 8-bit operations, it is often desired to perform arithmetic and logic operations on values having more than 8 bits of precision. FIG. 2 (Prior Art) is a diagram that illustrates four instructions that might be executed by the microcontroller of FIG. 1 in order to perform a 32-bit ADD operation. The instructions of FIG. 2 are stored in consecutive 8-bit memory locations in program memory. The first instruction occupies two 8-bit memory locations (two bytes) in program memory. The first byte stores the equivalent of a hexadecimal “02”. The second byte stores the equivalent of a hexadecimal “37”. Each of the other three instructions likewise occupies two bytes. It is therefore seen that the four instructions of FIG. 2 occupy a total of eight bytes of program memory.
To perform the 32-bit ADD operation in this example, fetch unit 5 fetches the first byte (02) of the first instruction (the ADD instruction) out of program memory 1, across the eight-bit data bus 9, and into the FetchData inputs of fetch unit 5. Fetch unit 5 latches the first byte into the Opcode portion 10 of the instruction buffer 6. This opcode byte is then used to determine how many additional bytes must be fetched in order to complete the fetching of the instruction. Depending on the type of instruction, from zero to three more bytes may need to be fetched. Fetch unit 5 fetches these bytes and places them into the portions of instruction buffer 6 labeled P1, P2 and P3.
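The fetch sequence described above can be sketched as follows. This is a minimal illustration, not the circuit of FIG. 1. The table `EXTRA_BYTES` and the function `fetch_instruction` are hypothetical names, and the only table entry taken from the text is that opcode 02 is followed by one more byte.

```python
# Hypothetical lookup: number of bytes that follow each opcode (0 to 3).
# Only the entry for 0x02 (ADD, one operand byte) comes from the text.
EXTRA_BYTES = {0x02: 1}


def fetch_instruction(program_memory, pc):
    """Fetch one instruction starting at pc, one byte at a time.

    Returns (buffer, valid, next_pc). buffer models the Opcode, P1, P2
    and P3 portions of the instruction buffer; unused portions stay None.
    """
    opcode = program_memory[pc]
    buffer = {"Opcode": opcode, "P1": None, "P2": None, "P3": None}
    extra = EXTRA_BYTES[opcode]  # the opcode decides how many bytes follow
    for i in range(extra):
        buffer["P%d" % (i + 1)] = program_memory[pc + 1 + i]
    valid = True  # models the VALID signal: the whole instruction is buffered
    return buffer, valid, pc + 1 + extra


buffer, valid, next_pc = fetch_instruction(bytes([0x02, 0x37]), 0)
```

In this sketch each list access stands in for one transfer across the eight-bit data bus, so a two-byte instruction costs two fetches.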
In the present example, the first byte of the first instruction indicates that the instruction is the ADD instruction (add instruction where the carry-in is ignored). The next byte (37) fetched by fetch unit 5 is therefore loaded into P1. When the second byte is present in P1, then fetch unit 5 sends a signal VALID to execution unit 7 indicating that the complete instruction is present in instruction buffer 6. Execution unit 7 supplies operands as indicated by the opcode to ALU 8. In the present example of an ADD instruction, the first nibble of the second byte is a “3”. This indicates that the content of register 3 (R3) in data memory 3 is to be read out of data memory 3 by execution unit 7 and is to be supplied as DOUT[0:7] to ALU 8. The “D” in DOUT indicates the “destination” operand. The second nibble of the second byte is a “7”. This indicates that the content of register 7 (R7) in data memory 3 is to be read out of data memory 3 by execution unit 7 and is to be supplied as SOUT[0:7] to ALU 8. The “S” in SOUT indicates the “source” operand. ALU 8 outputs the sum of the two operand values onto the 8-bit output of ALU 8 as ALUOUT[0:7]. This output is then written back into the destination operand's address DIN in data memory 3. The content of R3 is therefore overwritten with the result of the 8-bit addition. A carry bit value remains in ALU 8.
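The operand decoding and the 8-bit addition described above can be sketched as follows. The function names `decode_operands` and `add8` and the register contents are assumptions chosen for illustration; only the nibble layout of the second byte (37) and the write-back to the destination register come from the text.

```python
def decode_operands(p1):
    """Split the operand byte into (destination, source) register numbers.

    The first (upper) nibble selects the destination register and the
    second (lower) nibble selects the source register.
    """
    return (p1 >> 4) & 0xF, p1 & 0xF


def add8(dout, sout, carry_in=0):
    """Model the 8-bit ALU: return (8-bit sum, carry out)."""
    total = dout + sout + carry_in
    return total & 0xFF, total >> 8


# Illustrative register file; the values in R3 and R7 are made up.
regs = [0] * 16
regs[3] = 0xF0
regs[7] = 0x20

dst, src = decode_operands(0x37)            # destination R3, source R7
regs[dst], carry = add8(regs[dst], regs[src])  # ADD ignores any carry in
```

With these example values the sum 0x110 does not fit in eight bits, so R3 receives the low byte and the carry bit is set for a following instruction to use.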
In similar fashion, fetch unit 5 fetches the next byte out of the program memory and places it into the opcode portion 10 of instruction buffer 6. As seen from FIG. 2, the instruction is an ADC instruction (add with carry). Fetch unit 5 therefore reads the next byte (26) out of program memory and places that byte into P1. Because the first nibble of the second byte is a “2”, execution unit 7 reads the content of R2 out of data memory 3 and supplies that value to ALU 8 as the first operand on DOUT[0:7]. Because the second nibble of the second byte is a “6”, execution unit 7 reads the content of R6 out of data memory 3 and supplies that value to ALU 8 as the second operand on SOUT[0:7]. The instruction is an ADD with carry (ADC), so the carry bit value from the previous instruction that was stored in ALU 8 is used as the carry-in value for the second instruction. ALU 8 outputs the resulting 8-bit sum onto ALUOUT[0:7]. The carry-out bit is stored in ALU 8 as before. The 8-bit sum is written back into the destination operand's address DIN in data memory 3. The content of R2 is therefore overwritten.
In similar fashion, the third ADC (add with carry) instruction and the fourth ADC (add with carry) instruction are performed. At the end of execution of the fourth instruction, the 32-bit result of the 32-bit add operation is present in registers R3, R2, R1 and R0. The first 32-bit value added is the 32-bit value in registers R7, R6, R5 and R4. The second 32-bit value added is the 32-bit value that was originally present in registers R3, R2, R1 and R0. The carry-out value for the 32-bit add operation is present in ALU 8. This carry-out value is also made available to the execution unit 7 in a status register (not shown).
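The complete four-instruction sequence can be sketched as a small simulation. Several details here are assumptions and not taken from the text: the ADC opcode value 0x12 is hypothetical, the operand bytes 15 and 04 of the third and fourth instructions are inferred from the register pairs named above, and the helper names `run`, `load32` and `read32` are illustrative. The byte order (R3 and R7 hold the least significant bytes) follows from the first ADD operating on R3 and R7.

```python
ADD = 0x02  # opcode given in the text
ADC = 0x12  # hypothetical opcode value, assumed for illustration only

# Eight bytes of program memory: ADD R3,R7; ADC R2,R6; ADC R1,R5; ADC R0,R4.
# The operand bytes 0x15 and 0x04 are inferred, not stated in the text.
PROGRAM = bytes([ADD, 0x37, ADC, 0x26, ADC, 0x15, ADC, 0x04])


def run(program, regs):
    """Execute two-byte ADD/ADC instructions; return the final carry out."""
    carry = 0
    pc = 0
    while pc < len(program):
        opcode, p1 = program[pc], program[pc + 1]
        pc += 2
        dst, src = (p1 >> 4) & 0xF, p1 & 0xF
        carry_in = carry if opcode == ADC else 0  # ADD ignores the carry in
        total = regs[dst] + regs[src] + carry_in
        regs[dst] = total & 0xFF  # write the 8-bit sum back to the destination
        carry = total >> 8        # keep the carry for the next instruction
    return carry


def load32(regs, base, value):
    """Store value in regs[base..base+3], most significant byte first."""
    for i in range(4):
        regs[base + i] = (value >> (8 * (3 - i))) & 0xFF


def read32(regs, base):
    """Read regs[base..base+3] as one 32-bit value."""
    value = 0
    for i in range(4):
        value = (value << 8) | regs[base + i]
    return value


regs = [0] * 16
a, b = 0x12345678, 0x89ABCDEF  # arbitrary example operands
load32(regs, 0, a)             # destination operand in R0..R3
load32(regs, 4, b)             # source operand in R4..R7
carry_out = run(PROGRAM, regs)
result = read32(regs, 0)
```

The sketch makes the cost visible: one 32-bit addition takes four separate instructions and `len(PROGRAM)` is eight bytes, while the source registers R4 through R7 are left unchanged.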
In the 32-bit add example of FIG. 2, notice that eight bytes of program memory are consumed to store the instructions necessary to perform the 32-bit operation using the 8-bit ALU. In processors where execution speed on such larger 32-bit values is more important and reducing the cost of the processor is less important, wider ALUs and wider data buses are typically employed. Where processing speed of such 32-bit values is less important but where reducing the cost of the processor is more important, then typically narrower ALUs and narrower data buses are employed. When a narrower ALU and data bus are employed, however, a large amount of program memory is consumed due to the need to store multiple 8-bit instructions in order to perform a single larger 32-bit operation. Where cost of the overall microcontroller implementation is critical, it would be desirable to be able to reduce the amount of program memory consumed without providing a wider ALU and data bus. If the amount of program memory required to store a given program could be reduced, then either more program functionality could be provided in a microcontroller design having the same amount of program memory (thereby increasing functionality of the implementation) or the amount of program memory provided could be reduced (thereby reducing the cost of the implementation).
In addition to the above considerations, certain instruction sets and processor architectures have been well accepted over the years. It is desired to be able to use the experience and familiarity with these instruction sets and processors in the designs of subsequent products. A large body of software has been written for such well-accepted processors, and it is desired to maintain as much backward compatibility with the previous software and instruction sets as possible. It is not, therefore, desirable to generate a new instruction set from scratch. A solution is therefore desired that reduces the amount of program memory necessary to perform large arithmetic and logic operations in a processor having a narrower ALU and data bus, and it is desired to do this in a way that maintains a substantial degree of backward compatibility with an existing well-accepted instruction set and processor design.