The microprocessor based on the inventive architecture is designed such that the microprocessor has a power-saving fetch and decoding unit for fetching and decoding compressed program instructions. The fetch and decoding unit has a program instruction memory which receives a sequential program instruction address addressing the next program instruction memory line which is to be read, having at least one program instruction memory line which can store an indicator flag, a long program instruction index, a short program instruction and a first source register address. In addition, the fetch and decoding unit has a directory memory for long program instructions which receives the long program instruction index addressing the next directory memory line which is to be read, having at least one directory memory line which can store a long program instruction and a second source register address. The fetch and decoding unit also has a short program instruction decoding unit for decoding the short program instruction which has been read from the program instruction memory and for providing a first program instruction counter, and a long program instruction decoding unit for decoding the long program instruction which has been read from the directory memory and for providing a second program instruction counter. In addition, the microprocessor has a program instruction sequencer which generates the sequential program instruction address on the basis of the first program instruction counter and the second program instruction counter.
An ordinary microprocessor program is the cause of most access operations to the program instruction memory. Because of this frequent memory access, the program instruction fetch unit is the largest consumer of power within the microprocessor; it may account for up to one third of the microprocessor's total power consumption. Reducing the program instruction memory therefore also reduces the total power consumption of the microprocessor. Besides the integration stage of a microprocessor, its power consumption is a fundamental aspect in the development of a microprocessor architecture.
FIG. 1 shows a conventional program instruction memory PBS based on the prior art. The program instruction memory PBS has at least one program instruction memory line PBSZ. In the example shown in FIG. 1, the program instruction memory PBS has seven program instruction memory lines PBSZ which are each addressed by a sequential program instruction address PBA (PBA-1 to PBA-7), with each program instruction memory line PBSZ storing a program instruction PB (PB-1, PB-2, PB-3) in an orderly sequence with respect to the loaded microprocessor program.
A program instruction memory line PBSZ has v bits, for example v=32 bits, v=128 bits etc.
The program instruction sequence of the microprocessor program stored in the program instruction memory PBS is as follows:
1st PBSZ: PB1
2nd PBSZ: PB2
3rd PBSZ: PB2
4th PBSZ: PB3
5th PBSZ: PB1
6th PBSZ: PB3
7th PBSZ: PB2
The stored microprocessor program comprises a sequence of seven program instructions, with only three different program instructions (PB1, PB2, PB3) arising within the microprocessor program.
FIG. 2 shows, on the basis of the prior art, a possible way of reducing the memory space requirement of a microprocessor program which does not exclusively comprise different program instructions (and accordingly has a certain level of redundancy for program instructions which arise). This is done by using data compression of the microprocessor program, for example on the basis of the Huffman code, known from source coding.
For every program instruction (PB-1, PB-2, PB-3) which arises in the microprocessor program, a program instruction index PBI is generated, the program instruction index PBI being a compressed code word associated with the program instruction PB.
The program instruction index PBI has w bits, with w being significantly less than v (w<<v).
Instead of the program instruction memory PBS from FIG. 1, FIG. 2 provides a program instruction index memory PBIS and a directory memory VS.
The directory memory VS is used to store the program instructions which arise in the microprocessor program in a respective directory memory line VSZ. Accordingly, the directory memory VS in the example from FIG. 2 has three directory memory lines VSZ.
The program instruction index memory PBIS stores the sequence of the microprocessor program (cf. FIG. 1) in order, using the program instruction indices PBI.
The sequential program instruction addresses PBA from FIG. 1 and FIG. 2 correspond to one another.
If the sequential program instruction address PBA-1 points to the first program instruction index memory line PBISZ, the program instruction index PBI-1 is read, which points to the program instruction PB-1 in the directory memory VS. The program instruction index PBI-1 is subsequently read again from the program instruction index memory PBIS using the fifth sequential program instruction address PBA-5. The program instruction index PBI-1 which has been read from the fifth program instruction index memory line PBISZ also points to the first directory memory line VSZ, which stores the program instruction PB-1.
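The split into an index memory and a directory memory can be illustrated with a minimal sketch. It is not taken from the source: the function names `compress` and `fetch` are hypothetical, Python lists stand in for the memories PBIS and VS, and addresses and indices are counted from 0, so PBA-1 corresponds to address 0 and PBA-5 to address 4.

```python
# Hypothetical sketch; names and data layout are illustrative assumptions.

def compress(program):
    """Split a program into an index memory (PBIS) and a directory memory (VS)."""
    directory = []      # VS: one line per distinct program instruction
    index_of = {}       # program instruction -> program instruction index PBI
    index_memory = []   # PBIS: one index per sequential address PBA
    for instruction in program:
        if instruction not in index_of:
            index_of[instruction] = len(directory)
            directory.append(instruction)
        index_memory.append(index_of[instruction])
    return index_memory, directory


def fetch(index_memory, directory, pba):
    """Read the index stored at address pba, then the instruction it points to."""
    return directory[index_memory[pba]]


# The seven-line program of FIG. 1.
program = ["PB1", "PB2", "PB2", "PB3", "PB1", "PB3", "PB2"]
index_memory, directory = compress(program)
```

In this sketch the first and the fifth address both resolve to the same directory line, which mirrors the lookup described for PBA-1 and PBA-5.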
Storing the microprocessor program shown in FIG. 1 requires 7 (number of program instruction lines) times v bits (word length of the program instruction). If the word length of the program instruction is v=32 bits, for example, then storing the microprocessor program shown in FIG. 1 requires: 7*32 bits=224 bits.
If the same microprocessor program is stored as shown in FIG. 2 using the program instruction index memory PBIS and the directory memory VS, then 7 (number of program instruction lines) times w bits (let w=4 bits, for example) are required in order to store the program instruction indices PBI in the program instruction index memory PBIS. To store the three different program instructions PB1, PB2, PB3 in the directory memory VS, 3 (number of different program instructions PB) times v bits (word length of the program instruction) are required. Storage of the microprocessor program requires a total of: 7*4 bits+3*32 bits=124 bits.
By compressing the program instructions into program instruction indices PBI and by splitting the program instruction memory PBS into a directory memory VS and a program instruction index memory PBIS, 100 bits of memory space are saved in the example shown above.
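The arithmetic above can be reproduced with a short sketch. The function names are hypothetical and chosen only for illustration; the values v=32 and w=4 are the example values from the text.

```python
# Hypothetical helper names; v and w follow the example values in the text.

def uncompressed_bits(num_lines, v):
    """Memory needed when every line stores a full v-bit program instruction."""
    return num_lines * v


def compressed_bits(num_lines, num_distinct, v, w):
    """Memory needed for w-bit indices plus one v-bit directory line per distinct instruction."""
    return num_lines * w + num_distinct * v


plain = uncompressed_bits(7, 32)           # FIG. 1 layout
packed = compressed_bits(7, 3, 32, 4)      # FIG. 2 layout
saving = plain - packed

# Same program length, but every instruction distinct (no redundancy).
no_redundancy = compressed_bits(7, 7, 32, 4)
```

With seven distinct instructions the same formula yields 252 bits, more than the 224 bits of the uncompressed layout, so the saving depends on instructions actually repeating.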
FIG. 3 shows a known microprocessor based on a pipeline architecture.
The splitting of the program instruction memory PBS into a directory memory VS and a program instruction index memory PBIS as proposed in FIG. 2 is used in the known microprocessor MP shown in FIG. 3.
The microprocessor MP has a fetch and decoding unit HDE for fetching and decoding program instructions.
The fetch and decoding unit HDE has a program instruction fetch unit PHE and a program instruction decoding unit PDE.
The program instruction fetch unit PHE has a program instruction index memory PBIS and a directory memory VS (functionality shown in FIG. 2).
As FIG. 2 shows, the program instruction index memory PBIS in the program instruction fetch unit PHE receives from the standard processor root unit SPRE in the microprocessor MP the next sequential program instruction address PBA which is to be read.
The received sequential program instruction address PBA addresses a program instruction index memory line PBISZ which stores the program instruction index PBI which is to be read.
The program instruction index PBI which has been read is transferred to the directory memory VS.
The directory memory VS reads the program instruction PB addressed using the received program instruction index PBI and transfers the program instruction PB to the program instruction decoding unit PDE in the fetch and decoding unit HDE in the microprocessor MP. Optionally, a first source register address 1st QRA and a second source register address 2nd QRA are stored in the directory memory VS in association with the program instruction PB in addition to the program instruction PB.
The additional first source register address 1st QRA associated with the program instruction and the additional associated second source register address 2nd QRA are transferred to the register bank RB which is provided in the microprocessor MP.
In the same clock cycle, the program instruction decoding unit PDE decodes the received program instruction PB, and the register bank RB provides the first register value 1st RW, addressed using the first source register address 1st QRA, and the second register value 2nd RW, addressed using the second source register address 2nd QRA, and transfers them to the standard processor root unit SPRE on a clock-cycle-sensitive basis. Similarly, the program instruction decoding unit PDE transfers the decoded program instruction PB to the standard processor root unit SPRE.
The standard processor root unit SPRE has an operand fetch unit OHE for fetching operands in the received decoded program instruction PB, a program instruction execution unit PAE for executing the received decoded program instruction PB, and a write-back unit ZSE for writing back operation results or write-back values ZSW.
The write-back unit ZSE writes the write-back values ZSW to the register bank RB.
The program instruction executed by the program instruction execution unit PAE results in the generation of a sequential program instruction address PBA for the next program instruction PB which is to be read, and this sequential program instruction address PBA is transferred to the program instruction index memory PBIS for the purpose of reading the next program instruction index PBI.
FIG. 4 shows a pipeline diagram of the known microprocessor based on the pipeline architecture.
The known microprocessor based on the pipeline architecture processes the program instructions PB according to the following sequence: the program instruction PB-1 is processed by the microprocessor MP by reading the program instruction index PBI, which is associated with the program instruction PB-1, from the program instruction index memory PBIS in clock cycle T1.
In clock cycle T2, the directory memory VS receives the program instruction index PBI associated with the program instruction PB-1 and provides the program instruction PB-1 in accordance with FIG. 2.
In clock cycle T3, the program instruction PB-1 is received by the program instruction decoding unit PDE and the received program instruction PB-1 is decoded. In addition, the decoded program instruction PB-1 is provided.
In clock cycle T4, the decoded program instruction PB-1 is received by the standard processor root unit SPRE and is processed further in accordance with the prior art.
Following the pipeline diagram shown in FIG. 4, during clock cycle T4 the program instruction PB-2 is in the program instruction decoding unit PDE, the program instruction PB-3 is in the directory memory VS and the program instruction PB-4 is in the program instruction index memory PBIS.
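The stage occupancy described for FIG. 4 can be sketched as follows. This is an illustrative model only: it assumes that one program instruction enters the program instruction index memory PBIS per clock cycle and that no stalls occur, and the names `STAGES` and `occupancy` are hypothetical.

```python
# Hypothetical model of the four stages named in the text, in pipeline order.
STAGES = ["PBIS", "VS", "PDE", "SPRE"]


def occupancy(cycle, num_instructions):
    """Return {stage: instruction} for a 1-based clock cycle.

    Assumes the n-th instruction enters PBIS in cycle n and advances one
    stage per cycle without stalls.
    """
    result = {}
    for depth, stage in enumerate(STAGES):
        n = cycle - depth
        if 1 <= n <= num_instructions:
            result[stage] = f"PB-{n}"
    return result


t4 = occupancy(4, 7)
```

Under these assumptions, `t4` places PB-1 in SPRE, PB-2 in PDE, PB-3 in VS and PB-4 in PBIS, matching the description of clock cycle T4.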
FIGS. 2 to 4 show a known microprocessor architecture which takes advantage of the data compression of a microprocessor program.
Within this architecture there is an index for every program instruction. If a program instruction occurs in a microprocessor program only rarely or even just once, this architecture is not suitable: the added pipeline stage for the directory memory VS makes the architecture in FIGS. 2 to 4 more complex than an architecture without data compression, and this higher level of complexity would not be justified.
A further drawback of the prior art is that particularly skip instructions (branches) cause high latencies. If the program instruction PB-1 is a skip instruction of this type, for example, then the three instructions PB-2, PB-3 and PB-4 have been loaded into the pipeline of the microprocessor MP. The pipeline of the microprocessor MP needs to be cleared of the program instructions PB-2, PB-3 and PB-4.
In the prior-art microprocessor architecture shown above, a skip instruction thus causes a latency of three clock cycles for the pipeline of the microprocessor MP. For microprocessor programs with low redundancy, however, the additional pipeline stage required by this prior art is not justified.
FIG. 5 shows a known program instruction index memory PBIS for the extended storage of program instruction indices PBI and short program instructions K.
FIG. 5 shows a further development of the prior art shown in FIGS. 2 to 4, since besides the program instruction indices PBI a program instruction index memory line PBISZ in the program instruction index memory PBIS can also store short instructions K.
It is appropriate for short instructions K which arise only rarely in the microprocessor program not to be indexed using a program instruction index PBI.
FIG. 5 shows the known program instruction index storage with extended storage of program instruction indices PBI and short program instructions K, the program instruction index memory PBIS having at least one program instruction index memory line PBISZ. A program instruction index memory line PBISZ has x bits.
One bit of the x bits in the program instruction index memory line PBISZ is used to store an indicator flag AF.
The indicator flag AF has the function of specifying whether the corresponding program instruction index memory line PBISZ stores a short instruction K or a program instruction index PBI.
In accordance with the present example shown in FIG. 5, a program instruction index memory line PBISZ stores a short instruction K if the indicator flag AF has been set to zero.
By contrast, a program instruction index memory line PBISZ stores a program instruction index PBI if the indicator flag AF has been set to one.
The remaining x-1 bits of a program instruction index memory line PBISZ are used to store either a program instruction index PBI or a short instruction K. This yields the definition of a short instruction K: a short instruction K is a program instruction PB which has no more than x-1 bits.
Accordingly, all program instructions which have at least x bits are “long program instructions” L and accordingly need to be indexed using a program instruction index PBI.
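The line format of FIG. 5 can be illustrated with a small sketch. The source does not state which of the x bits holds the indicator flag AF; placing it in the most significant bit is an assumption made here, and the function names are hypothetical.

```python
# Hypothetical sketch; AF in the most significant bit is an assumption.

def encode_line(payload, is_index, x):
    """Pack AF and an (x-1)-bit payload into one x-bit memory line.

    AF = 1 marks a program instruction index PBI, AF = 0 a short instruction K.
    """
    if not 0 <= payload < (1 << (x - 1)):
        raise ValueError("payload does not fit in x-1 bits")
    af = 1 if is_index else 0
    return (af << (x - 1)) | payload


def decode_line(line, x):
    """Split an x-bit memory line into its kind ('PBI' or 'K') and payload."""
    af = line >> (x - 1)
    payload = line & ((1 << (x - 1)) - 1)
    return ("PBI" if af == 1 else "K"), payload


index_line = encode_line(5, True, 8)      # index 5, flagged as PBI
short_line = encode_line(127, False, 8)   # largest 7-bit short instruction
```

A payload that needs x bits or more is rejected by `encode_line`, which corresponds to a long program instruction that has to be indexed instead of being stored directly.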
FIG. 6 shows a known microprocessor MP which uses the known program instruction index memory PBIS for the extended storage of program instruction indices PBI and short program instructions K.
The microprocessor MP shown in FIG. 6 has a fetch and decoding unit HDE.
The fetch and decoding unit HDE has the program instruction index memory PBIS, the directory memory VS, a first delay element 1st VG, a second delay element 2nd VG, a first multiplexer 1st MUX, a second multiplexer 2nd MUX and a program instruction decoding unit PDE.
The microprocessor MP also has a standard processor root unit SPRE which provides a sequential program instruction address PBA for the next program instruction which is to be read in the fetch and decoding unit HDE.
The program instruction index memory PBIS in the fetch and decoding unit HDE receives the sequential program instruction address PBA of the next program instruction PB which is to be read.
As FIG. 5 shows, that program instruction index memory line PBISZ in the program instruction index memory PBIS which corresponds to the sequential program instruction address PBA is read.
If the indicator flag AF in the program instruction index memory line PBISZ which is to be read has been set to one, then the program instruction index memory line PBISZ which is to be read stores a program instruction index which addresses a long program instruction L which is stored in the directory memory VS. The program instruction index PBI which has been read is transferred to the directory memory VS (cf. FIG. 5).
The directory memory VS ascertains the long instruction L corresponding to the program instruction index PBI and transfers it to the first multiplexer 1st MUX and to the second multiplexer 2nd MUX.
If the indicator flag AF in the corresponding program instruction index memory line PBISZ which is to be read has been set to zero, however, the corresponding program instruction index memory line PBISZ stores a short instruction K which is transferred from the program instruction index memory PBIS to the first delay element 1st VG.
The first delay element 1st VG delays the short program instruction K received by one clock cycle T and transfers the delayed short program instruction K to the first multiplexer 1st MUX and to the second multiplexer 2nd MUX.
The indicator flag AF in the respective program instruction index memory line PBISZ which is to be read in the program instruction index memory PBIS is transferred to the second delay element 2nd VG.
The second delay element 2nd VG delays the received indicator flag AF by one clock cycle T and uses the delayed indicator flag AF to control the multiplexers 1st MUX and 2nd MUX.
The first multiplexer 1st MUX has the function of respectively providing the received program instruction, either the short program instruction K or the long program instruction L, to the program instruction decoding unit PDE on a clock-cycle-sensitive basis.
During the same clock cycle, the second multiplexer 2nd MUX has the function of transferring the first source register address 1st QRA and the second source register address 2nd QRA which are associated with the respective received program instruction, either the short program instruction K or the long program instruction L, to the register bank RB.
Hence, in the same clock cycle, the program instruction decoding unit PDE decodes the respective program instruction, either the short program instruction K or the long program instruction L, and the source register addresses associated with the respective program instruction, the first source register address 1st QRA and the second source register address 2nd QRA, are received by the register bank RB, with the first source register address 1st QRA addressing a first register value 1st RW and the second source register address 2nd QRA addressing a second register value 2nd RW.
On a clock-cycle-sensitive basis, the program instruction decoding unit PDE provides either the decoded short program instruction K or the decoded long program instruction L for the standard processor root unit SPRE.
During the same clock cycle, a forwarding device WE which is provided within the microprocessor MP supplies the standard processor root unit SPRE with the register values, the first register value 1st RW and the second register value 2nd RW, which are associated with the program instruction being transferred to the standard processor root unit SPRE from the program instruction decoding unit PDE.
The forwarding device WE decides which register values are to be forwarded on the basis of received register value requests, namely a first register value request 1st RWA, received from the operand fetch unit OHE, and a second register value request 2nd RWA, received from the program instruction execution unit PAE, and on the basis of the received register values, the first register value 1st RW and the second register value 2nd RW.
On the basis of the prior art, the standard processor root unit SPRE processes the received program instructions and the associated first register values and second register values 1st RW and 2nd RW using the operand fetch unit OHE, the program instruction execution unit PAE and the write-back unit ZSE.
The standard processor root unit SPRE is prompted by the processed program instructions to generate a sequential program instruction address PBA for the next program instruction which is to be read and provides the sequential program instruction address PBA for the fetch and decoding unit HDE.
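The interplay of the delay elements and the first multiplexer in FIG. 6 can be sketched as a cycle-by-cycle model. This is a simplified illustration, not the circuit itself: it assumes that the directory memory VS delivers its long instruction one clock cycle after the index is read, it ignores the second multiplexer and the source register addresses, and the function name `fetch_path` and the tuple representation of memory lines are hypothetical.

```python
# Hypothetical cycle-level sketch of the FIG. 6 fetch path (first multiplexer only).

def fetch_path(lines, directory):
    """Return what the first multiplexer passes to PDE in each clock cycle.

    lines: (af, payload) tuples read from PBIS, one per clock cycle.
    directory: long instructions L, addressed by program instruction index.
    """
    delayed_af = None       # output of the second delay element 2nd VG
    delayed_short = None    # output of the first delay element 1st VG
    directory_out = None    # long instruction provided by VS (assumed 1-cycle read)
    outputs = []
    for af, payload in list(lines) + [(None, None)]:  # extra cycle drains the path
        # The multiplexer is controlled by the AF read one cycle earlier.
        if delayed_af == 1:
            outputs.append(directory_out)
        elif delayed_af == 0:
            outputs.append(delayed_short)
        else:
            outputs.append(None)  # nothing to decode in this cycle
        # Values read in this cycle become visible in the next cycle.
        delayed_af = af
        delayed_short = payload if af == 0 else None
        directory_out = directory[payload] if af == 1 else None
    return outputs


stream = fetch_path([(1, 0), (0, "K-a"), (1, 1)], ["L-1", "L-2"])
```

In this model both short and long instructions reach the decoding unit one clock cycle after the PBIS read, which is the purpose of delaying the short instruction K and the indicator flag AF.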
A drawback of this prior art is that only short instructions which do not access a source register address can be stored without being indexed. Even short program instructions K which require only a small amount of memory within the program instruction index memory PBIS but access a source register address, the first source register address 1st QRA, need two memory access operations: one to the program instruction index memory PBIS and one to the directory memory VS. In addition, skip instructions cause long latencies.
The aforementioned problems result in a higher level of power consumption by the microprocessor and in considerable time expenditure, because the number of memory access operations has not been reduced.
In addition, it is not out of the ordinary for a program instruction index memory line PBISZ to be not completely used for storing a program instruction index PBI or a short program instruction K, i.e. for a particular number of the x bits in the program instruction index memory line to remain unused. These unutilized resources are a drawback of the prior art.