Data processing systems including a processor, a data memory and a program memory have for the most part an architecture complying with one of two main architecture models.
The first architecture model is known as “Harvard”. The processing systems complying with this architecture generally have a first bus for data exchange between the processor and a program memory and a second bus, different from the first one, for the exchange of data between the processor and a data memory.
The second architecture model, with which the data processing system of the invention complies, is known as “von Neumann”. In accordance with this architecture, a common bus is shared by the data memory and the program memory for the exchange of information with the processor. This means that program (code) instructions of the program memory and processing data of the data memory cannot be conveyed concomitantly by the common bus.
Whatever the architecture adopted for the data processing system, the performance of the system, in terms of processing speed, can generally be improved by increasing the operating frequency of the processor.
The data processing speed is, however, not limited solely by the operating frequency of the processor but also by the time necessary for reading the instructions and the data necessary for the processing in the program or data memories. The time necessary for reading or writing a data item or an instruction in a memory can be reduced by increasing the electrical voltage controlling the memories.
The increase in the operating frequency of the processor and the voltage of the memories results in an increased electric power consumption and a higher heat dissipation. These effects, possibly acceptable for fixed installations, are particularly harmful for embedded applications and in particular for portable equipment supplied with power by battery or accumulator.
In order to reduce slowness in processing related to the memory reading (or writing) time, also known as access time, particular types of memory have been designed. These memories are referred to in the remainder of the text as rapid-access memories. These are, for example, memories of the “burst” type and memories of the “page” type. With these memories, when a series of data or instructions is to be read, the reading of the first data item or instruction in the series takes place according to a slow-access mode, referred to as “initial access”, and the following data or instructions are read in rapid-access mode with reduced reading time.
By way of illustration, for a memory of the burst type, the initial access, which is not sequential, requires an access time of around 65 nsec, whereas the subsequent rapid accesses, which are sequential, require only an individual access time of 18 nsec.
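By way of a purely illustrative sketch, the access times quoted above can be turned into a small timing model. The function name `burst_read_time` and the constants are hypothetical names introduced here; only the 65 nsec and 18 nsec figures come from the text. The sketch assumes that all the words of the series are read in a single burst.

```python
INITIAL_NS = 65  # non-sequential (initial) access time, from the text
RAPID_NS = 18    # sequential (rapid) access time, from the text


def burst_read_time(n_words: int) -> int:
    """Time in nsec to read n_words consecutive words in one burst.

    The first word costs one initial access; every following word
    costs one rapid access.
    """
    if n_words <= 0:
        return 0
    return INITIAL_NS + (n_words - 1) * RAPID_NS


print(burst_read_time(1))  # 65
print(burst_read_time(8))  # 65 + 7 * 18 = 191
```

Under this model the benefit of the rapid-access mode grows with the length of the series read without interruption.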
In an architecture of the “von Neumann” type as mentioned above, the processor reads (or possibly writes) data or instructions selectively or alternately in one of the data and program memories, and then in the other. In the remainder of the text, the state of the memory in communication with the processor is referred to as “active”. Its state is termed “inactive” when the memory is not in communication with the processor. Each change of a memory from the inactive state to the active state results in a first slow access: this is the initial access. As indicated above, the subsequent data read in the same memory before it returns to the inactive state are obtained in rapid-access mode. Subsequent data means data which are stored following a first data item in a burst memory or which are stored in the same page of a page memory.
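The rule just described can be sketched as follows. This is a minimal model, not a description of any real memory controller: the function `classify_accesses` and the example sequence are hypothetical, and the sketch assumes that every access to the memory that is already active is sequential (it ignores address jumps and any limit on the length of a rapid-access series).

```python
def classify_accesses(accesses):
    """Label each bus access 'NS' (initial, slow) or 'S' (rapid).

    accesses: sequence of memory names, e.g. 'program' or 'data',
    in the order in which the processor addresses them on the common bus.
    """
    labels = []
    active = None  # memory currently in the "active" state, if any
    for memory in accesses:
        # An access is rapid only if this memory was already active.
        labels.append("S" if memory == active else "NS")
        active = memory  # the other memory becomes inactive
    return labels


# Hypothetical access sequence: program reads interleaved with data reads.
sequence = ["program", "program", "program", "data",
            "program", "program", "data", "program", "program"]
print(" ".join(classify_accesses(sequence)))
# NS S S NS NS S NS NS S
```

In this sketch each isolated access to the data memory produces two slow accesses: one for the data memory itself and one for the program memory when it becomes active again.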
An illustration is given below of the functioning of a particular processor in a “von Neumann” architecture. This is a reduced instruction set (RISC) processor of the ARM7 type. This processor is capable of performing a certain number of tasks, among which there are in particular: data movement; flow control (in the execution of a program); arithmetic operations (addition, subtraction); and logic operations (AND, OR, NAND, NOR).
The tasks are executed mainly in three steps corresponding to the three execution levels of the processor (pipeline). These three steps are the reading of an instruction, the decoding thereof, and the actual execution thereof. The steps can be accompanied by data reading or writing.
Table I below summarizes these steps for a series of tasks to be performed, given purely by way of example. The table indicates a succession of tasks and the steps of their execution. The steps are designated “F” (fetch) for the reading of an instruction in the program memory, “D” for the decoding of an instruction and “E” for its execution; “A” designates the reading of data in the data memory. The boxes in the table marked with an “X” correspond to a wait related to the reading of a data item in the data memory.
TABLE I
No.  Task                  Steps
1    Data movement         F D E A
2    Logic operation       F D X E
3    Data movement         F X D E A
4    Arithmetic operation  F D X E
5    Flow control          F X D E
6    Data movement         F D X X
7    Arithmetic operation  F X X
8    Data movement         F D E
9    Arithmetic operation  F D
10   Data movement         F
Access type by cycle: NS S S NS NS S NS NS S NS NS S
Table I, whose chronological reading goes from left to right, also indicates, in its last line, the sequential (S) or non-sequential (NS) character of the steps, in the case where the memories are of the burst type.
As indicated above, the “F” boxes in Table I correspond to a reading in the program memory while the “A” boxes correspond to a reading in the data memory. Thus, during each reading “A” in the data memory, the program memory goes into the so-called inactive state, so that the following step “F” is performed after an initial non-sequential access of longer duration (65 nsec). The next reading “A” in the data memory is also performed after an initial non-sequential access, since a reading “F” in the program memory has occurred in the meantime. This appears in particular on lines 1 to 5 of Table I. In short, each reading of a data item in the data memory results in two non-sequential initial accesses and in the loss of one processing cycle. In Table I each processing cycle corresponds to one step, that is to say one box in the direction of the rows.
When a flow command is executed, which corresponds to the last box on line 5 in the table, the reading of the following instruction in the program memory takes place at an address which does not follow the addresses of the instructions previously read in this same memory. This therefore entails a non-sequential access of the initial type to the program memory. In addition, the decoding and the execution of the program instructions that were read at the addresses following the previous instructions, before the execution of the flow command, must be inhibited or are at the very least unnecessary, since they do not take the flow command into account. These steps are also marked with an “X” in lines 6 and 7 of the table.
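The effect of a flow command on the program memory can be illustrated by a small sketch in which an instruction read is rapid only when its address immediately follows that of the previous read. The function `classify_fetches`, the word-addressed model and the example addresses are assumptions made for the illustration; they do not come from the text.

```python
def classify_fetches(addresses):
    """Label each instruction read 'S' if its address directly follows
    the previous one, and 'NS' otherwise (first read or address jump)."""
    labels = []
    previous = None
    for address in addresses:
        sequential = previous is not None and address == previous + 1
        labels.append("S" if sequential else "NS")
        previous = address
    return labels


# Hypothetical word addresses: a flow command redirects reading from 102 to 200.
print(classify_fetches([100, 101, 102, 200, 201]))
# ['NS', 'S', 'S', 'NS', 'S']
```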
Finally, it may be noted that, for memories of the burst type, there is a maximum length of words able to be read successively with rapid access (18 nsec). At the end of this number, which is for example 32, a new slower initial access (65 nsec) must be effected.
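The limit on the length of a rapid-access series can be added to a simple timing model, sketched below. The function `sequential_read_time` and its parameter names are hypothetical; the default values (32 words, 65 nsec, 18 nsec) are the example figures given in the text, and the sketch assumes that a new initial access is needed exactly once every `max_burst` words.

```python
def sequential_read_time(n_words, max_burst=32, initial_ns=65, rapid_ns=18):
    """Time in nsec to read n_words consecutive words when a new initial
    access is required after every max_burst words."""
    if n_words <= 0:
        return 0
    bursts = -(-n_words // max_burst)  # ceiling division: number of bursts
    # One initial access per burst; every other word is a rapid access.
    return bursts * initial_ns + (n_words - bursts) * rapid_ns


print(sequential_read_time(32))  # 65 + 31 * 18 = 623
print(sequential_read_time(33))  # 2 * 65 + 31 * 18 = 688
```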
It is possible to calculate the average time necessary for executing a program involving 100 “F” steps, that is to say the reading of 100 instructions. This calculation is based on the functioning indicated by Table I and on the data in Table II. Table II indicates the statistical proportion of the various tasks mentioned above in the execution of a program.
TABLE II
Data movement         43%
Flow control          23%
Arithmetic operation  15%
Comparison            13%
Logic operations       5%
Others                 1%
Considering the above data, the reading of 100 program instructions requires 143 cycles, that is to say 143 execution steps. Among these, there are 43 non-sequential (initial) accesses to the data memory for the data movements, 43 subsequent non-sequential (initial) accesses to the program memory following these data movements, 23 non-sequential (initial) accesses following flow commands, and 34 sequential (rapid) accesses for the other instructions.
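The cycle count can be reproduced with the short sketch below, which simply follows the accounting of the text: out of 100 instructions read, each data movement adds one data memory access and makes the next program memory access non-sequential, and each flow command makes the next program memory access non-sequential. The variable names are introduced here for the illustration only.

```python
# Proportions of Table II, applied to 100 instructions read.
shares = {
    "data movement": 43,
    "flow control": 23,
    "arithmetic operation": 15,
    "comparison": 13,
    "logic operations": 5,
    "others": 1,
}

fetches = sum(shares.values())          # 100 instruction reads ("F" steps)
data_reads = shares["data movement"]    # 43 initial accesses to the data memory
refetch_after_data = data_reads         # 43 initial accesses back in the program memory
after_flow = shares["flow control"]     # 23 initial accesses after a flow command

total_cycles = fetches + data_reads                              # 143
non_sequential = data_reads + refetch_after_data + after_flow    # 109
sequential = total_cycles - non_sequential                       # 34

print(total_cycles, non_sequential, sequential)  # 143 109 34
```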
Considering also that the frequency of the processor is sufficient not to slow down the access time to the memory, and that the access times are respectively 65 nsec for the initial non-sequential accesses and 18 nsec for the rapid sequential accesses, the total duration of the execution of the 100 instructions is 7697 nsec.
This calculation corresponds to the use of a memory of the burst type. By way of comparison, by replacing the burst memory with a conventional memory, that is to say a memory for which all the accesses would be slow (65 nsec), the same operations would require a total period of 9295 nsec.
Finally, the increase in the performance of a data processing system with a “von Neumann” architecture is only 17% by replacing the traditional memories with memories of the burst type. A substantially identical finding can be made by replacing the traditional memories with memories of the page type.
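The durations and the gain quoted above can be checked with the following sketch. It assumes the counts stated earlier (109 non-sequential accesses, that is 43 + 43 + 23, and 34 sequential accesses) and the burst memory access times of 65 nsec and 18 nsec; the variable names are hypothetical.

```python
INITIAL_NS = 65  # initial (non-sequential) access time
RAPID_NS = 18    # rapid (sequential) access time of the burst memory

non_sequential = 43 + 43 + 23  # 109 initial accesses
sequential = 34                # rapid accesses
total_cycles = non_sequential + sequential  # 143

# Burst memory: only the initial accesses are slow.
burst_total_ns = non_sequential * INITIAL_NS + sequential * RAPID_NS
# Conventional memory: every access is slow.
conventional_total_ns = total_cycles * INITIAL_NS

gain = (conventional_total_ns - burst_total_ns) / conventional_total_ns

print(burst_total_ns)         # 7697
print(conventional_total_ns)  # 9295
print(f"{gain:.1%}")          # 17.2%
```

The gain remains modest because 109 of the 143 accesses stay slow: the alternation between the two memories on the common bus prevents most accesses from benefiting from the rapid mode.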
To supplement the disclosure of the prior art, reference can be made to documents (1), (2) and (3), whose references are given at the end of the description. These documents concern the von Neumann architecture and memories of the burst and page types.