The present invention relates to a method of verifying consistency of codes for an embedded system.
The invention relates more particularly, but not exclusively, to the field of applications in interpreted language of the bytecode (pseudo-code) type that are loaded in a smart card.
The term “embedded system” is used below in the broad sense, in particular to designate a system designed for any portable electronic device, e.g. a smart card (chip card) whose processing and storage resources are relatively limited.
Similarly, an “interpreted language” is a non-compiled language in which execution of the lines of code requires the presence of auxiliary means making it possible to interpret the code. An example of such a language is the Java (Registered Trademark) language that is in very widespread use in application solutions for smart cards. The Java application or “applet” is interpreted by an associated Java Virtual Machine (JVM). Hardware solutions also exist, e.g. a dedicated chip, that implement the equivalent of the virtual machine. The term “virtual machine” is used below to designate both auxiliary means of the software type and also auxiliary means of the hardware type that make it possible to interpret an associated interpreted language.
Verification of pseudo-code (bytecode), for example and non-exclusively Java (Registered Trademark), is a key element in the security of Java (Registered Trademark) platforms. Such verification consists, in particular, in ensuring that a bytecode program is unadulterated (integrity verification) and that it complies with properties, e.g. the typing of the variables of the code, said bytecode program being interpreted by a virtual machine, i.e. by a machine having a stack (memory with stacking and unstacking access) and having registers (memory registers with indexed access). These verification operations are relatively complex and resource-consuming (high consumption of Random Access Memory (RAM) and of processing time).
With the development of smart cards, Java (Registered Trademark) solutions have been integrated into such smart cards. During the life of the smart card, new applications, e.g. Java (Registered Trademark) applets, are loaded into the card in order to be used. Such applets can be corrupted or adulterated and can make calls to unauthorized memory zones, thereby generating malfunctions on the virtual machine. With the appearance of smart cards and with the integration of programs into such cards, such verification has become extremely complicated in all embedded systems, in view of the lack of available resources.
It is frequent for programs of the bytecode type to implement calls to other programs or to subprograms. A distinction can be made between calls to programs sharing the same execution context as the calling program and calls to programs of other methods having a specific dedicated execution context. The invention concerns more particularly calls to programs or subprograms that have the same execution context as the calling program. The term “subprogram” is used below to define the portions of code that can be reached from other portions of code sharing the same execution context, regardless of whether the portions are called programs or called subprograms (set of lines of code in common with the calling program). Such calls can be implemented in functions of the “Goto” or “If” types, or during calls to macros.
It should be noted, by way of example, that in the Java (Registered Trademark) language a pair of instructions exists, namely Jump to Subroutine (JSR) and Return from Subroutine (RET), that implements subroutines or subprograms. FIG. 1 proposes an example of a code having a subprogram (B7 to RET) with a call to said subprogram (line 4: JSR B7). When, at the end of the subprogram, a RET instruction is executed, the virtual machine executes the bytecode following the JSR that called the subprogram. In order to store the information of the calling JSR, its address is recorded on the stack of the virtual machine, but without any typing of the information: it is a numerical value in the stack that depends on the execution flow. The problem with such recording lies in the fact that the standard verifiers work on the basis of the typings and do not have access to the numerical values proper. It is thus not possible to determine statically which code portions are calling the subprogram.
Such verification algorithms apply the unification algorithm to each bytecode. The principle of that algorithm is as follows: at a point of convergence at which the same variable arrives with two different typings (coming from two different jumps to subprograms, for example), the variable takes the typing of the first ancestor common to the two typings (the concept of common ancestor results from the principles of inheritance of object-oriented languages of the Java (Registered Trademark) type). In the event of typing incompatibility, a type called “TOP” is assigned to the variable. Then, during modeling of the bytecode, if the typing expected by the bytecode is not compatible with the typing received, the code is rejected.
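By way of illustration only, the unification rule described above can be sketched as follows. The class hierarchy (Object, Shape, Circle, Square), the string representation of typings and the names TypeUnifier and unify are assumptions introduced for this sketch; they are not taken from any particular verifier.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Minimal sketch of typing unification; the hierarchy below is hypothetical. */
final class TypeUnifier {
    /** Typing assigned when two typings are incompatible. */
    static final String TOP = "TOP";

    /** Child-to-parent links of an illustrative class hierarchy. */
    private static final Map<String, String> PARENT = new HashMap<>();
    static {
        PARENT.put("Shape", "Object");
        PARENT.put("Circle", "Shape");
        PARENT.put("Square", "Shape");
    }

    /** Returns the first common ancestor of a and b, or TOP if there is none. */
    static String unify(String a, String b) {
        if (a.equals(b)) {
            return a;
        }
        // Collect a and all of its ancestors.
        Set<String> ancestorsOfA = new HashSet<>();
        for (String t = a; t != null; t = PARENT.get(t)) {
            ancestorsOfA.add(t);
        }
        // Walk up from b until a typing shared with a is found.
        for (String t = b; t != null; t = PARENT.get(t)) {
            if (ancestorsOfA.contains(t)) {
                return t;
            }
        }
        return TOP; // e.g. a primitive typing meeting a reference typing
    }

    public static void main(String[] args) {
        System.out.println(unify("Circle", "Square")); // Shape
        System.out.println(unify("Circle", "int"));    // TOP
    }
}
```

In this sketch a primitive typing such as "int" has no ancestor, so unifying it with any different typing yields TOP, which any bytecode expecting a definite typing would then reject.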
With subprograms, two different calls to the same subprogram can be implemented even though a variable does not have the same typing. Thus, it is possible for a verification error (incompatible typings) to occur even though there is no typing problem (since there are two different contexts, the two typings cannot interfere during execution of the code by the virtual machine).
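The false rejection described above can be reproduced with a deliberately simplified model. It assumes a subprogram that does not touch a given register, two call sites that hold different typings in that register, and callers that expect to find their own typing unchanged after the return. The names SubroutineMergeDemo, monovariantAccepts and perCallSiteAccepts, and the typings "int" and "ref", are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

/** Toy model: one register, one subprogram that leaves the register untouched. */
final class SubroutineMergeDemo {
    static final String TOP = "TOP";

    /** Simplified unification: equal typings are kept, anything else becomes TOP. */
    static String merge(String a, String b) {
        return a.equals(b) ? a : TOP;
    }

    /** Single frame for the subprogram, merged over every call site. */
    static boolean monovariantAccepts(List<String> atCall, List<String> expectedAfterReturn) {
        String merged = atCall.get(0);
        for (String t : atCall) {
            merged = merge(merged, t);
        }
        // Every caller sees the merged typing after the return.
        for (String expected : expectedAfterReturn) {
            if (!merged.equals(expected)) {
                return false;
            }
        }
        return true;
    }

    /** One analysis per call site: each caller keeps its own typing. */
    static boolean perCallSiteAccepts(List<String> atCall, List<String> expectedAfterReturn) {
        for (int i = 0; i < atCall.size(); i++) {
            if (!atCall.get(i).equals(expectedAfterReturn.get(i))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> typings = Arrays.asList("int", "ref");
        System.out.println(monovariantAccepts(typings, typings)); // false
        System.out.println(perCallSiteAccepts(typings, typings)); // true
    }
}
```

The code is type-safe at run time in both cases; only the merged analysis reports an error, because the two execution contexts are folded into a single typing.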
For Java (Registered Trademark) cards, the pseudo-code verification ensures that no illegal manipulation is performed on the typing of the elements used by the bytecode. Two properties are to be verified: (i) for each bytecode, the height of the stack is always the same regardless of the execution path; and (ii) for each bytecode, there exists a typing of the variables (registers) and of the stack stages that is compatible with the bytecode regardless of the execution path.
For this purpose, all of the possible execution paths are explored statically. This is an abstract execution of the bytecode.
For each line of bytecode, the integrity verification requires a lot of information to be stored. It has been shown that it suffices to effect this storage only for the targets of jumps. In addition, the algorithm needs to store additional information such as the instruction pointer or “program counter” (pointer to the line of code at the current verification point), the worklist (list of lines of code to be verified subsequently) and the current frame (set of typings of the registers and of the stack at the point that is being examined, recorded in the RAM of the device).
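As a rough illustration of abstract execution driven by a worklist, the sketch below checks only the first property, namely that the stack height at each instruction is the same on every execution path. The five-instruction set (PUSH, POP, GOTO, IF, RETURN) and the names StackHeightVerifier, Insn and verify are assumptions made for the example and are not real Java bytecodes. For simplicity the sketch records a height for every instruction rather than only for the targets of jumps.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

/** Worklist-based check that the stack height is path-independent (toy instruction set). */
final class StackHeightVerifier {
    enum Op { PUSH, POP, GOTO, IF, RETURN }

    static final class Insn {
        final Op op;
        final int target; // jump target for GOTO and IF, otherwise unused

        Insn(Op op, int target) { this.op = op; this.target = target; }
        Insn(Op op) { this(op, -1); }
    }

    static boolean verify(Insn[] code) {
        if (code.length == 0) {
            return true;
        }
        int[] heightAt = new int[code.length];
        Arrays.fill(heightAt, -1);               // -1 means "not reached yet"
        Deque<Integer> worklist = new ArrayDeque<>();
        heightAt[0] = 0;                         // the stack is empty on entry
        worklist.push(0);

        while (!worklist.isEmpty()) {
            int pc = worklist.pop();             // current verification point
            int before = heightAt[pc];
            Insn insn = code[pc];
            int after;
            int[] successors;
            switch (insn.op) {
                case PUSH: after = before + 1; successors = new int[] {pc + 1}; break;
                case POP:  after = before - 1; successors = new int[] {pc + 1}; break;
                case IF:   after = before - 1; successors = new int[] {pc + 1, insn.target}; break;
                case GOTO: after = before;     successors = new int[] {insn.target}; break;
                default:   after = before;     successors = new int[0]; break; // RETURN
            }
            if (after < 0) {
                return false;                    // stack underflow
            }
            for (int next : successors) {
                if (next < 0 || next >= code.length) {
                    return false;                // jump or fall-through outside the code
                }
                if (heightAt[next] == -1) {
                    heightAt[next] = after;      // first visit: record and schedule
                    worklist.push(next);
                } else if (heightAt[next] != after) {
                    return false;                // two paths disagree on the height
                }
            }
        }
        return true;
    }
}
```

Each instruction enters the worklist at most once, so the loop terminates. A full verifier would carry a frame of typings instead of a single height and would unify frames at points of convergence, which is where most of the RAM consumption arises.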
External verification solutions are known, such as the SUN MICROSYSTEMS (Registered Trademark) solution, in which the bytecode is initially verified during off-card processing. Once it has been validated, it is loaded onto the smart card. The drawback with such solutions lies in the fact that, between the verification of the bytecode and the loading into the card, a possibility of adulterating the code exists. Those solutions thus do not guarantee integrity between the initial code and the final code loaded onto the card and then executed.
The SUN MICROSYSTEMS (Registered Trademark) verifier is also known, in which the verification is performed off-card in a secure environment and which makes it possible to sign the program. The card merely has to verify the signature on receiving the program.
Having that verification carried out by the card itself suffers from drawbacks, in particular RAM consumption that is too high.
Verification with a proof carrying code is also known. A proof carrying code is computed off-card, and is then added to the program when the program is transmitted to the card. The idea is to insert typing information into the code. As a result, verification on the card is greatly facilitated and requires only a very small amount of RAM.
The drawback with that solution lies in the need for off-card pre-processing (computing the proof) and in the larger size of the data (bytecode and proofs) to be transmitted and stored, which results in longer transmission time and increased bandwidth consumption.
The Trusted Logic (Registered Trademark) verifier, which is protected by Patent FR 2 815 434, is also known. The registers used by the virtual machine are split up monomorphically, i.e. each register has a single variable typing. The RAM needs are thus reduced. The drawback of that solution is that it is necessary to perform computation off-card in order to modify the methods so that they verify the two additional properties required.
The literature tends to indicate that certain embedded bytecode verifications are infeasible. In particular, the publication “Java bytecode verification: algorithms and formalisations” (http://pauillac.inria.fr/~xleroy/publi/bytecode-verification-JAR.pdf) specifies that polyvariant conventional verification algorithms cannot be implemented on equipment having low processing capacities such as Java (Registered Trademark) cards.