1. Field of Invention
The present invention is related to the combined compilation and verification of platform neutral bytecode computer instructions, such as JAVA. More specifically, the present invention relates to a new method of creating optimized machine code from platform neutral bytecode on either the development or target system by concurrently performing bytecode verification and compilation.
2. Description of Related Art
The benefit of architecture neutral language such as JAVA is the ability to execute such language on a wide range of systems once a suitable implementation technique, such as a JAVA Virtual Machine, is present. The key feature of the JAVA language is the creation and use of platform neutral bytecode instructions, which create the ability to run JAVA programs, such as applets, applications or servelets, on a broad range of diverse platforms. Typically, a JAVA program is compiled through the use of a JAVA Virtual Machine (JVM) which is merely an abstract computing machine used to compile the JAVA program (or source code) into platform neutral JAVA bytecode instructions, which are then placed into class files. The JAVA bytecode instructions in turn, serve as JVM instructions wherever the JVM is located. As bytecode instructions, the JAVA program may now be transferred to and executed by any system with a compatible JAVA platform. In addition, any other language which may be expressed in bytecode instructions, may be used with the JVM.
Broadly speaking, computer instructions often are incompatible with other computer platforms. Attempts to improve compatibility include “high level” language software which is not executable without compilation into a machine specific code. As taught by U.S. Pat. No. 5,590,331, issued Dec. 31, 1996 to Lewis et al., several methods of compilation exist for this purpose. For instance, a pre-execution compilation approach may be used to convert “high level” language into machine specific code prior to execution. On the other hand, a runtime compilation approach may be used to convert instructions and immediately send the machine specific code to the processor for execution. A JAVA program requires a compilation step to create bytecode instructions, which are placed into class files. A class file contains streams of 8-bit bytes either alone or combined into larger values, which contain information about interfaces, fields or methods, the constant pool and the magic constant. Placed into class files, bytecode is an intermediate code, which is independent of the platform on which it is later executed. A single line of bytecode contains a one-byte opcode and either zero or additional bytes of operand information. Bytecode instructions may be used to control stacks, the VM register arrays or transfers. A JAVA interpreter is then used to execute the compiled bytecode instructions on the platform.
The compilation step is accomplished with multiple passes through the bytecode instructions, where during each pass, a loop process is employed in which a method loops repeatedly through all the bytecode instructions. A single bytecode instruction is analyzed during each single loop through the program and after each loop, the next loop through the bytecode instructions analyzes the next single bytecode instruction. This is repeated until the last bytecode instruction is reached and the loop is ended.
During the first compilation pass, a method loops repeatedly through all the bytecode instructions and a single bytecode instruction is analyzed during each single loop through the program. If it is determined the bytecode instruction being analyzed is the last bytecode instruction, the loop is ended. If the bytecode instruction being analyzed is not the last bytecode instruction, the method determines stack status from the bytecode instruction and stores this in stack status storage, which is updated for each bytecode instruction. This is repeated until the last bytecode instruction is reached and the loop is ended.
During the second compilation pass, a method loops repeatedly through all the bytecode instructions once again and a single bytecode instruction is analyzed during each single loop through the program. If it is determined the bytecode instruction being analyzed is the last bytecode instruction, the loop is ended. If the bytecode instruction being analyzed is not the last bytecode instruction, the stack status storage and bytecode instruction are used to translate the bytecode instruction into machine code. This is repeated until the last bytecode instruction is translated and the loop is ended.
A JAVA program however, also requires a verification step to ensure malicious or corrupting code is not present. As with most programming languages, security concerns are addressed through verification of the source code. JAVA applications ensure security through a bytecode verification process which ensures the JAVA code is valid, does not overflow or underflow stacks, and does not improperly use registers or illegally convert data types. The verification process traditionally consists of two parts achieved in four passes. First, verification performs internal checks during the first three passes, which are concerned solely with the bytecode instructions. The first pass checks to ensure the proper format is present, such as bytecode length. The second pass checks subclasses, superclasses and the constant pool for proper format. The third pass actually verifies the bytecode instructions. The fourth pass performs runtime checks, which confirm the compatibility of the bytecode instructions.
As stated, verification is a security process, which is accomplished through several passes. The third pass in which actual verification occurs, employs a loop process similar to the compilation step in which a method loops repeatedly through all the bytecode instructions and a single bytecode instruction is analyzed during each single loop through the program. After each loop, the next loop through the bytecode instructions analyzes the next single bytecode instruction which is repeated until the last bytecode instruction is reached and the loop is ended.
During the verification pass, the method loops repeatedly through all the bytecode instructions and a single bytecode instruction is analyzed during each single loop through the program. If it is determined the bytecode instruction being analyzed is the last bytecode instruction, the loop is ended. If the bytecode instruction is not the last bytecode instruction, the position of the bytecode instruction being analyzed is determined. If the bytecode instruction is at the beginning of a piece of code that is executed contiguously (a basic block), the global stack status is read from bytecode auxiliary data and stored. After storage, it is verified that the stored global stack status is compliant with the bytecode instruction. If however, the location of the bytecode instruction being analyzed is not at the beginning of a basic block, the global stack status is not read but is verified to ensure the global stack status is compliant with the bytecode instruction. After verifying that the global stack status is compliant with the bytecode instruction, the global stack status is changed according to the bytecode instruction. This procedure is repeated during each loop until the last bytecode instruction is analyzed and the loop ended.
It may be noted that the pass through the bytecode instructions that is required for verification closely resembles the first compilation pass. Duplicate passes during execution can only contribute to the poor speed of JAVA programs, which in some cases may be up to 20 times slower than other programming languages such as C. The poor speed of JAVA programming is primarily the result of verification. In the past, attempts to improve speed have included compilation during idle times and pre-verification. In U.S. Pat. No. 5,970,249 issued Oct. 19, 1999 to Holzle et al., a method is taught in which program compilation is completed during identified computer idle times. And in U.S. Pat. No. 5,999,731 issued Dec. 7, 1999 to Yellin et al. the program is pre-verified, allowing program execution without certain determinations such as stack overflow or underflow checks or data type checks. Both are attempts to improve execution speed by manipulation of the compilation and verification steps. In order to further improve speed, a method and apparatus is needed that can combine these separate, yet similar steps, the verification pass, and the first and second compilation pass, into a step which accomplishes the multiple tasks in substantially less time.