1. Field of the Invention
The present invention relates to the optimization of Java class files. More particularly, the present invention relates to a method for optimizing Java bytecode in the presence of try-catch blocks.
2. The Prior Art
A known problem for software developers and computer users is the lack of portability of software across operating system platforms. Attempts to address this problem must include means of ensuring security as computers are linked together in ever-expanding networks, such as the World Wide Web. As a response to both of these concerns, the JAVA programming language was developed at Sun Microsystems as a platform-independent, object-oriented computer language designed to include several layers of security protection.
Java achieves its operating system independence by being both a compiled and interpreted language. First, Java source code, which consists of Java classfiles, is compiled into a generic intermediate format called Java bytecode. Java's bytecodes consist of a sequence of single-byte opcodes, each of which identifies a particular operation to be carried out. Additionally, some of the opcodes have parameters. For example, opcode number 21, iload <varnum>, takes the single-word integer value stored in the local variable, varnum, and pushes it onto a stack.
Next, the bytecodes are interpreted by a Java Virtual Machine (JVM) which translates the bytecodes into native machine code. The JVM is a stack-based implementation of a "virtual" processor that shares many characteristics with physical microprocessors. The bytecodes executed by the JVM are essentially a machine instruction set, and as will be appreciated by those of ordinary skill in the art, are similar to the assembly language of a computing machine. Accordingly, every hardware platform or operating system may have a unique implementation of the JVM, called a Java Runtime System, to route the universal bytecode calls to the underlying native system.
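By way of illustration only, the stack-based execution model described above can be sketched as a toy interpreter. The opcode numbers (bipush 16, iload 21, istore 54, iadd 96, ireturn 172) are those of the JVM instruction set; the class name TinyStackMachine, the method run, and the use of an int array to hold the code are assumptions made for this sketch and do not describe any actual JVM implementation.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Toy interpreter for a handful of JVM opcodes; illustrative only.
class TinyStackMachine {
    static final int BIPUSH = 16;   // push the following signed byte as an int
    static final int ILOAD = 21;    // push the value of local variable <varnum>
    static final int ISTORE = 54;   // pop the top of the stack into local variable <varnum>
    static final int IADD = 96;     // pop two ints and push their sum
    static final int IRETURN = 172; // pop an int and return it

    // Executes 'code' against the given local variable array (hypothetical layout).
    static int run(int[] code, int[] locals) {
        Deque<Integer> stack = new ArrayDeque<>();
        int pc = 0;
        while (pc < code.length) {
            int opcode = code[pc++];
            switch (opcode) {
                case BIPUSH:
                    stack.push((int) (byte) code[pc++]);
                    break;
                case ILOAD:
                    stack.push(locals[code[pc++]]);
                    break;
                case ISTORE:
                    locals[code[pc++]] = stack.pop();
                    break;
                case IADD:
                    stack.push(stack.pop() + stack.pop());
                    break;
                case IRETURN:
                    return stack.pop();
                default:
                    throw new IllegalArgumentException("unsupported opcode " + opcode);
            }
        }
        throw new IllegalStateException("code ended without ireturn");
    }
}
```

For example, the sequence iload 0, iload 1, iadd, ireturn pushes two local variables onto the stack, replaces them with their sum, and returns that sum.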
When developing the JAVA bytecode instruction set, the designers sought to ensure that it was simple enough for hardware optimization and also included verification means to provide security protection and to prevent the execution errors or system crashes that can result from improperly formatted bytecode. As Java's bytecodes contain significant type information, the verification means are able to do extensive type checking when the bytecodes are first retrieved from the internet or a local disk. As a result, the interpreter of the native machine need only perform minimal type checking at run time. Unlike languages such as Smalltalk that provide protection by performing extensive runtime checks, Java executes more quickly at run time.
Although Java provides security through verification means and portability through bytecodes, Java programs lag natively compiled programs, written in languages like C/C++, in their execution time. When a user activates a Java program on a Web Page, the user must wait not only for the program to download but also to be interpreted. To improve Java's execution time, optimizations can be introduced into the processing of Java bytecodes. These optimizations can be implemented in a variety of manners including as Stand-Alone Optimizers (SAO) or as part of Just-in-Time (JIT) compilers.
A SAO transforms an input classfile containing bytecode into an output classfile containing bytecodes that more efficiently perform the same operations. A JIT transforms an input classfile containing bytecode into an executable program. Prior to the development of JITs, a JVM would step through all the bytecode instructions in a program and mechanically perform the native code calls. With a JIT compiler, however, the JVM first makes a call to the JIT which compiles the instructions into native code that is then run directly on the native operating system. The JIT compiler permits natively compiled code to run faster and makes it so that the code only needs to be compiled once. Further, JIT compilers offer a stage at which the executable code can be optimized.
Either optimizing or compiling bytecodes involves translating the source bytecodes into what is known in the art as an intermediate representation (IR). The IR provides information about two essential components of the program: the control flow graph (CFG) and the data flow graph (DFG). Subsequently, the IR is transformed for compilers into object code and for optimizers into an improved version of the source code format.
The CFG breaks the code into blocks of bytecode, termed basic blocks, that are always performed as an uninterrupted group of instructions, and establishes the connections that link the basic blocks together. In so doing, the CFG represents different variations of the sequence in which the instructions of a program can be performed. The connections between basic blocks are known in the art as edges. The DFG maps the connections between where data values are produced and where they are used.
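By way of illustration only, the division of code into basic blocks and the edges linking them can be sketched as follows. The simplified instruction model (the class Insn with the kinds "op", "if", "goto", and "return") and the names CfgSketch, leaders, and edges are assumptions made for this sketch. It uses the conventional approach of starting a new block at the first instruction, at every branch target, and after every instruction that ends a block; it is not taken from any particular optimizer.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

// Sketch: partition a linear instruction list into basic blocks and derive CFG edges.
class CfgSketch {
    // Hypothetical simplified instruction.
    static final class Insn {
        final String op;   // "op", "if", "goto", or "return"
        final int target;  // index of the branch target, or -1 if none
        Insn(String op, int target) { this.op = op; this.target = target; }
    }

    // Returns the sorted start indices of the basic blocks.
    static List<Integer> leaders(List<Insn> code) {
        TreeSet<Integer> starts = new TreeSet<>();
        if (!code.isEmpty()) starts.add(0);
        for (int i = 0; i < code.size(); i++) {
            Insn insn = code.get(i);
            boolean endsBlock = insn.op.equals("if") || insn.op.equals("goto")
                    || insn.op.equals("return");
            if (insn.target >= 0) starts.add(insn.target);          // branch target starts a block
            if (endsBlock && i + 1 < code.size()) starts.add(i + 1); // so does the next instruction
        }
        return new ArrayList<>(starts);
    }

    // Returns edges as "from->to" strings, naming each block by its start index.
    static List<String> edges(List<Insn> code) {
        List<Integer> starts = leaders(code);
        List<String> result = new ArrayList<>();
        for (int b = 0; b < starts.size(); b++) {
            int start = starts.get(b);
            int end = (b + 1 < starts.size()) ? starts.get(b + 1) : code.size();
            Insn last = code.get(end - 1);
            if (last.target >= 0) result.add(start + "->" + last.target);
            boolean fallsThrough = !last.op.equals("goto") && !last.op.equals("return");
            if (fallsThrough && end < code.size()) result.add(start + "->" + end);
        }
        return result;
    }
}
```

Each edge corresponds to one way in which control can pass from the end of one basic block to the start of another, either by a branch or by falling through to the next instruction.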
Instructions that can drastically alter the control flow of a program are known as exceptions. There are generally two known approaches for dealing with the problem: either a separate block of code is written that explicitly addresses the exception condition, or one is not.
When separate code is written, what is known in the art as a try-catch block is created. The try block includes code that identifies the presence of an exception condition. When such a condition is present, the try block passes control of the execution of the program to a catch block. This operation is known in the art as throwing an exception. The catch block is a section of code that has been written to specifically address or resolve the exception condition. One skilled in the art will be aware that catch blocks are also known as exception handlers.
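By way of illustration only, a try-catch block in Java source code may take the following form. The class name TryCatchDemo and the method parseOrDefault are hypothetical names chosen for this sketch; Integer.parseInt and NumberFormatException are part of the standard Java library.

```java
// Sketch of a try-catch block: the catch block acts as the exception handler.
class TryCatchDemo {
    static int parseOrDefault(String text, int fallback) {
        try {
            // Integer.parseInt throws NumberFormatException when 'text' is not a valid integer.
            return Integer.parseInt(text);
        } catch (NumberFormatException e) {
            // Control is transferred here when the exception is thrown.
            return fallback;
        }
    }
}
```

A call such as parseOrDefault("42", -1) completes within the try block, whereas a call with a non-numeric string passes control to the catch block, which returns the fallback value.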
When no separate blocks of code are written, execution of the program returns to the method that called the method with the exception. The calling method may provide for the exception, i.e. address the situation in which a call to the particular method fails, or again pass control to its calling method. If no calling method addresses the exception, then execution will return out of all methods and the program will terminate. For example, as is known in the art, this is the approach commonly taken with "out-of-memory" exceptions.
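By way of illustration only, the return of an exception through successive calling methods can be sketched as follows. The class name PropagationDemo and the methods divide, middle, and outer are hypothetical names chosen for this sketch; in Java, integer division by zero throws ArithmeticException.

```java
// Sketch: an exception with no local handler propagates to the calling methods.
class PropagationDemo {
    // No handler here: a zero divisor throws ArithmeticException out of this method.
    static int divide(int a, int b) {
        return a / b;
    }

    // No handler here either, so the exception passes through to this method's caller.
    static int middle(int a, int b) {
        return divide(a, b) + 1;
    }

    // This calling method provides for the exception.
    static String outer(int a, int b) {
        try {
            return "result=" + middle(a, b);
        } catch (ArithmeticException e) {
            return "handled in outer";
        }
    }
}
```

If outer did not contain a handler either, the exception would continue out of every method and the program would terminate.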
Exceptions that do not have catch-blocks generally do not present optimization challenges. Optimization depends on an analysis of the execution of a program as embodied in the CFG to determine a more efficient means of performing the same operations. As is known in the art, this often includes eliminating repetition between earlier and later sections of code. Exceptions without catch blocks, however, do not add to the CFG of a program. Although they prompt execution to return to the calling method(s), they do not represent a different forward sequence through which the program can iterate. By not providing additional subsequent code, exceptions without catch blocks do not afford any standard optimization opportunities.
Try-catch blocks, however, do add subsequent code that can alter the execution of a program, but due to the complex, unpredictable, and difficult to account for changes in control flow that can accompany them, try-catch block exceptions traditionally have not been addressed by standard optimization frameworks. That is, the prior art approach has been to skip and not optimize sections of code containing try-catch blocks. Accordingly, it is an object of the present invention to provide a method for integrating try-catch blocks into CFGs so that optimizations can be done even in the presence of try-catch exceptions.
These and many other objects and advantages of the present invention will become apparent to those of ordinary skill in the art from a consideration of the drawings and ensuing description of the invention.
The present invention is directed to the optimization of Java classfiles by allowing for optimizations to take place in the presence of try-catch blocks.
According to the present invention, an IR is generated from an input bytecode file. Next, each basic block in the IR is scanned to identify the presence of try blocks. Each try block is subsequently split into two halves. Edges are then established between the two halves, between the first half and the catch block, and finally between the catch block and the second half. For all backwards dataflow problems, the basic try blocks are split at the last instruction that can throw an exception.
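By way of illustration only, the splitting of a try block and the establishment of the three edges can be sketched as follows. The names TrySplitSketch, Block, canThrow, lastThrowingIndex, and split are hypothetical, as is the simplified test for whether an instruction can throw. The sketch further assumes that the split is placed immediately after the last instruction that can throw; the description above does not fix the split to one side of that instruction.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: split a try block into two halves and add the three edges described above.
class TrySplitSketch {
    // Hypothetical basic block with mnemonic strings standing in for instructions.
    static final class Block {
        final String name;
        final List<String> insns = new ArrayList<>();
        final List<Block> successors = new ArrayList<>();
        Block(String name) { this.name = name; }
    }

    // Simplified assumption: only these instructions are treated as able to throw.
    static boolean canThrow(String insn) {
        return insn.startsWith("invoke") || insn.equals("idiv") || insn.equals("athrow");
    }

    // Index of the last instruction that can throw, or -1 if there is none.
    static int lastThrowingIndex(List<String> insns) {
        for (int i = insns.size() - 1; i >= 0; i--) {
            if (canThrow(insns.get(i))) return i;
        }
        return -1;
    }

    // Splits tryBlock after instruction 'splitIndex' and returns the second half.
    static Block split(Block tryBlock, Block catchBlock, int splitIndex) {
        Block second = new Block(tryBlock.name + "_b");
        List<String> tail = tryBlock.insns.subList(splitIndex + 1, tryBlock.insns.size());
        second.insns.addAll(tail);
        tail.clear(); // removes the moved instructions from the first half

        // The original outgoing edges now leave from the second half.
        second.successors.addAll(tryBlock.successors);
        tryBlock.successors.clear();

        tryBlock.successors.add(second);     // first half -> second half
        tryBlock.successors.add(catchBlock); // first half -> catch block
        catchBlock.successors.add(second);   // catch block -> second half
        return second;
    }
}
```

With the catch block reachable through ordinary edges, standard dataflow analyses can traverse the graph without special handling for the exception path.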