1. Field of the Invention
The present invention relates to object-oriented programming (“OOP”), and more particularly, to using flow-sensitive constraint analysis for reducing dead code.
2. Background
A managed runtime environment (“MRTE”) is a platform that abstracts the specifics of an operating system and the architecture running underneath it. Instead of writing programs that directly command a processor, software developers write to a “runtime” that handles many of the generic tasks that programmers used to have to anticipate and build. A managed runtime can handle tasks like heap management, security, garbage collection, and memory allocation. This allows developers to concentrate on business logic specific to their application. Because of the runtime's close relationship with the operating system and architecture, it is often called a “virtual machine.”
Several MRTEs have been commercialized, including IBM's SmallTalk™ language and runtime, Sun Microsystems' Java™ language and runtime, and Microsoft's .NET™ common language runtime (referred to as “CLR”).
Object-oriented programming languages used in MRTEs provide a number of features such as classes, class members, multiple inheritance, and virtual functions. These features enable the creation of class libraries that can be reused in many different applications. However, such code reuse comes at a price. In order to facilitate reusability, OOP encourages the design of classes that incorporate a high degree of functionality. Programs that use a class library typically exercise only a part of the library's functionality. Such a program may pay a memory penalty for library functionality that it does not use.
A library may contain dead executable code. Dead executable code is code that is not executed during execution of the program, or code whose execution cannot affect the program's observable behavior. Dead executable code in an application adversely affects memory requirements and is hence undesirable. Dead executable code may also take the form of unused library procedures.
Virtual functions are operations that are declared in a base class, but may have different implementations in subclasses. Typically, virtual functions account for a substantial portion of dead executable code. When program code that is written to operate on any object (either an object of the base class or any of its subclasses) makes method calls, the correct implementation of the method in question must be used. As a result, no fixed code address can be associated with that method call; a different address must be used depending on the particular (sub)class to which the object belongs [S. Bhakthavatsalam, “Measuring the Perceived Overhead Imposed by Object-Oriented Programming in a Real-time Embedded System”, Blacksburg, Va., May 16, 2003].
Prior art has addressed the problem of eliminating some, but not all, unused virtual functions. An example of such prior art is provided in the white papers by D. F. Bacon and Peter F. Sweeney, “Fast Static Analysis of C++ Virtual Function Calls”, IBM Watson Research Center, and by A. Srivastava, “Unused procedures in object-oriented programming”, ACM Letters on Programming Languages and Systems, 1(4), pp. 355-364.
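The dispatch behavior described above can be illustrated with a minimal sketch. The class and function names (Shape, Square, Rectangle, total_area) are hypothetical and chosen only for illustration; Python is used because every method in it is dispatched dynamically, much like a virtual function.

```python
class Shape:
    """Base class declaring the virtual operation."""

    def area(self):
        raise NotImplementedError


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side * self.side


class Rectangle(Shape):
    def __init__(self, width, height):
        self.width = width
        self.height = height

    def area(self):
        return self.width * self.height


def total_area(shapes):
    # The call s.area() has no single fixed target: the implementation
    # that runs depends on the class of each object at run time.
    return sum(s.area() for s in shapes)


print(total_area([Square(2), Rectangle(2, 3)]))
```

If a program only ever creates Square objects, Rectangle.area is dead code, yet a tool that looks at the call site alone cannot tell which implementations are safe to discard.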
Such prior art methods only partially eliminate non-virtual functions, and hence, the problem of dead code still remains.
Other prior art techniques address eliminating virtual functions in MRTEs by performing inter-procedural analysis of object types, as discussed by I. Pechtchanski and V. Sarkar in “Dynamic Optimistic Interprocedural Analysis, a Framework and an Application”.
Such techniques track object types at a global level and do not take into consideration the possible variance of object types based on specific instructions within specific functions called by specific execution paths. For example, such conventional techniques track a local variable type that may be limited to, say, A or B during the lifetime of the program, but do not track a local variable type that is exactly B at a specific instruction following a specific execution path. For field access, such conventional techniques track, say, field F of class T that may be limited to A or B during the lifetime of the program, but do not track that instance Q of class T has field F always set to B at a specific instruction following a specific execution path. In practice, this is very significant, as such prior art approaches yield exponentially larger sets of included functions that are never called.
Prior art also does not specify a mechanism for calling into native functions that may return variable types where it is not possible to analyze the native functions. Nor does it specify a mechanism for automatically handling dynamic-type-driven functions that may call functions indirectly by inspecting the type information (also referred to as metadata) at runtime. Prior art only provides a mechanism using manually-generated configuration files to specify which functions should be preserved (Sweeney, et al., U.S. Pat. No. 6,546,551, “Method for accurately extracting library-based object-oriented applications”). Prior art fails to suggest an approach to automatically determine functions on the basis of local flow-sensitive type constraint analysis.
Conventional techniques are flow insensitive rather than flow sensitive. A flow-insensitive approach tracks variable types globally (at a program level), without giving any consideration to how a variable is used at a specific instruction of a specific function and call path.
These issues are magnified when considering modern MRTEs that specify extensive standard framework libraries with millions of virtual function calls and deep inheritance chains, such as Microsoft's .NET®. Using prior art methods in practice, a program that calls into a simple framework function, such as one that formats a text string, yields dependencies on thousands of downstream virtual functions that are effectively never called. String formatting functions are also included for every instantiated object type, regardless of whether these functions are actually used.
Modern MRTEs primarily reside in personal computers or handheld environments with enough memory to easily hold entire class libraries (on the order of 64 megabytes). Such platforms are typically used in settings where multiple application programs use common class libraries and underlying operating system functions. Because of the nature of these environments, it is beneficial for performance and interoperability to maintain a single set of class libraries that are shared across applications in their complete form. Thus, there is limited, if any, benefit in determining dead executable code in such environments.
However, dead code becomes a major problem for smaller systems, for example, embedded systems, because memory is extremely limited, and such devices typically run a single, specific application and do not need to use a massive single set of class libraries. An example of one such embedded system is the Lantronix XPORT™ sold by Lantronix Inc.
Therefore, there is a need for a system and method for efficiently interpreting a program's function calls and minimizing dead code.
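The difference for a local variable can be made concrete with a toy sketch. It is an illustration only, not the analysis of any cited reference or of the present invention. It assumes a hypothetical three-field instruction format, ("new", variable, class) and ("call", variable, method), and handles straight-line code only; branches would require merging type sets at join points.

```python
# Hypothetical straight-line function body:
#   x = new A();  x = new B();  x.run()
PROGRAM = [
    ("new", "x", "A"),
    ("new", "x", "B"),
    ("call", "x", "run"),
]


def flow_insensitive(prog):
    """One type set per variable for the whole program."""
    types = {}
    for op, var, arg in prog:
        if op == "new":
            types.setdefault(var, set()).add(arg)
    # Every call keeps the method for every class the variable ever held.
    return {
        f"{cls}.{arg}"
        for op, var, arg in prog
        if op == "call"
        for cls in types.get(var, ())
    }


def flow_sensitive(prog):
    """Type set per variable as of each instruction, in program order."""
    types = {}
    needed = set()
    for op, var, arg in prog:
        if op == "new":
            types[var] = {arg}  # the assignment replaces the earlier type
        elif op == "call":
            needed |= {f"{cls}.{arg}" for cls in types.get(var, set())}
    return needed


print(sorted(flow_insensitive(PROGRAM)))
print(sorted(flow_sensitive(PROGRAM)))
```

On this input the flow-insensitive version retains both A.run and B.run, while the flow-sensitive version retains only B.run, because x is exactly B at the call instruction.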