1. Field
Embodiments of the invention relate to the field of computers; and more specifically, to the field of programming and executing code with a runtime.
2. Background
Object-Oriented Programming
Object-oriented programming is a computer programming paradigm. The idea behind object-oriented programming is that a computer program may be seen as comprising a collection of individual units (called objects or instances) that act on each other, as opposed to a traditional view in which a program may be seen as a collection of functions, or simply as a list of instructions to the computer. An object is a language mechanism for binding data with methods that operate on that data. Each object is capable of being called through methods, processing data, and providing results to other objects. Each object can be viewed as an independent machine or actor with a distinct role or responsibility.
A reflective object-oriented language is a programming language that has a particular set of characteristics (e.g., classes, objects/instances, inheritance, reflection, etc.), whereas a reflective object-based language is sometimes used to label a programming language that has some subset of those characteristics (e.g., objects). For purposes of this document, the phrases “object-oriented source code” and “object-oriented code” will be used to refer to code written in a language that has such characteristics (e.g., code written in a reflective object-oriented language, code written in a reflective object-based language). While procedural languages, non-reflective object-oriented languages, and non-reflective object-based languages are programming languages that do not typically support such characteristics, transformation techniques may be used to provide such characteristics (e.g., through emulation) to code properly written in such languages; and thus, such techniques transform such languages into a reflective object-based language or reflective object-oriented language. (These techniques need not emulate all characteristics of object-oriented or object-based languages, but may emulate only those characteristics which are of interest to the rest of this document.) For purposes of this document, the phrases “object-oriented source code” and “object-oriented code” will also be used to refer to such transformed procedural, non-reflective object-oriented, and non-reflective object-based language code. By way of example, and not limitation, this document primarily describes object-oriented source code written in a reflective object-oriented language. Also, the terms object and instance are used interchangeably herein.
Used mainly in object-oriented programming, the term method refers to a piece of code that is exclusively associated either with a class (called class methods, static methods, or factory methods) or with an object (called instance methods). Like a procedure in procedural programming languages, a method usually consists of a sequence of statements to perform an action, a set of input parameters to parameterize those actions, and possibly an output value of some kind that is returned.
When programmers write a program using an object-oriented language, the resulting code can be conceptually viewed as including four basic types of code. The first type includes commands that operate on input instance(s) to provide output instance(s) (referred to herein as “transformation” code); typically written as methods (referred to herein as “transformation” methods). The second type includes instance instantiation commands that cause the runtime to instantiate instances of classes (referred to herein as “instance instantiation” code). The third type includes property manipulation commands (referred to herein as “data preparation” code) to invoke property methods (accessors, mutators, etc.) of the above instances. The fourth type includes sequences of commands that cause method invocation sequencing using the appropriate instances (where the appropriate instances include the instances to use as arguments, the instances to be used by instance methods, and the meta class instances used by class methods) to specify what transformation methods of what instances to invoke, in which order, and with which parameters of which instances responsive to the changes made by data preparation code (referred to herein as “manual invocation sequencing” code). The manual invocation sequencing code is sometimes written as methods separate from the transformation methods, and thus the manual invocation sequencing code includes sequences of invocation commands for the transformation methods. A program typically iterates between data preparation code and manual invocation sequencing code (which may also dip into the instance instantiation code), which in turn invokes transformation code (which may also dip into the instance instantiation code and data preparation code types). It should be noted that this is a conceptual representation of a program, and thus, should not be taken as an absolute with regard to how to view a program.
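By way of illustration only, the four conceptual types of code described above might appear as follows in a small Python program. The class names (Rate, Converter) and their methods are hypothetical and chosen solely for this sketch; they are not taken from any particular application.

```python
class Rate:
    """Holds one numeric property with an accessor and a mutator."""

    def __init__(self, value):
        self._value = value

    def get_value(self):            # property method (accessor)
        return self._value

    def set_value(self, value):     # property method (mutator)
        self._value = value


class Converter:
    def convert(self, amount, rate):
        """Transformation method: operates on inputs to provide an output."""
        return amount * rate.get_value()


# Instance instantiation code: cause the runtime to create instances.
rate = Rate(2.0)
converter = Converter()

# Data preparation code: invoke a property method of an instance.
rate.set_value(3.0)

# Manual invocation sequencing code: the programmer decides which
# transformation method of which instance to invoke, and with which inputs.
result = converter.convert(10, rate)
```

In this sketch the last statement must be re-executed by hand whenever the rate is changed, which is the responsibility referred to herein as manual invocation sequencing.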
Runtime
The term runtime is used herein to refer to a program or library of basic code that runs other code written in the same and/or a different language. Thus, a runtime is a collection of utility functions that support a program while it is running, including working with the operating system to provide facilities such as mathematical functions, input and output. These make it unnecessary for programmers to continually rewrite basic capabilities specified in a programming language or provided by an operating system. Since the demarcation between a runtime and an operating system can be blurred, the term runtime is used herein to refer to code separate from the operating system and/or code that is part of the operating system.
Early runtimes, such as that of FORTRAN, provide such features as mathematical operations. Other languages add more sophisticated features—e.g., memory garbage collection, often in association with support for objects. More recent languages tend to have considerably larger runtimes with considerably more functionality. Many object-oriented languages also include a system known as the “dispatcher” and “class loader.” The Java Virtual Machine (JVM) is an example of such a runtime: it also interprets or compiles the portable binary Java programs (byte-code) at run time. The common language runtime (CLR) framework is another example of a runtime.
Programming and Execution Framework
One framework within which applications are provided to end users includes three basic divisions. The first division includes the creation of the operating system and runtime. This first division is performed by programmers with highly advanced programming skills. When working in this division, programmers are respectively referred to as operating system programmers and runtime programmers. When creating a runtime for an object-oriented language, the runtime programmers include support for executing the various types of commands used in transformation code, instance instantiation code, data preparation code, and manual invocation sequencing code (e.g., instance instantiation commands, data preparation commands, and method invocation commands).
The second division includes the creation of object-oriented application source code to be run by the runtime. The second division is again performed by programmers with highly advanced programming skills, as well as an understanding of the business objectives of the application. When working in this division, programmers are referred to as application programmers. When creating an application in an object-oriented programming language, the application programmers write the specific transformation code, instance instantiation code, data preparation code, and manual invocation sequencing code for the specific application being created. As part of this, if the application requires a graphical user interface, the application programmers also design and code the graphical user interface for the specific application; and thus are also referred to as application designers.
The third division includes the use of application programs being run by the runtime. The third division is performed by end users that need not have any programming skills.
Manual Invocation Sequencing Code
The greatest costs typically associated with the creation of an application involve the debugging and/or optimization of the manual invocation sequencing code. For each opportunity for data to change, the application programmer must consider its effect and write manual invocation sequencing code to cause the appropriate transformation methods of the appropriate instances to be invoked in the appropriate order with the appropriate inputs. Exemplary mistakes made by application programmers include: 1) invoking the appropriate transformation methods of the appropriate instances in the wrong order; 2) forgetting to include commands to cause the one or more required transformation methods of instances to be invoked responsive to some data being changed; 3) including commands to cause unnecessary transformation methods of instances to be invoked responsive to some data being changed (e.g., including commands to invoke transformation methods of instances that are not affected by the change in data), etc.
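The second exemplary mistake above (forgetting to invoke a required transformation method after data changes) can be sketched as follows. The Order class and its methods are hypothetical and serve only to illustrate how a stale result arises; the sketch makes no claim about any specific application.

```python
class Order:
    def __init__(self, price, quantity):
        self.price = price
        self.quantity = quantity
        self.subtotal = None
        self.total = None

    def compute_subtotal(self):
        # Transformation method: depends on price and quantity.
        self.subtotal = self.price * self.quantity

    def compute_total(self, tax_rate):
        # Transformation method: depends on the subtotal computed above.
        self.total = self.subtotal * (1 + tax_rate)


order = Order(price=10.0, quantity=2)
order.compute_subtotal()
order.compute_total(0.5)

# Data preparation code changes an input.
order.quantity = 3

# Mistake: compute_subtotal() is not re-invoked, so the total is stale.
order.compute_total(0.5)
stale_total = order.total

# Correct manual invocation sequencing: both methods, in dependency order.
order.compute_subtotal()
order.compute_total(0.5)
fresh_total = order.total
```

Here stale_total still reflects a quantity of 2, while fresh_total reflects the changed quantity of 3.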
By way of example, one technique of generating manual invocation sequencing code is the use of the observer pattern (sometimes known as “publish subscribe”) to observe the state of an instance in a program. In the observer pattern, one or more instances (called observers or listeners) are registered (or register themselves) to observe an event which may be raised by the observed object (the subject). The observed instance, which may raise an event, generally maintains a collection of the registered observers. When the event is raised, each observer receives a callback from the observed instance (the observed instance invokes a “notify” method in the registered observers). The notify function may pass some parameters (generally information about the event that is occurring) which can be used by the observers. Each observer implements the notify function, and as a consequence defines its own behavior when the notification occurs.
The observed instance typically has a register method for adding a new observer and an unregister method for removing an observer from the list of instances to be notified when the event is raised. Further, the observed instance may also have methods for temporarily disabling and then reenabling calls to prevent inefficient cascading of a number of related updates. Specifically, callbacks called in response to a property value change often also change values of some other properties, triggering additional callbacks, and so on.
When using the observer pattern technique, application programmers writing manual invocation sequencing code specify what methods of what instances to call, in which order, and with which inputs by registering, unregistering, disabling, and reenabling observers to different observed instances, as well as writing the notify and callback methods for each. More specifically, the relationship between observer and observed instances is locally managed (by the observed instance alone, without synchronization with other observed instances) within the observer pattern, and thus the manual invocation sequencing code needed to synchronize events from multiple observed instances is typically part of the specific callback methods of each observer.
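A minimal sketch of the observer pattern as described above is given below, assuming a single observed instance with one event. The names Subject, Recorder, disable, and enable are hypothetical; actual libraries name and structure these operations differently.

```python
class Subject:
    """Observed instance: keeps its own collection of registered observers."""

    def __init__(self):
        self._observers = []
        self._enabled = True
        self._value = None

    def register(self, observer):
        self._observers.append(observer)

    def unregister(self, observer):
        self._observers.remove(observer)

    def disable(self):
        # Temporarily suppress callbacks, e.g. during a batch of updates.
        self._enabled = False

    def enable(self):
        self._enabled = True

    def set_value(self, value):
        self._value = value
        if self._enabled:
            # Raise the event: call back every registered observer.
            for observer in list(self._observers):
                observer.notify("value_changed", value)


class Recorder:
    """Observer: defines its own behavior in notify()."""

    def __init__(self):
        self.seen = []

    def notify(self, event, value):
        self.seen.append((event, value))


subject = Subject()
recorder = Recorder()
subject.register(recorder)
subject.set_value(1)      # recorder is notified
subject.disable()
subject.set_value(2)      # callbacks suppressed
subject.enable()
subject.set_value(3)      # recorder is notified
subject.unregister(recorder)
subject.set_value(4)      # no observers remain
```

Note that the list of observers is held by the observed instance alone; coordinating notifications that arrive from several observed instances would have to be written inside each observer's notify method.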
Overwriting, Volatile Call Stack
Typical runtimes use an overwriting, volatile call stack to track currently invoked, uncompleted calls. An overwriting, volatile call stack is overwriting in that it pops off and discards entries as each call is completed, and volatile in that it is discarded and rebuilt on every execution. Typical runtimes use overwriting, volatile call stacks because typical runtimes combine the building of the overwriting, volatile call stack with the actual invocation of the appropriate transformation methods of the appropriate instances with the appropriate inputs responsive to execution of the manual invocation sequencing code. In sum, responsive to execution of manual invocation sequencing code, a typical runtime determines the transformation method of instance sequencing call by call (as each call is made) and maintains the overwriting, volatile call stack to track only currently invoked, uncompleted calls.
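The behavior of an overwriting, volatile call stack can be modeled with the following sketch. It is a simplified illustration only: an explicit list stands in for the stack that a runtime maintains internally, and the tracked decorator and the functions add_tax and total are hypothetical.

```python
call_stack = []     # models the stack of currently invoked, uncompleted calls
snapshots = []      # copies of the stack taken at each call, for inspection


def tracked(func):
    def wrapper(*args):
        call_stack.append(func.__name__)    # push an entry as the call is made
        snapshots.append(list(call_stack))
        try:
            return func(*args)
        finally:
            call_stack.pop()                # pop and discard on completion
    return wrapper


@tracked
def add_tax(amount):
    return amount * 2


@tracked
def total(price, quantity):
    return add_tax(price * quantity)


value = total(3, 4)
```

After total returns, call_stack is empty again: no record of which methods were invoked, or in what order, survives the execution. The snapshots list exists only to make the intermediate states visible in this sketch.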
Program Execution and Parallelization
Conventionally, methods in a program are executed sequentially based on the manual invocation sequencing code. To improve the efficiency and speed of execution, some methods may be executed in parallel in systems that support parallelization. In general, parallelization in computing is the execution of multiple processes, tasks, or threads, simultaneously. To implement parallelization, application programmers may identify methods that are desired to be executed in parallel, and then rewrite the manual invocation sequencing code to cause the methods identified to be executed in parallel.
Currently, common parallelization mechanisms supported in computing include multiprocessing and multithreading. In multiprocessing, an application program is typically divided into multiple tasks. Each task is a logically high level, discrete, independent section of computational work executable by a processor. To achieve parallelization, at least some of the tasks are executed on multiple processors simultaneously. The processors may be coupled to each other via a network and be collectively referred to as a grid. The processors in the grid may include local processors, distant processors, or a combination of both.
Besides multiprocessing, another common parallelization mechanism is multithreading. A thread is a local process to execute a task. A processor that supports multithreading may execute multiple threads substantially simultaneously. One example of such a processor is a multi-core processor, where each core of the multi-core processor may execute a thread.
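By way of example, rewriting manual invocation sequencing code so that two independent methods are submitted to separate threads might look as follows in Python, using the standard concurrent.futures module. The functions square and cube are hypothetical placeholders for independent transformation methods.

```python
from concurrent.futures import ThreadPoolExecutor


def square(n):
    return n * n


def cube(n):
    return n * n * n


# Sequential manual invocation sequencing: one call after the other.
sequential = [square(4), cube(3)]

# Rewritten sequencing: the programmer identifies the two calls as
# independent and submits each to its own worker thread.
with ThreadPoolExecutor(max_workers=2) as pool:
    first = pool.submit(square, 4)
    second = pool.submit(cube, 3)
    parallel = [first.result(), second.result()]
```

Whether the two calls actually overlap in time depends on the interpreter and the hardware; the sketch shows only that the application programmer, not the runtime, must identify which calls are safe to run concurrently.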
By way of example, one conventional technique in parallelization is to analyze the source code of an application program to extract a configuration of the application program. Based on the configuration, the application program is divided into a number of sub-programs, which are presented in a graph based on the sub-programs' parent-child relationships. These sub-programs are executed in parallel based on the sub-programs' parent-child relationships.
In some conventional computing systems, analysis of the intermediate code generated from the source code may be performed to achieve parallelization. For example, analysis of intermediate code (e.g., assembly language) and parallelization is done during compilation. A parallelizer of the compiler converts the intermediate code into a parallelly executable form. An execution order determiner determines the order of the basic blocks to be executed. An expanded basic building block parallelizer subdivides the basic building blocks into execution units, each made up of parallelly executable instructions. Analysis of dependency is done on an instruction basis.
However, the conventional techniques described above all require analysis of the manual invocation sequencing code in the application program, which is written by application programmers. Thus, the burden of parallelization is put onto the application programmers because great care has to be taken when writing the manual invocation sequencing code in order for the parallelization to be performed correctly. Thus, the application programmers need to possess a relatively high level of programming skill.
To make the job of application programmers easier, some conventional techniques have been developed to parallelize application programs without requiring a high level of programming skill. For example, special language constructs and special wrapper classes around regular data types are provided to execute a sequential program in parallel. Programmers are not required to write a “parallel program” in order to have parts of the program executed in parallel. A parallel procedure is specified at the calling point by passing a parallel procedure identifier and its arguments to the system. An execute-parallel function is provided to execute different parts of the program in parallel. Parallel procedures may be written by making a new class, derived from a common class, corresponding to each parallel procedure in the program. The system resolves dependencies at run time, and parallelization is done to the level where an actual dependency is encountered. The compiler may determine whether arguments can be modified in the parallel procedure through analysis of the control flow graph of the parallel procedure.
In another conventional computing system, a database manager is used in executing user-defined functions in an application program without the need of hard-coding all the parallelism support in the computer program itself. A database table is defined with instructions the user wants to execute in parallel. A user-defined function is then defined that executes the instructions in the table. The database manager provides parallelism by executing multiple tasks in parallel in the user-defined function.
Software Instrumentation
In general, software instrumentation refers to techniques for observing the behavior of one or more application programs and collecting metrics relevant to the application programs and the execution thereof. Thus, software instrumentation is a valuable tool in development as well as maintenance of an application program as the application program and/or the execution of the application program may be improved in various ways based on the behavior of the application program and the metrics collected.
Currently, various techniques have been developed to implement software instrumentation. For example, one technique is to add software modules or code to record the execution history of an application program such that future execution of the application program may be managed based on the execution history recorded. In another example, a compiler generates instruction and metadata for monitoring and collecting metrics. If a selected indicator is associated with an instruction, counting of events associated with the execution of the instruction is enabled. Then the number of times an instruction is executed is counted. After execution of the application program, hotspots are identified to determine performance improvement methodology and source code of the application program may be modified accordingly to implement performance improvement methodology.
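The counting technique described above can be sketched at the method level as follows. This is an illustration under stated assumptions: a Python decorator stands in for compiler-generated instrumentation, counting is per method rather than per instruction, and the names instrumented, parse, and load are hypothetical.

```python
import functools
from collections import Counter

call_counts = Counter()     # metric collected during execution


def instrumented(func):
    """Added code that records each execution of the wrapped method."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        call_counts[func.__name__] += 1
        return func(*args, **kwargs)
    return wrapper


@instrumented
def parse(line):
    return line.split(",")


@instrumented
def load(lines):
    return [parse(line) for line in lines]


rows = load(["a,b", "c,d", "e,f"])

# After execution, the most frequently executed method is a hotspot candidate.
hotspot = call_counts.most_common(1)[0]
```

In this run parse is executed three times and load once, so parse is reported as the hotspot.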
Another conventional technique in instrumentation is to use the intermediate representation (IR) data generated from the source code of an application program. Specifically, a compiler generates IR data from source code. A code instrumentation module acts on the IR data to construct an IR tree and to add instrumentation to the IR data based on the IR tree. Then the compiler finishes compilation by converting the IR data with instrumentation into object code. A class instance can be instrumented using an instrumentation library (hereinafter, an instrumentation DLL). A virtual machine (VM) runtime module may run the instrumented class instance. There are declarations of method names and parameters in the byte code in the class instance. A special designator indicates that the executable portions corresponding to the declared methods are found in some blocks of native code separate from the VM runtime module. For example, the instrumented Java VM byte code may be monitored during execution by a monitor process and a monitor library (a.k.a. a monitor DLL).
Object-Relational Mapping
Object-Relational mapping is a programming technique that links relational databases to object-oriented language concepts, creating (in effect) a “virtual object database.” Some object-relational mappers automatically keep the loaded instances in memory in constant synchronization with the database. Specifically, after construction of an object-to-SQL mapping query, first returned data is copied into the fields of the instances in question, like any object-SQL mapping package. Once there, the instance has to watch to see if these values change, and then carefully reverse the process to write the data back out to the database.
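The load, watch, and write-back cycle described above can be sketched as follows, using Python's standard sqlite3 module with an in-memory database. The person table, the Person class, and the load_person and flush helpers are hypothetical and do not correspond to the interface of any particular object-relational mapper.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO person (id, name) VALUES (1, 'Ada')")


class Person:
    def __init__(self, row_id, name):
        self.id = row_id
        self._name = name
        self.dirty = False          # no change observed yet

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        # The instance watches for changes to its own field.
        self._name = value
        self.dirty = True


def load_person(row_id):
    # Copy the returned data into the fields of a new instance.
    row = conn.execute(
        "SELECT id, name FROM person WHERE id = ?", (row_id,)
    ).fetchone()
    return Person(row[0], row[1])


def flush(person):
    # Reverse the process: write changed data back out to the database.
    if person.dirty:
        conn.execute(
            "UPDATE person SET name = ? WHERE id = ?", (person.name, person.id)
        )
        person.dirty = False


person = load_person(1)
person.name = "Grace"
flush(person)
stored_name = conn.execute("SELECT name FROM person WHERE id = 1").fetchone()[0]
```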
Hibernate 3.0 is an object-relational mapping solution for Java and CLR (Jboss® Inc. of Atlanta, Ga.). Thus, Hibernate provides a framework for mapping an object-oriented domain model to a traditional relational database. Its goal is to relieve the developer from some common data persistence-related programming tasks. Hibernate takes care of the mapping from classes to database tables (and from object-oriented data types to SQL data types), as well as providing data query and retrieval facilities. Hibernate is instance centric and builds graphs representing relationships between instances.
Inversion of Control and the Dependency Inversion Principle
Inversion of Control, also known as IOC, is an object-oriented programming principle that can be used to reduce coupling (the degree to which each program module relies on each other module) inherent in computer programs. IOC is also known as the Dependency Inversion Principle. In IOC, a class X depends on class Y if any of the following applies: 1) X has a Y and calls it; 2) X is a Y; or 3) X depends on some class Z that depends on Y (transitivity). It is worth noting that “X depends on Y” does not imply “Y depends on X”; if both happen to be true, it is called a cyclic dependency: X cannot then be used without Y, and vice versa.
In practice, if an object x (of class X) calls methods of an object y (of class Y), then class X depends on Y. The dependency is inverted by introducing a third class, namely an interface class I that must contain all methods that x might call on y. Furthermore, Y must be changed such that it implements interface I. X and Y are now both dependent on interface I and class X no longer depends on class Y (presuming that X does not instantiate Y). This elimination of the dependency of class X on Y by introducing an interface I is said to be an inversion of control (or a dependency inversion). It must be noted that Y might depend on other classes. Before the transformation had been applied, X depended on Y and thus X depended indirectly on all classes that Y depends on. By applying inversion of control, all those indirect dependencies have been broken up as well. The newly introduced interface I depends on nothing.
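The transformation described above can be sketched in Python with an abstract base class playing the role of interface I. The method names run and execute are hypothetical; the class names X, Y, and I follow the text.

```python
from abc import ABC, abstractmethod


class I(ABC):
    """Interface containing every method that x might call on y."""

    @abstractmethod
    def run(self):
        ...


class Y(I):
    """Y is changed so that it implements interface I."""

    def run(self):
        return "Y ran"


class X:
    """X refers only to I; it neither names nor instantiates Y."""

    def __init__(self, collaborator: I):
        self._collaborator = collaborator

    def execute(self):
        return self._collaborator.run()


# The code that wires the instances together is the only place that
# knows about both X and Y.
x = X(Y())
outcome = x.execute()
```

Because X is written against I alone, any other implementation of I can be supplied to X without changing class X.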
The Spring Framework is an open source application framework for the Java platform that uses IOC and dependency inversion. Specifically, central in the Spring Framework is its Inversion of Control container that provides a means of configuring and managing Java objects. This container is also known as BeanFactory, ApplicationContext or Core container. Examples of the operations of this container are: creating objects, configuring objects, calling initialization methods and passing objects to registered callback objects. Objects that are created by the container are also called Managed Objects or Beans. Typically the container is configured by loading XML files that contain Bean definitions. These provide all information that is required to create objects. Once objects are created and configured without raising error conditions they become available for usage. Objects can be obtained by means of Dependency lookup or Dependency injection. Dependency lookup is a pattern where a caller asks the container object for an object with a specific name or of a specific type. Dependency injection is a pattern where the container passes objects by name to other objects, either via constructors, properties or factory methods. Thus, the Spring Framework is memory centric and builds graphs representing relationships between instances.
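The difference between dependency lookup and dependency injection can be sketched with a toy container, shown below. This Container class is hypothetical and is not the Spring Framework API; it only assumes that definitions are registered by name and that each instance is created once.

```python
class Container:
    """Toy container: creates, configures, and hands out named objects."""

    def __init__(self):
        self._definitions = {}
        self._instances = {}

    def define(self, name, factory, **dependency_names):
        # dependency_names maps a constructor parameter to a defined name.
        self._definitions[name] = (factory, dependency_names)

    def get(self, name):
        if name not in self._instances:
            factory, dependency_names = self._definitions[name]
            # Dependency injection: the container passes the named
            # objects to the constructor of the object being created.
            kwargs = {
                param: self.get(dep) for param, dep in dependency_names.items()
            }
            self._instances[name] = factory(**kwargs)
        return self._instances[name]


class Repository:
    def find(self):
        return ["a", "b"]


class Service:
    def __init__(self, repository):
        self.repository = repository

    def count(self):
        return len(self.repository.find())


container = Container()
container.define("repository", Repository)
container.define("service", Service, repository="repository")

# Dependency lookup: the caller asks the container for an object by name.
service = container.get("service")
```

Service never constructs its own Repository; the container supplies it, and the same Repository instance is returned to any caller that looks it up by name.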
Graphing Tools
Javadoc™ is a tool that parses the declarations and documentation comments in a set of Java source files and produces a corresponding set of HTML pages describing (by default) the public and protected classes, nested classes (but not anonymous inner classes), interfaces, constructors, methods, and fields (Sun Microsystems®, Inc. of Santa Clara, Calif.). Javadoc can be used to generate the API (Application Programming Interface) documentation or the implementation documentation for a set of source files. Javadoc is class and method centric and builds graphs representing the relationships between the combination of classes and their methods.
Another system for designing software applications includes graphs of objects analyzed by an interpreter to represent and reproduce a computer application. This system utilizes prewritten programming classes stored in code libraries, which can be written to follow the design patterns described in “Design Patterns” by Gamma et al., Addison Wesley 1995, “Patterns in Java” by Grand, Wiley Computer Publishing 1998, and/or high level Computer Aided Software Engineering (CASE) tools. More specifically, some such classes are based on the Observer behavioral pattern. The prewritten code libraries represent application state nodes, processing logic, and data flow of the system between various application states (i.e., the pre-written data elements of the application), so that a user need not write, edit, or compile code when creating a software application. Instead, a user manually edits a software application in a Graphical User Interface by editing visual objects associated with a current application state node, such as data within the application state node or processes performed within the application state node. Then, based on the changes made by the user to the current application state node, the interpreter displays the updated application state to the user for the application state which has just been edited. The system may then transition along a user-defined transitional edge to another application state where the user may optionally edit the next application state or the transitional edge. Changes to a graph may be made to instances of the graph which are implemented by the interpreter while the software application is running.
This system for designing software applications may include visual representations of a running software application that can be made “usable” with an application controller. When a user changes visual objects, representing the running software application, the controller uses the input to induce the interpreter to make the change to the graph. The controller then waits for more changes. Further, visual representations of such software applications may be imported or exported as XML documents that describe the visual representation of the application, and thereby the software application.
In order to edit and/or create a software application, in the form of a visual representation of nodes, directed edges, and application states, an application program interface and an application editor may further be included in the system. Key words, and associated definitions, from the pre-written code libraries, enable application developers to manually define a software application, processing steps, as well as the visual representation of a software application by providing graphical representations, within an editor, of a graph application which closely correlates to the actual application structure. A user defines a new application through an “application definition wizard,” which after certain preliminary matters are fulfilled, displays the new application as a graph component within the editor workspace. A user further interacts with an application by making selections from displayed lists of pre-created possible application components and dragging and dropping components onto the workspace using a PC's mouse and keyboard. A user may select components and “drag” them over existing components. When a new component is “dropped” on an existing component, the new component becomes a child of the existing component within an application graph. The relationships of components within the application are manually defined by the user's selections within the editor. Thus a tree structure representing an application is built by the user. As the application is created, a user can select an application navigator viewer to display a tree view of the constructed application making it possible to select and edit any component of the application. The editor interface processes user inputs and selections including creating or deleting application elements, updating component attributes, and updating display properties of an application.
The system described above, while utilizing visual representations of software applications, may also be used as a visual programming tool for defining and updating relational databases. The system utilizes XML descriptions of visual representations of software applications. A tool parses and interprets the XML descriptions to produce equivalent relational database table schemas, as well as changes thereto. When data is changed within a visual representation of a software application, a description of the change is stored along with other changes in a journal file and then processed as a group. An intermediate program (a Java application operating on its own thread) performs transactions between the visual representation of the software application and the relational database. The Java application polls (i.e., checks) the journal of changes to nodes of the visual representation (i.e., data in database), and if there are changes, makes the changes to the database. Thus, by altering data within the visual representation, the system updates a database. A similar application stands between the visual representation of the software application and the database to handle requests for data from the database.
Another system for analyzing software is called a Code Tree Analyzer (CTA). A CTA analyzes static source code written in an object-oriented programming language. The CTA generates a symbol table and a call tree from the static source code. Using the symbol table, the CTA generates a class diagram. Likewise, using the call tree, the CTA generates a sequence diagram. The class diagram illustrates the relationship between a user selected class and classes related to the user selected class. The sequence diagram illustrates the sequence in which different methods are called. Using both the class diagram and the sequence diagram, the CTA generates a design artifact representative of the static source code. When the user modifies the design artifact, the CTA identifies impacted portions of the source code using the sequence diagram. The design artifact is used for code maintenance and/or reverse engineering of the static source code.
U.S. Pat. No. 5,966,072 describes use of a graph to invoke computations directly. Getting information into and out of individual processes represented on the graph, moving information between the processes, and defining a running order for the processes, are discussed. The described arrangement adds “adaptor processes”, if necessary, to assist in getting information into and out of processes.