1. Field of the Invention
This invention relates to the field of object-oriented programming languages for computer programs, and, in particular, to the detection of mutability of fields and classes in an arbitrary program component.
2. Description of the Related Art
When it was introduced in late 1995, the programming language Java took the Internet by storm. A primary reason for this was the fact that Java was an interpreted programming language, which meant essentially that it used a different compiling/execution paradigm than programming languages such as C or C++. A program written in a high-level programming language, such as C or C++, which can be read, written, and understood by humans, needs to be translated into machine code that can be understood by the computer that actually runs the program. This is what a compiler does. In addition to translating the code, compilers also optimize it. The end product of compiling, the machine code, is, by definition, machine specific, meaning that the code is uniquely addressed to the type of computer that is running it, and will not be understood by a different type of computer. A simple example of this is the fact that a program that has been compiled for an Apple Macintosh will not run on an International Business Machines (IBM) clone PC computer. This is called being “platform-dependent”.
On the other hand, interpreted programming languages, such as Java, are not compiled for a particular type of computer. They are platform-independent. This is done by placing an intermediary, the Java Virtual Machine (JVM), between the compiled program and the specific platform. In other words, when a Java program is compiled, the end result is not machine code, but byte-code, which is understood by the JVM. The JVM is machine specific, and acts as an interpreter of the byte-code for the particular machine the JVM is installed in. This allows Java programs to be compiled and ported to any machine, as long as the machine has a JVM installed.
It is this platform-independence that makes Java uniquely suited to the Internet. Once a computer has a JVM installed, it does not matter whether the computer is an Apple, Wintel PC, Sun, Digital, etc.; a compiled Java byte-code program downloaded over the Internet will run on it. Although Java is generally run as an interpreted programming language, it should be noted that it can be optimized and compiled statically or during runtime (i.e., by Just-In-Time compilers).
Java is an Object-Oriented Programming (OOP) language. This means that the focus is on objects, rather than on procedures (as in C or BASIC). Roughly speaking, an object contains data and the methods that operate on that data. Programming in Java can be understood as writing descriptions of different objects.
More particularly, in OOP, a “class” is a collection of data and methods that defines the implementation of a particular kind of object. A class definition defines “instance variables” and “class variables”, as well as specifying the “interfaces” the class implements and the immediate “superclass” of the class. In broad terms, a class can be understood as a general definition, and an object is an “instance” of a class. For example, a class named Circle might be defined, with variables for radius and the location of the origin point. A particular circle c might be instantiated, with particular values for the radius and origin location, by calling on the Circle class. Because the radius and origin location are particular to that instance c of the Circle class, they are “instance variables”. By contrast, a “class variable” is a data item associated with the class as a whole. For example, the value pi=3.14 might be a class variable in the Circle class. Another example would be a variable num_circles which is defined in the Circle class, and which is increased by one every time a circle is instantiated. These class variables are associated with the whole class, rather than an instance, and are declared with the modifier static. Classes in Java form a class hierarchy, where a class may be a “superclass” or a “subclass” to another. For instance, Shapes might be a superclass of Circle, and GraphicCircle, a class that provides the ability to manipulate and draw instantiated objects of the Circle class, could be a subclass of Circle. A subclass inherits behavior from its superclass.
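The Circle example described above may be sketched in Java roughly as follows. This is an illustration only: the field names (radius, originX, originY, numCircles), the area and draw methods, and the CircleDemo class are assumptions chosen for the sketch and do not appear in any particular library.

```java
// Illustrative sketch of the Circle example; all names are hypothetical.
class Circle {
    // Class variables: declared static, associated with the class as a whole.
    static final double PI = 3.14;   // approximate value used in the text
    static int numCircles = 0;       // increased by one on every instantiation

    // Instance variables: each Circle object holds its own values.
    final double radius;
    final double originX;
    final double originY;

    Circle(double radius, double originX, double originY) {
        this.radius = radius;
        this.originX = originX;
        this.originY = originY;
        numCircles++;                // updates state shared by all instances
    }

    double area() {
        return PI * radius * radius;
    }
}

// Subclass of Circle: inherits the fields and area() from its superclass.
class GraphicCircle extends Circle {
    GraphicCircle(double radius, double originX, double originY) {
        super(radius, originX, originY);
    }

    // Stand-in for real drawing code: returns a textual description.
    String draw() {
        return "circle r=" + radius + " at (" + originX + ", " + originY + ")";
    }
}

class CircleDemo {
    public static void main(String[] args) {
        Circle c = new Circle(2.0, 0.0, 0.0);
        GraphicCircle g = new GraphicCircle(1.0, 5.0, 5.0);
        System.out.println(c.area());
        System.out.println(g.draw());
        // Accessed via the class name, not via an object reference.
        System.out.println(Circle.numCircles);
    }
}
```

Note that numCircles is reached through the class name Circle, whereas radius is reached through a particular object such as c.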
In Java, a “package” is an extensive set of classes, and Java has default packages that programmers use for common tasks. For example the java.io package has classes that handle input and output, the java.net package has classes that support networking functionality, and the java.awt package provides classes that create graphical user interface components.
Continuing with some of the unique features of Java, it should be noted that Java is a dynamic language. This means that any Java class can be loaded into a running Java interpreter at any time. These dynamically loaded classes can then be dynamically instantiated. Java is also a language built for networking. Using the java.net package, it is as easy to access files or resources over a network as files or resources located locally. Because Java is both dynamic and built for networking, it is possible for a Java interpreter to download and run code from across the Internet. This is what happens when a web browser downloads and runs a Java applet (an applet is a class that is loaded and run by an already running Java application). Presently, Internet applets are the most widespread use of Java, but Java has the capability of creating any type of program that dynamically uses the distributed resources of a network.
Because of the inherent security risks involved in a system that can download active code over a network, Java has several lines of defense against malicious code. First, Java, unlike C or C++, has no pointers, which in those languages can be used to access memory outside the bounds of a string or an array. Related to its lack of pointers, Java disallows any direct access to memory, thus stopping any security attack from that direction. Second, the Java interpreter performs a byte-code verification process on any untrusted code it loads, which prevents malicious code from taking advantage of implementation weaknesses in the Java interpreter. Third, Java uses a security “sandbox model”, in which untrusted code is placed in a “sandbox”, where it can play safely, without doing any damage to the full Java environment. When an applet is running in the sandbox, there are numerous security restrictions on what it can do. By this means, rogue code is prevented from interfering with other applications running in the same Java environment, or gaining unauthorized access to resources in the underlying operating system or network. A fourth layer of security can be provided by attaching digital signatures to Java code. These digital signatures can establish the origin of the code in a cryptographically secure and unforgeable way. A user specifies whether a particular source is trusted, and, if code is received from a trusted source, it is accepted and run.
Another feature of Java is its method of memory allocation and deallocation. In C or C++, the programmer allocates memory and then deallocates memory in a deliberate fashion. In other words, the C++ programmer explicitly allocates memory for holding arrays, variables, etc. at the beginning of an object or method, and then explicitly deallocates that memory when it will no longer be used. By contrast, the Java programmer neither allocates nor deallocates memory. Instead, Java uses garbage collection, which works as follows: the Java interpreter knows what objects it has allocated. It can also figure out which variables refer to which objects, and which objects refer to which other objects. Because of this, it can figure out when an allocated object is no longer referred to by any other object or variable. When such an object is found, it can be safely destroyed by a “garbage collector”.
Lastly, Java uses components, application-level software units which are configurable at deployment time. Currently, there are four types of components: enterprise beans, Web components, applets, and application clients. Enterprise beans implement a business task or business entity. Web components, such as servlets, provide services in response to requests. Applets, as mentioned before, typically execute in a web browser, but can execute in a variety of other applications or devices that support the applet programming model. Application clients are first-tier client programs that execute in their own Java Virtual Machine. Components are provided life cycle management, security, deployment, and runtime services by containers. Each type of container (Enterprise Java Bean (EJB), Web, Java Server Page (JSP), servlet, applet, and application client) also provides component-specific services.
As made clear from the above description of Java, an essential attribute of Java is the localization of knowledge within a module, which is known as “encapsulation”. Because objects encapsulate data and implementation, the user of an object can view the object as a black box that provides services. Instance variables and methods can be added, deleted, or changed, but as long as the services provided by the object remain the same, code that uses the object can continue to use it without being rewritten.
However, problems occur when one object or component depends on the state of a shared variable or object and another component or object changes the state of that variable or object. In other words, the shared object in this case is not encapsulated. This is sometimes known as an isolation fault. The mechanism for sharing state in Java is via class variables, i.e., fields declared with the static modifier. A class variable is accessed via the class name, rather than via an object reference. Thus, the variable is considered to be shared by all the code that can access the declaring class.
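By way of illustration, an isolation fault of the kind described above may be sketched as follows. The classes SharedConfig, ComponentA, ComponentB, and IsolationFaultDemo are hypothetical names introduced solely for this sketch; they stand in for any two components that can both access the same declaring class.

```java
// Hypothetical shared class: its class variable is visible to all code
// that can access SharedConfig.
class SharedConfig {
    static String mode = "default";
}

// Hypothetical component that depends on the shared state.
class ComponentA {
    String describe() {
        return "A sees mode=" + SharedConfig.mode;
    }
}

// Hypothetical component that changes the shared state.
class ComponentB {
    void reconfigure() {
        // Accessed via the class name, so no reference to any object
        // owned by ComponentA is needed to affect ComponentA.
        SharedConfig.mode = "debug";
    }
}

class IsolationFaultDemo {
    public static void main(String[] args) {
        ComponentA a = new ComponentA();
        System.out.println(a.describe());
        new ComponentB().reconfigure();
        // ComponentA's observable behavior has changed although
        // ComponentA itself was never touched.
        System.out.println(a.describe());
    }
}
```

In this sketch, ComponentB never receives a reference to ComponentA, yet the result of describe() differs before and after reconfigure() is called.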
These isolation faults are of particular importance because of the rapid development of the Java component (applets, servlets, Java Beans and Enterprise JavaBeans) market and the use of Java to develop middleware, such as the AppletViewer used by web browsers to run applets, Java Server Toolkit (JST) to run servlets on servers, and Containers to run EJBs. The reference implementations of these middleware systems are based on the concurrent execution of multiple components in a single instance of the Java runtime system. The Java runtime system is the software environment in which programs compiled for the JVM can run. The runtime system includes all the code necessary to load programs written in the Java programming language, dynamically link native methods, manage memory, and handle exceptions, as well as an implementation of the JVM, which may be a Java interpreter.
Isolation faults among multiple concurrently or serially executing programs can lead to numerous problems, especially in the areas listed below:
Integrity: some state information held in global fields/objects can be modified by any program running in the JVM. One such example is the default locale (country, language, variant). If two or more programs depend on this global state information, and each tries to change the default value, the results of their execution are likely to be unpredictable.
Security: the ability to change states or observe state changes leads to security exposures. In some cases, global fields may hold, at run-time, references to objects that are instances of subclasses that define overriding methods. These methods may perform operations that are unintended by the application developer, and may result in malicious behavior (e.g., opening a GUI window with a userid/password prompt). Also, malicious code can change the state of the Java runtime in unpredictable ways. An actual implementation problem in the Java Development Kit (JDK) that occurred in version 1.1.1 was due to object sharing. As a result, an unprivileged applet was able to impersonate a trusted signature, causing a serious security fault.
Compliance with the Component Model: application code may run into scalability problems. Often application code will use global variables to share state information between instances of the class. The problem is that in some of the application models, an instance of an EJB may be created in one container, retired to secondary storage, and then reactivated in a different container. When reactivated, the state information of the class variable/instance variable is stored in a different container. The net result is that there may be memory leaks (information is created and stored in variables, but never released) and the EJBs are no longer location transparent.
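The integrity problem involving the default locale may be illustrated by the following sketch. The classes PriceFormatter, LocaleChanger, and LocaleFaultDemo are hypothetical and introduced only for this example; Locale.setDefault and String.format are standard Java library calls. The sketch assumes that both components run in the same JVM.

```java
import java.util.Locale;

// Hypothetical component that formats numbers using the JVM-wide
// default locale (String.format without an explicit Locale argument).
class PriceFormatter {
    String format(double amount) {
        return String.format("%.2f", amount);
    }
}

// Hypothetical second component that changes the JVM-wide default.
class LocaleChanger {
    void switchToGerman() {
        Locale.setDefault(Locale.GERMANY);
    }
}

class LocaleFaultDemo {
    public static void main(String[] args) {
        Locale saved = Locale.getDefault();   // remember the original default
        try {
            Locale.setDefault(Locale.US);
            PriceFormatter formatter = new PriceFormatter();
            System.out.println(formatter.format(1234.5));

            // An unrelated component changes the global state...
            new LocaleChanger().switchToGerman();

            // ...and the first component's output changes with it
            // (the decimal separator now follows the German locale).
            System.out.println(formatter.format(1234.5));
        } finally {
            Locale.setDefault(saved);         // restore the global state
        }
    }
}
```

Passing an explicit Locale to String.format would remove the dependence on the global default in this particular sketch, but it would not address shared global state in general.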
Therefore, there is a need to identify mutable variables, those variables that can be changed by more than one component, in order to identify and stop isolation faults.
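To make the nature of the problem concrete, the following sketch shows a deliberately naive first approximation that merely lists the class variables of a class that are not declared final, using the standard java.lang.reflect API. It is not the technique of the invention, and the classes Sample and StaticFieldScan are hypothetical. The sketch also shows why such a check is insufficient: a final class variable that refers to an array or other changeable object is not reported, although the state it refers to can still be changed.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical class under inspection.
class Sample {
    static int counter;               // static, non-final: can be reassigned
    static final int LIMIT = 10;      // static final primitive: cannot change
    static final int[] TABLE = {1};   // final reference, but contents can change
    int instanceField;                // instance variable: not a class variable
}

class StaticFieldScan {
    // Returns the sorted names of static, non-final fields declared by cls.
    // Naive by design: it ignores access modifiers, inherited fields, and
    // the mutability of objects referenced by final fields.
    static List<String> reassignableStatics(Class<?> cls) {
        List<String> names = new ArrayList<>();
        for (Field f : cls.getDeclaredFields()) {
            if (f.isSynthetic()) {
                continue;             // skip compiler- or tool-generated fields
            }
            int mods = f.getModifiers();
            if (Modifier.isStatic(mods) && !Modifier.isFinal(mods)) {
                names.add(f.getName());
            }
        }
        Collections.sort(names);      // getDeclaredFields has no guaranteed order
        return names;
    }

    public static void main(String[] args) {
        System.out.println(reassignableStatics(Sample.class));
        // TABLE is not reported, yet its contents are still changeable:
        Sample.TABLE[0] = 99;
        System.out.println(Sample.TABLE[0]);
    }
}
```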