Traditionally, software creation tools produce binary code. This binary code typically includes everything an operating system needs to run the code, but little else. For example, binary code typically contains the code's machine instructions, information about where to place the instructions in memory, and so on. Additional information that describes characteristics of the binary code, such as the names of the various functions in the code, the types of parameters expected by each function, and the layout of the data types used by the code, is not present in typical binary code, although this information is present in the original source code. Hence, the conversion from source code to binary code is a lossy process. It produces a result that can be considered relatively opaque, meaning that it is difficult to reconstruct the lost information by looking at the binary code.
The additional information that describes characteristics of the binary code is sometimes called metadata, a generic term for data that describes other data; in this case, the described data is the binary code. Without the metadata, it is difficult, and sometimes impossible, for other software, such as development tools and compilers, to determine what the binary code contains, does, or is expected to do. For example, by examining the binary code alone, a tool typically cannot determine what data types the code defines, what methods the types define, the contract that a particular method is attempting to satisfy, how software debugging tools are to display the data types, how software analysis tools are to analyze the data types and methods, and so on.
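The opacity described above can be illustrated with a minimal sketch. It is an analogy only: a record is packed into raw bytes with Python's standard `struct` module, and the same bytes are then read back under several layouts. The record, its field names (`order_id`, `price`) and the chosen layouts are hypothetical and serve only to show that raw bytes do not say how they are meant to be interpreted.

```python
import struct

# Hypothetical source-level record: an integer id and a floating-point price.
# The field names and types exist only here, at the "source" level.
order_id, price = 1001, 19.99

# Convert the record to raw bytes: little-endian int32 followed by float32.
raw = struct.pack("<if", order_id, price)

# The resulting bytes carry no field names and no type information.
print(raw.hex())

# Without metadata, several readings of the same 8 bytes are equally valid.
as_int_and_float = struct.unpack("<if", raw)  # the intended layout
as_two_ints = struct.unpack("<ii", raw)       # a wrong but well-formed guess
as_four_shorts = struct.unpack("<hhhh", raw)  # another well-formed guess

print(as_int_and_float)
print(as_two_ints)
print(as_four_shorts)
```

Only the format string `"<if"`, which plays the role of metadata here, tells a reader which of the three interpretations is the intended one.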
In contrast, systems that compile code to an intermediate byte code representation typically place considerably more information into the resulting output than just the binary code. The container that holds the intermediate byte code representation is sometimes called an assembly or, in Java, a class file or a JAR file (a Java Archive file, which is a zip file that can contain multiple class files). The term “assembly” as used herein refers to any such container of byte code and metadata.
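Because a JAR file is a zip archive, its layout can be inspected with ordinary zip tooling. The following sketch builds a small in-memory archive and lists its class-file entries using Python's standard `zipfile` module. The entry names and the helper functions `build_demo_jar` and `list_class_entries` are hypothetical, and the entry contents are placeholders rather than valid class files.

```python
import io
import zipfile

# Placeholder bytes, NOT a valid class file: only the 4-byte magic number
# 0xCAFEBABE that begins every Java class file is real.
FAKE_CLASS = bytes.fromhex("cafebabe") + b"\x00" * 4


def build_demo_jar() -> bytes:
    """Build a tiny in-memory archive laid out like a JAR (hypothetical contents)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as jar:
        jar.writestr("META-INF/MANIFEST.MF", "Manifest-Version: 1.0\n")
        jar.writestr("com/example/Order.class", FAKE_CLASS)
        jar.writestr("com/example/Customer.class", FAKE_CLASS)
    return buffer.getvalue()


def list_class_entries(jar_bytes: bytes) -> list[str]:
    """Return the sorted names of the class-file entries in a JAR-style zip archive."""
    with zipfile.ZipFile(io.BytesIO(jar_bytes)) as jar:
        return sorted(name for name in jar.namelist() if name.endswith(".class"))


jar_bytes = build_demo_jar()
print(list_class_entries(jar_bytes))
```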
In addition to the intermediate byte code, an assembly thus may include additional information (metadata) that describes aspects of the binary code itself. An assembly as known today includes metadata describing each class and its fields, methods, and so on, as well as custom attributes on the members of the class. It also contains information on the dependencies of the assembly, links to types that may have been moved out of the assembly, and so on. Metadata enables other software to retrieve information about the intermediate code. Therefore, an assembly is more transparent to tools examining it (meaning that information about the binary code is more discoverable) than is a traditional binary code module.
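The pairing of byte code with retrievable metadata can be sketched using Python's own compiled functions as a stand-in for an assembly. This is an analogy, not the assembly format discussed above. The function `apply_discount` and the `description` dictionary are hypothetical; `inspect.signature` and `inspect.getdoc` are standard-library calls that read metadata stored with the compiled function rather than its source text.

```python
import inspect


def apply_discount(price: float, percent: float = 10.0) -> float:
    """Return the price reduced by the given percentage."""
    return price * (1 - percent / 100)


# The compiled byte code: raw bytes that the interpreter executes.
byte_code = apply_discount.__code__.co_code

# Metadata kept alongside the byte code and readable by other tools.
signature = inspect.signature(apply_discount)
description = {
    "name": apply_discount.__name__,
    "parameters": {
        name: param.annotation.__name__
        for name, param in signature.parameters.items()
    },
    "returns": signature.return_annotation.__name__,
    "doc": inspect.getdoc(apply_discount),
}

print(len(byte_code), "bytes of byte code")
print(description)
```

A tool that sees only `byte_code` faces the opacity described earlier, whereas a tool with access to `description` can discover the function's name, parameter types and documented purpose.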
The additional metadata and its resulting transparency, however, are often accompanied by undesirable costs. For example, metadata that is intended for use by tools and code designers, but that is not used during runtime, may nonetheless occupy memory during runtime. Conversely, running code may require the intermediate byte code implementation of a type, but that implementation may not be needed at design time or by many non-runtime tools.
In known systems, all potential metadata has to be known when the source code is compiled into intermediate byte code to create the assembly. Most of the code and metadata for an assembly typically resides in a single repository, the binary assembly itself, although additional metadata may also reside in a separate program database (PDB) repository. A software developer creates these code and metadata repositories when the developer compiles source code into the desired assembly. As a result, only the original author of an assembly may be able to create or modify the assembly, because security features typically invalidate code that has been changed. The ability to author, organize, access and modify code and metadata may also be limited by constraints imposed to support the literal compilation and execution of intermediate byte code. These constraints can also impact characteristics such as performance, security, etc., in undesirable ways.
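The way a security feature can invalidate changed code may be sketched as follows. The text does not specify a signing scheme, so this sketch assumes one: a plain SHA-256 digest recorded at build time stands in for whatever mechanism a real system uses. The payload contents and the helper names `digest` and `is_valid` are hypothetical.

```python
import hashlib


def digest(payload: bytes) -> str:
    """Return the SHA-256 digest of the payload as a hex string."""
    return hashlib.sha256(payload).hexdigest()


def is_valid(payload: bytes, recorded_digest: str) -> bool:
    """Check that the payload still matches the digest recorded at build time."""
    return digest(payload) == recorded_digest


# Hypothetical assembly contents as produced by the original author.
original = b"byte code + metadata produced by the original author"
recorded = digest(original)

# A third party appends metadata after the assembly has been built.
modified = original + b" + metadata added later by a third party"

print(is_valid(original, recorded))
print(is_valid(modified, recorded))
```

Under this assumption, any change to the assembly's bytes, including the addition of metadata alone, fails the check unless the party making the change can also produce a new recorded digest.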