Performance tools and other analysis tools, such as debuggers, instruction counters, profilers, fault injectors, resource failure simulators, and optimizers, often require or use additional information that describes the object code or executable code being analyzed. In general, the purpose of an analysis tool is to analyze the code and possibly produce an output, such as the result of the analysis or new source or object code.
The additional information may be used to override default behavior for special cases or other exceptions. The additional information may also be used to instruct the tool to perform additional actions. This additional information augments the information that may be inherent in the object code file format and the debug information that is or can be automatically generated by the compiler, linker, and other tools involved in the process of generating the object code.
One conventional method of providing the additional information to a tool is to pass the information on the command line when invoking the tool, in the environment of the computer, or in a command file that the tool processes either automatically or as specified by an analysis tool setting. Analysis tool settings include command line options, environment settings, and command files.
There are several problems with the conventional techniques. One problem is that the analysis tool settings exist independently of the source code from which the code is generated. When changes are made to the source code, changes may be necessary for the analysis tool settings too. The changes require a high degree of coordination, since the people responsible for changes to the source code may not be the same people who are responsible for changes to the analysis tool settings.
Another problem is that the references in the analysis tool settings to the code may depend on the target environment for which the object code is generated. Examples of target environments are microprocessor architectures, such as the Intel x86 or the Intel IA-64, and platforms, such as Microsoft Windows 2000 or Microsoft Windows CE. One example of this problematic dependency is that the names of C and C++ functions and data as identified in object code do not match the names used in the source code. To further complicate the situation, the names differ between architectures such as the x86 and the Intel IA-64. The names can also embody different calling conventions for a single architecture, such as __cdecl, __stdcall, and __fastcall, all of which are supported for the x86 processor by Microsoft's Visual C++. Supporting multiple target environments often requires multiple independent sets of analysis tool settings.
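As a rough illustration of the calling-convention dependency, the sketch below models how a 32-bit x86 Microsoft toolchain decorates C-linkage function names: a leading underscore for __cdecl, a leading underscore plus @ and the argument byte count for __stdcall, and a leading @ plus @ and the byte count for __fastcall. The names CallConv and decorate_c_name are hypothetical and introduced only for this example; C++ mangled names are considerably more involved.

```cpp
#include <cstddef>
#include <string>

// Hypothetical enum: the three x86 calling conventions named in the text.
enum class CallConv { Cdecl, Stdcall, Fastcall };

// Hypothetical helper: returns the object-code name that corresponds to a
// C-linkage function on 32-bit x86, given the total size of its arguments.
std::string decorate_c_name(const std::string& name, CallConv cc,
                            std::size_t arg_bytes) {
    switch (cc) {
    case CallConv::Cdecl:
        return "_" + name;                                    // foo -> _foo
    case CallConv::Stdcall:
        return "_" + name + "@" + std::to_string(arg_bytes);  // foo -> _foo@8
    case CallConv::Fastcall:
        return "@" + name + "@" + std::to_string(arg_bytes);  // foo -> @foo@8
    }
    return name;  // not reached for valid enum values
}
```

A settings file that refers to foo by its object-code name would therefore need a different entry for each calling convention, even though the source-level name never changes.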
More specifically, where the source code is written in C++, factors beyond the calling convention may complicate the dependency of the analysis tool settings on the target environment. The names in object code generated from C++ are usually mangled names, which are also known as decorated names. The purpose of a mangled or decorated name is to include not only the name of the function or data being identified, but also its type (e.g. char foo is decorated differently than int foo). The type information that is included in the mangled name, and even the algorithm for generating the decorated name, can vary between target environments, which introduces further instances where multiple sets of analysis tool settings are necessary.
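The following sketch hints at why type information ends up in the name: two overloads share the source-level name foo but are different functions with different types. It uses typeid(...).name() only as a stand-in, since that string is implementation-defined and is not the linker symbol itself; the helper function names are hypothetical.

```cpp
#include <string>
#include <typeinfo>

// Two distinct functions that share one source-level name.
void foo(char) {}
void foo(int) {}

// Hypothetical helpers: return the implementation-defined encoding of each
// overload's pointer type. Common compilers produce different strings for
// the two, much as they produce different mangled symbol names.
std::string foo_char_type_name() {
    return typeid(static_cast<void (*)(char)>(&foo)).name();
}

std::string foo_int_type_name() {
    return typeid(static_cast<void (*)(int)>(&foo)).name();
}
```

An analysis tool setting that names only foo cannot say which of the two is meant, while a setting that uses the mangled form is tied to one compiler's encoding.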
Another problem related to mangled names is that different configurations of a single set of source code for a single target environment may differ in the type of some item, which results in different mangled names for the different configurations.
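A minimal sketch of such a configuration difference, assuming a hypothetical build macro UNICODE_BUILD and hypothetical names text_char and log_text: the same source line declares a function whose parameter type, and therefore whose mangled name, depends on how the build is configured.

```cpp
#include <type_traits>

// UNICODE_BUILD is a hypothetical configuration macro for this example.
#ifdef UNICODE_BUILD
typedef wchar_t text_char;
#else
typedef char text_char;
#endif

// One source-level name; the parameter type follows the configuration.
void log_text(const text_char* message) { (void)message; }

// True when this translation unit was built in the narrow configuration.
constexpr bool narrow_build = std::is_same<text_char, char>::value;

// True when log_text takes const char*, i.e. has the narrow signature.
constexpr bool log_text_signature_is_narrow =
    std::is_same<decltype(&log_text), void (*)(const char*)>::value;
```

Because the signature changes with the macro, a settings file that lists the mangled name of log_text would need one entry per configuration.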
Yet another problem related to mangled names is that the names are meant to be meaningful to computer software but not to users. The mangled names are sometimes very cryptic. If users have to specify mangled names in their analysis tool settings, they are more likely to make errors than if they are able to use the same names used in the source code.
Another problem is that names in the source code may not need to be unique. For example, C and C++ allow functions to be declared static. Distinct functions in different source code files may have the same name. The same is true of data. This is a further complication because it requires the analysis tool to define a mechanism to qualify the specified name by the source code file or, more likely, the object code file in which it resides. The name or path of the object code file may vary depending on the target environment, introducing further complications. The name or path may also vary from machine to machine, introducing yet another complication.
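One way to picture the qualification mechanism is a settings table keyed on the pair of object file and symbol name rather than on the symbol name alone. The class ToolSettings, its methods, and the file names used with it are hypothetical and serve only as a sketch.

```cpp
#include <map>
#include <string>
#include <utility>

// Key: (object file, symbol name). Needed because two static functions in
// different files may share the same name.
using SymbolKey = std::pair<std::string, std::string>;

// Hypothetical settings table for an analysis tool.
class ToolSettings {
public:
    // Records an action for one symbol in one object file.
    void set(const std::string& object_file, const std::string& symbol,
             const std::string& action) {
        settings_[SymbolKey(object_file, symbol)] = action;
    }

    // Returns the recorded action, or an empty string if there is none.
    std::string get(const std::string& object_file,
                    const std::string& symbol) const {
        auto it = settings_.find(SymbolKey(object_file, symbol));
        return it == settings_.end() ? std::string() : it->second;
    }

private:
    std::map<SymbolKey, std::string> settings_;
};
```

The object file component of the key is exactly the part that changes with the target environment or the build machine, which is why such settings are fragile.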
Still another problem occurs when a function is inlined at the position in the source code where the function is referenced, and the function loses its separate identity. When a function is inlined, the implementation of that function is expanded in the object code at the point corresponding to a reference in the source code. The object code for the inlined function no longer has an identity of its own. In many cases the author of the source code is not aware when inlining occurs. The lack of identity makes it difficult or impossible to refer to this inline instance in the analysis tool settings. Because the compiler, linker, or other tools can perform inlining without the source code author's knowledge, the author or analysis tool user may not be aware that an analysis tool setting is necessary.
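A small sketch of the situation, using the hypothetical functions square and sum_of_squares: whether the compiler actually expands square at each call site depends on the compiler and its optimization settings, but when it does, the object code for sum_of_squares contains two copies of the multiplication and no separately named square to refer to.

```cpp
// Candidate for inlining: the compiler may expand the body at each call
// site instead of emitting a call to a separately named function.
inline int square(int x) { return x * x; }

// After inlining, the object code here may contain only the arithmetic for
// a * a + b * b, with nothing that an analysis tool setting could identify
// as "square".
int sum_of_squares(int a, int b) {
    return square(a) + square(b);
}
```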
Still yet another problem is related to C++ templates or similar source constructs. While the source code contains a single definition of the template item, the object code may contain many instances, and the author of the source code may not be aware when the compiler instantiates template items or how these instances are identified in the compiler-generated mangled names. These are further cases where it is difficult to identify the names in the analysis tool settings, or where the source code author or analysis tool user is unaware that a setting is necessary.
All of the problems described above can be avoided by taking the information that is provided in the analysis tool settings and annotating the source code with it. By having the annotations in the source code, the problems associated with having to know the names to specify in the analysis tool settings, or having to maintain multiple sets of analysis tool settings for different target environments or build configurations, are avoided.
Another conventional solution is to annotate the source code with the information that would otherwise be provided in the analysis tool settings.
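To illustrate the template case, the sketch below has one source-level definition, the hypothetical template largest, which the compiler instantiates once for int and once for double. Each instantiation is a separate function in the object code with its own mangled name; the wrapper function names are also hypothetical.

```cpp
// A single definition in the source code.
template <typename T>
T largest(T a, T b) {
    return a < b ? b : a;
}

// Each use with a new type makes the compiler generate another instance.
int largest_int(int a, int b) {
    return largest(a, b);  // instantiates largest<int>
}

double largest_double(double a, double b) {
    return largest(a, b);  // instantiates largest<double>
}
```

Nothing in the source marks where the instances are created, so the author may not know that two object-code functions exist or what they are called.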
There are several possible forms of source code annotations. One form is source code comments, such as /**/ or // in C and C++. This form of annotation is used in some source code analysis tools. Because the comments have a special form, the annotation is recognized by the analysis tool but treated as a comment by the compiler, as intended in the language definition. There are a number of problems that make this form unsuitable for annotating object code. One problem is that compilers are designed to ignore comments, though a compiler could be designed to recognize some of these annotations on its own and handle them in some special way. Another problem is related to the capability of preprocessing C and C++ source code, which expands #include file references and macro definitions. The resulting files usually have comments removed, which results in the loss of all the source code annotations. Another problem is that annotations expressed in comments cannot make use of the preprocessing facilities of languages like C and C++, which are important for allowing annotations to vary according to target environment, build configuration, or any other factor desired by the source code author and/or analysis tool user.
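The loss of comment annotations during preprocessing can be sketched as follows. The marker //@tool: is a made-up annotation syntax, and strip_line_comments is a deliberately simplified stand-in for a preprocessor's comment removal: it handles only // comments and does not account for string literals.

```cpp
#include <sstream>
#include <string>

// Hypothetical marker that an analysis tool looks for inside comments.
const char kMarker[] = "//@tool:";

// Returns true if the source text still contains a tool annotation.
bool has_annotation(const std::string& source) {
    return source.find(kMarker) != std::string::npos;
}

// Simplified comment removal: drops everything from "//" to the end of
// each line, then terminates every line with a newline.
std::string strip_line_comments(const std::string& source) {
    std::istringstream in(source);
    std::string line;
    std::string out;
    while (std::getline(in, line)) {
        out += line.substr(0, line.find("//"));
        out += '\n';
    }
    return out;
}
```

For an input such as "int counter; //@tool: ignore", has_annotation is true before stripping and false afterwards, so a tool that runs on preprocessed output never sees the annotation.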
Another form of annotation is a compiler pragma, such as the #pragma directive in C and C++. This form of annotation has problems because it is source-line oriented and cannot appear at arbitrary locations in the source code. It too has limitations with regard to preprocessing. A #pragma directive cannot appear within a C/C++ macro that can then be placed where desired in the source code.
Another form of annotation is to insert source code that gets translated into object code in the generated object file. For example, C runtime printf calls may be added as a debugging aid or as a trace of program execution. A resulting problem is that, for any particular configuration, the annotation code is present in the resulting executable. The executable code is often considerably larger because of this. There may also be overhead in execution time because of the additional object code. The overhead can be avoided by having multiple build configurations that differ only in whether or not the annotations are present in the object code. At best, this is additional overhead for the developer, who must maintain and produce two different configurations. At worst, it defeats the purpose of the analysis tool, whose intent is to analyze the result of the build configuration without the additional code.
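A minimal sketch of this form of annotation, assuming a hypothetical configuration macro ENABLE_TRACE and a hypothetical TRACE macro: one build configuration compiles the printf calls into the object code, and the other compiles them out.

```cpp
#include <cstdio>

// ENABLE_TRACE is a hypothetical configuration macro for this example.
#ifdef ENABLE_TRACE
#define TRACE(...) std::printf(__VA_ARGS__)  // trace code is in the object code
#else
#define TRACE(...) ((void)0)                 // trace code is compiled out
#endif

// Hypothetical traced function: prints its arguments only in trace builds.
int add_traced(int a, int b) {
    TRACE("add_traced(%d, %d)\n", a, b);
    return a + b;
}
```

The two configurations produce different object code for add_traced, so a tool that analyzes the traced build is not analyzing the code that ships in the untraced one.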