Modern optimizers use feedback-directed optimization to generate better object code. The object code is first generated, then executed with training data, and then re-generated using information gathered during that execution, so that the object code is optimized for that particular training data. Feedback-directed optimization improves the performance of the generated object code by giving the compiler information about the kind of data, as represented by the training data, that the code will process.
FIG. 1 illustrates the process of feedback-directed optimization used to optimize an object code program, according to the prior art. The first step 10 is to compile the program with instrumentation. Source code 12 is compiled by compiler and optimizer instrumentation 14 to generate annotated object code 16. The second step 20 is to run the program with the training input data. Annotated object code 16 is run with training data 22 to generate execution statistics 24 and program output 26 for the program run with that particular training data. The third step 30 is to optimize the program based upon the generated execution statistics. The execution statistics 24 generated in the second step and the annotated object code 16 generated in the first step are used by the optimizer 32 to generate optimized object code 34. The fourth step 40 is to execute the program with actual, rather than training, input data. The optimized object code 34 generated in the third step is executed with the actual input data 42 to generate optimized program output 26'.
Note that if the actual input data 42 used in the fourth step differs significantly from the training data 22 used in the second step, the optimization will be less effective and the performance of the executed object code will suffer accordingly. The more closely the training data 22 resembles the actual data 42, the better the optimized object code 34 will be. Thus, the success of feedback-directed optimization depends in large part on the quality of the training data used to generate the execution statistics.
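The dependence on training data can be made concrete with a small sketch. It assumes, for illustration only, that the cost of a program is the number of ordered case tests needed to match each input token, and that one test order has been tuned for training data dominated by the token "d". The names `cost`, `tuned_order`, and `untuned_order` are hypothetical.

```python
def cost(order, data):
    """Total comparisons needed to match each token by testing cases in order."""
    return sum(order.index(token) + 1 for token in data)

# Test order assumed to have been tuned for training data dominated by "d".
tuned_order = ["d", "c", "b", "a"]
untuned_order = ["a", "b", "c", "d"]

similar_data = list("ddddcddd")    # resembles the training data
different_data = list("aaaabaaa")  # differs significantly from it

print(cost(tuned_order, similar_data), cost(untuned_order, similar_data))      # 9 31
print(cost(tuned_order, different_data), cost(untuned_order, different_data))  # 31 9
```

Under these assumptions the tuned order is cheaper on data resembling the training data, and more expensive than the untuned order on data that differs significantly from it.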
The feedback-directed optimization of the prior art has several shortcomings. First, the optimized object code 34 is optimized only for specific training data 22, which may be representative of only a particular processor architecture at a particular revision level. If the optimized object code 34 is later run on a different processor architecture or a different revision level, it will no longer be optimized for that architecture or revision level. There is therefore an unmet need in the art to be able to easily and readily optimize object code that may be run on different processor architectures and revision levels as required.
Second, the optimized object code 34 is generated based upon given training data 22 and is therefore optimized for that particular training data 22. If the program may be expected to process data sets that are substantially different from the training data, the object code generated will be non-optimal for those data sets. This, of course, indicates that it would always be preferable to use training data 22 that matches or at least closely resembles the actual data 42 to be used with the object code.
Given pragmatic considerations, however, the actual data will not always be available to serve as the training data 22 for the feedback-directed optimization method of the prior art. Due to confidentiality concerns, software vendors are generally unwilling to make the source code of their software applications available, and their customers are likewise unwilling to provide proprietary data to the software vendors for use as training data. In situations such as this, the final executable object code provided to the customer has not been trained on the customer's data and is therefore not optimized object code. There is therefore an unmet need in the art to be able to generate optimized object code without the need for using training data to do so.
Third, the feedback-directed optimization method of the prior art requires access to the program source code. As mentioned previously, software vendors are understandably reluctant to make the source code of their software applications available and thus feedback-directed optimization may not be a feasible option for object code optimization. There is therefore an unmet need in the art to be able to generate optimized object code even where there is no access to the program source code.
Fourth, optimization of executable object code using the feedback-directed optimization method is a static approach that occurs when the optimized object code is generated and before the optimized object code is even run with actual data. Due to the static nature of the optimization using the feedback-directed method, the object code is not capable of being dynamically optimized in real-time as the program itself is executing the object code. Dynamic optimization of object code during execution of the program provides the obvious advantage of ensuring that the program is optimized for the actual data being run, even if the data changes. There is therefore another unmet need in the art to be able to dynamically optimize object code of a program in real-time as the program is being executed.
Fifth, the multi-stage build process required for feedback-directed optimization is complex and cumbersome, which discourages potential users from adopting it. As shown in FIG. 1, the feedback-directed optimization of the prior art requires four separate steps in order to run a program with optimized object code. These steps must be performed and supervised by the user. Further, as discussed above, even after all of these steps have been performed, there is no guarantee that the optimized object code that is generated will in fact be optimal for the actual data to be run with the program. There is therefore an unmet need in the art to be able to optimize object code with minimal complexity and minimal supervision required of the user of the program.