Software often does not make the best use of the hardware on which it runs, for several reasons. For one, most compiler optimizations are machine dependent. When generating an executable for general distribution, software vendors must make decisions about what kind of machine the user is likely to be using. One vendor may choose to optimize its code for the newest processor available, while another may opt for a different approach. Besides the processor type, other elements of the user's system, such as the cache size and bus speed, affect how the machine performs and therefore bear on how code should be optimized to perform best on that system. Additionally, a user's hardware configuration may change over time due to hardware upgrades. This further complicates the choices to be made in generating optimized code.
Another factor affecting how software should best be optimized is the manner in which the software will be used. That is, the functions utilized and the data set upon which the program operates affect how the code should be optimized. Again, software vendors are left with the task of determining the most likely uses and the type of data set that will be used. As with optimizations directed to specific hardware, these choices are not optimal for all possible users of the software.
To generate an optimized executable, software vendors can utilize profile-guided optimization. Historically, profile-guided optimization in the compiler has been done by inserting instrumentation into the code to be optimized, compiling the code, and then executing the instrumented code on a representative machine with a representative data set. The instrumentation provides feedback that allows the software vendor to make adjustments to the code to reach optimum performance on the test machine. Current systems to enable profile collection and usage in the compiler are tedious, and consequently usage of profile feedback among software vendors is very low. In addition, a software vendor may not have access to representative program inputs. As explained above, the software vendor usually has to choose a single target machine configuration when optimizing the binary that is shipped, and this choice is often non-optimal. Hence, to generate a high-performance executable, knowledge of accurate usage profiles and the target machine is imperative. Ideally, code should be optimized for an individual user, and the optimization should allow for changes over time due to hardware upgrades and changes in usage.
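The instrument, run, and feed-back cycle described above can be illustrated with a minimal sketch. It is written in Python for brevity and is not a compiler implementation: the names `instrument`, `profile`, `parse_record`, `handle_error`, and `hot_functions`, as well as the 50% threshold, are all hypothetical and chosen only for illustration. The sketch counts calls while a stand-in "representative" data set is processed, then reports which functions dominate the profile and would therefore be the natural candidates for aggressive optimization.

```python
from collections import Counter
from functools import wraps

# Hypothetical profile store: function name -> number of calls observed.
profile = Counter()

def instrument(func):
    """Wrap func so that every call is recorded in the profile."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        profile[func.__name__] += 1
        return func(*args, **kwargs)
    return wrapper

@instrument
def parse_record(line):
    # Common path: split a well-formed record into fields.
    return line.split(",")

@instrument
def handle_error(line):
    # Rare path: a malformed record is simply dropped.
    return None

def run(lines):
    """Process a data set using the instrumented functions."""
    results = []
    for line in lines:
        if "," in line:
            results.append(parse_record(line))
        else:
            handle_error(line)
    return results

def hot_functions(counts, threshold=0.5):
    """Return names whose share of all recorded calls is at least threshold."""
    total = sum(counts.values())
    if total == 0:
        return []
    return [name for name, n in counts.most_common() if n / total >= threshold]

# Assumed stand-in for a representative data set.
representative_input = ["a,b", "c,d", "e,f", "bad"]
run(representative_input)
hot = hot_functions(profile)
```

With a different input, for example one consisting mostly of malformed records, the same program would yield a different profile and a different set of hot functions, which is why a profile gathered on the vendor's test data may not suit a particular user.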