In computing, benchmarking is the act of running a computer program, a set of programs, or other operations in order to assess the relative performance of an application or program, normally by running a number of standard tests and trials against it. The term benchmark may also be used to describe the specially designed benchmarking programs themselves.
Benchmarking is often associated with assessing the performance characteristics of computer hardware, for example, the computational performance of a processing device. The technique is, however, also applicable to software. Software benchmarks are run against computer applications or programs, such as compilers, middleware, and database management systems, to assess their performance. A benchmark may be designed to mimic a particular type of workload on a piece of software under certain conditions. Thus, benchmarks provide a method of comparing the performance of various programs across different system architectures or configurations.
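As a minimal sketch of the idea, the following Python example runs the same workload against two programs and records a timing for each. All names here (run_benchmark, insertion_sort, make_reversed_list) are hypothetical and chosen only for illustration, and the reverse-ordered list is an assumed stand-in for "a particular type of workload"; a real benchmark would use a workload representative of the software under test.

```python
import statistics
import time


def run_benchmark(workload, make_input, trials=5):
    """Time `workload` once per trial and return the median time in seconds."""
    timings = []
    for _ in range(trials):
        data = make_input()  # rebuild the input so each trial starts from the same state
        start = time.perf_counter()
        workload(data)
        timings.append(time.perf_counter() - start)
    return statistics.median(timings)


def insertion_sort(items):
    """Deliberately simple quadratic sort, used as the second program under test."""
    result = list(items)
    for i in range(1, len(result)):
        key = result[i]
        j = i - 1
        while j >= 0 and result[j] > key:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = key
    return result


def make_reversed_list(size=500):
    """Hypothetical workload generator: a reverse-ordered list of integers."""
    return list(range(size, 0, -1))


# Both programs receive the same workload, on the same machine, in the same process.
results = {
    "builtin_sorted": run_benchmark(sorted, make_reversed_list),
    "insertion_sort": run_benchmark(insertion_sort, make_reversed_list),
}
```

The median is used here rather than the mean so that a single slow trial (for instance, one interrupted by another process) has less influence on the reported figure; that is a design choice of this sketch, not a requirement of benchmarking in general.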
In some cases, benchmark results for software provided by different vendors are compared in order to rank the performance of that software. Knowing which product performs better is useful, for example, when choosing between them. However, the results of a benchmark may be affected by factors other than the particular piece of software being tested. Any variation in the physical hardware used to run the benchmark, in the configuration or optimization of the system (e.g., the operating system), or in the configuration of any resources used by the application under test (e.g., a database) may affect the results. For example, if a benchmark is performed for a program from a first vendor on a machine with a much faster processing device, the performance of that program may appear superior when the difference is really attributable to the processing device. If the benchmarks for different pieces of software are run at different times, in different locations, on different machines, or by different testers, it may be difficult, if not impossible, to replicate the same conditions for each benchmark. This may lead to inconsistent results that do not accurately reflect the relative performance of the different software applications or programs.
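One way to guard against this kind of mismatch, sketched below under stated assumptions, is to store a description of the environment with every result and to refuse to compare results whose environments differ. The helper names (environment_fingerprint, record, comparable) are hypothetical, and the fields captured are only an illustrative subset; they do not cover every factor named above, such as operating system tuning or database configuration.

```python
import os
import platform


def environment_fingerprint():
    """Collect a small, illustrative description of the machine running the benchmark."""
    return {
        "machine": platform.machine(),        # hardware architecture, e.g. x86_64
        "system": platform.system(),          # operating system name
        "release": platform.release(),        # operating system release
        "python": platform.python_version(),  # interpreter version
        "cpu_count": os.cpu_count(),          # number of logical processors
    }


def record(name, seconds, environment=None):
    """Bundle a timing with the environment in which it was measured."""
    if environment is None:
        environment = environment_fingerprint()
    return {"name": name, "seconds": seconds, "environment": environment}


def comparable(result_a, result_b):
    """Treat two results as comparable only if their recorded environments match."""
    return result_a["environment"] == result_b["environment"]
```

For example, two results recorded on the same machine pass the check, while a result tagged with a different (here invented) processor description does not:

```python
a = record("vendor_a", 1.20)
b = record("vendor_b", 0.95)
other = dict(environment_fingerprint(), machine="hypothetical-faster-cpu")
c = record("vendor_c", 0.40, environment=other)

print(comparable(a, b))
print(comparable(a, c))
```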