1. Field of the Invention
The present invention relates to a performance evaluation device, a performance evaluation information managing device, a performance evaluation method, a performance evaluation information managing method, a performance evaluation system, a storage medium, and a computer program product. More particularly, the present invention is suitable for use as a performance evaluation device for successively measuring the performance of an information processing system under a plurality of different measurement conditions, and as a performance evaluation information managing device for managing the performance evaluation information on the information processing system obtained by the measurement.
2. Description of the Related Art
Conventionally, a performance measurement test called a benchmark test (hereinafter also referred to as a ‘benchmark’) is adopted as a method of measuring the performance of an information processing system (information processing apparatus) in order to compare and evaluate the performance of the hardware and software of the information processing system under various conditions. In the benchmark, a measurement condition consisting of a plurality of parameters of the information processing system to be measured is input and defined each time one benchmark is executed, and the information processing system executes processing based on the defined measurement condition, whereby its performance is measured. The performance evaluation information (benchmark result data) obtained from the measurement is used, for example, to evaluate the performance of a newly configured information processing system and to judge whether or not the information processing system satisfies the desired performance requirements.
For example, when the performance of a mail server is to be measured as the information processing system, the number of CPUs that the mail server has, the number of users using the mail server, the data size of the electronic mails transmitted and received, and so on are defined as the measurement condition. Then, according to the defined measurement condition, as many client terminals and operators as there are users are prepared, and the operators manually access the mail server to measure its performance. Alternatively, the defined measurement condition is converted into data (a program) as a virtual process, and the client terminals execute the virtual process to measure the performance of the mail server.
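As a purely illustrative sketch, the measurement condition of the mail server example can be pictured as a small data record that is converted into data for the client terminals. The class name, the field names, and the use of JSON are assumptions made here for illustration only; the conventional art described above does not specify any particular data format.

```python
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class MailServerCondition:
    """Hypothetical measurement condition for the mail server example."""
    num_cpus: int         # number of CPUs that the mail server has
    num_users: int        # number of users using the mail server
    mail_size_bytes: int  # data size of each transmitted/received mail


def to_virtual_process_data(condition: MailServerCondition) -> str:
    """Convert a condition into data that client terminals could read.

    JSON is an assumed format, chosen only to keep the sketch short.
    """
    return json.dumps(asdict(condition), sort_keys=True)


# One measurement condition; the values are arbitrary illustrations.
condition = MailServerCondition(num_cpus=2, num_users=100, mail_size_bytes=4096)
payload = to_virtual_process_data(condition)
print(payload)
```

In this picture, every change of the measurement condition requires a new record of this kind to be prepared before the next benchmark can be executed.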
Further, when the performance of the information processing system is to be measured under a different measurement condition after being measured under a certain measurement condition, an operator inputs and redefines the new measurement condition, and the information processing system executes the processing based on the redefined measurement condition, whereby the performance is measured. In short, the input of the measurement condition by the operator and the execution of the benchmark are repeated, so that the performance of the information processing system is measured under a plurality of different measurement conditions. The measurement results are then evaluated and compared to determine the specifications of the information processing system, for example the mail server, that best conform to its usage conditions.
In executing the conventional benchmark described above, however, the following problems exist.
-First Problem-
Every time the benchmark is to be executed, the operator is required to define the measurement condition, that is, to carry out the work of inputting a new measurement condition and of correcting it.
Therefore, even when benchmarks are to be executed under a plurality of different measurement conditions, the successive benchmarks cannot be executed automatically. In other words, the operator has to confirm that the benchmark under one measurement condition has ended and then carry out the input work and the correction work to redefine a new measurement condition, which causes the problem of extremely low efficiency in executing the performance measurement.
Further, for example, when measurement conditions such as the number of CPUs that the information processing system has, the memory size, and so on are changed between benchmarks, it is conventionally necessary to physically install or remove a CPU, memory, and so on according to the measurement conditions. Therefore, also when the number of CPUs, the memory size, and so on are changed, the operator has to install or remove a CPU, memory, and so on in order to define the new measurement condition after confirming that the benchmark under one measurement condition is finished.
Moreover, depending on the kind of benchmark and the measurement condition, the processing (the execution of the benchmark) sometimes takes many hours. For example, the execution result of the benchmark is sometimes obtained (the benchmark is finished) late at night. In such a case, since the operator is often absent, the next measurement condition is not input until the operator resumes work the next morning, and the intervening time is wasted.
-Second Problem-
In defining the measurement condition, a different measurement condition is defined for each information processing system (each application of the information processing system) to be measured. For example, when the performance of a mail server is measured as the information processing system, measurement conditions such as the data size of the electronic mails transmitted and received, the number of accessing users, the access frequency of the users, and so on are defined. When the performance of a web server is measured, measurement conditions such as the number of simultaneous accesses, the data size of one content item provided to the users, the presence or absence of an applet and a servlet, and so on are defined.
Since a different measurement condition is thus defined for each information processing system, each information processing system has its own specialized user interface that the operator uses to input the measurement condition. Therefore, when the same operator tries to execute benchmarks for various different information processing systems, the operator has to define the measurement conditions using a different user interface for each information processing system, which makes the work complicated and inconvenient.
-Third Problem-
As stated in the first problem above, in order to execute benchmarks under a plurality of different conditions, the input work for every item in the measurement condition has to be repeated even when only a part of the measurement condition is changed. This results in extremely low work efficiency and sometimes causes the operator to make mistakes in inputting the measurement condition. Even when only the changed part of the measurement condition is input again, if there are many combinations of measurement conditions, the operator sometimes forgets to correct a measurement condition for which re-input is necessary or forgets to input some measurement condition.
Therefore, when benchmarks are executed under a plurality of different conditions, an enormous amount of time and labor is required to confirm that the measurement condition actually input is the same as the measurement condition that was intended, in order to prevent input mistakes and the like. Further, when an input mistake or the like is not detected in this confirmation work, the benchmark cannot be executed under the desired measurement condition.
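The scale of this input and confirmation work can be illustrated with a short calculation. The sketch below assumes three hypothetical parameters of a mail server with arbitrarily chosen values; neither the parameter names nor the values are taken from any actual benchmark. It enumerates every combination that an operator would have to input and confirm by hand.

```python
from itertools import product

# Hypothetical parameter values; chosen only to illustrate the growth.
cpu_counts = [1, 2, 4]
user_counts = [10, 100, 1000]
mail_sizes_bytes = [1024, 65536]

# Every combination is one measurement condition to input and confirm.
conditions = [
    {"num_cpus": cpus, "num_users": users, "mail_size_bytes": size}
    for cpus, users, size in product(cpu_counts, user_counts, mail_sizes_bytes)
]

# 3 * 3 * 2 = 18 measurement conditions.
print(len(conditions))
```

Because the number of combinations is the product of the number of values of each parameter, adding a single value to any one parameter adds a whole group of conditions, each of which is a further opportunity for an input mistake or an omission.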
Further, in the conventional benchmark, when a benchmark under a certain measurement condition is executed in an information processing system, the resultant performance evaluation information is individually (independently) saved and managed for each information processing system or for each benchmark. Hence, it is very difficult to search for and compare the performance evaluation information obtained from benchmarks executed in the past.
For example, suppose that the performance of the information processing system under a certain measurement condition is to be evaluated by referring to the performance evaluation information obtained from benchmarks executed in the past. In this case, the person trying to refer to the past performance evaluation information (the person evaluating the performance) manually searches performance evaluation reports and the like, relying on his or her own memory, and extracts an example in which a benchmark was executed in a similar information processing system and under a similar measurement condition.
However, the contents (items) of the performance evaluation information and the output forms thereof vary depending on the tools used to execute the benchmarks, and each writer of a performance evaluation report writes in a different manner. Moreover, an example in which a benchmark was executed in a similar information processing system and under a similar measurement condition cannot always be found. Therefore, searching for and comparing the performance evaluation information obtained from benchmarks executed in the past is troublesome and, in addition, results in an extreme waste of the time and labor of the person evaluating the performance.