In recent years, parallel operation and decentralization of databases have been encouraged for the purpose of improvement in performance or reliability. FIG. 22 shows a typical example of a configuration of a conventional database system.
The performance of the conventional database system in searching a database is improved by creating a plurality of replica databases 104 relative to one master database 103. A front-end server 102 decentralizes queries and poses them to the replica databases 104. Otherwise, one database may be regarded as a use database and the other may be regarded as a standby database. The databases may be switched in case of a fault. Reliability may thus be improved.
In the past, as a method for decentralizing queries and posing them to a plurality of servers, a method of determining a server, to which a query is assigned, according to the round robin or a method of measuring a load such as a CPU use rate and assigning a query to a server that incurs a light load has been adopted.
For example, the number of resources used by each batch job is calculated. When a plurality of jobs must be performed, a sum total of resources is calculated. If the sum total of resources exceeds the number of resources permitted by a server, no new job is submitted in order to avoid contention for resources. This method of scheduling batch jobs has been proposed (refer to, for example, Patent Document 1, JP-A No. 311795/1997).
In order to further improve the performance of the database system, it is important to avoid contention for resources such as contention for a database buffer or a disk among the replica databases 104.
Contention for a database buffer (cache) will be described in conjunction with FIG. 23 and FIG. 24 below. A disk 201 is connected to a server 200, and three tables (205, 206, and 207) are stored in the disk 201. When a query 1 (220) for requesting data in table 1 (205) is posed to the server 200, required data is transmitted from table 1 (205) in the disk 201 in response to the query.
Inputting or outputting data to or from a disk (disk inputting/outputting) requires more processing time than reading or writing data to or from a memory. A memory is therefore adopted as a cache for inputting or outputting data to or from a disk. The cache (database buffer) is a portion of a memory 203, and divided into data pages 204 having a storage capacity of several kilobytes. The data pages 204 are organized according to the least recently used (LRU) algorithm. Data used frequently is left in the memory 203.
If a query 1 (220) for data of table 1 (205) is issued, the database buffer is searched. If the required data of table 1 (205) is stored in the database buffer in the memory 203, a result can be obtained without the necessity of disk inputting/outputting.
On the other hand, as shown in FIG. 24A, if a query 2 (221) for data of table 2 (206) is posed to the server 200, since required data is not stored in a memory 300, disk inputting/outputting is performed. As shown in FIG. 24B, a portion of the memory 301 is overwritten with the data of table 2 (206), that is, a queried content.
For example, assume that two queries for requesting large results whose contents are different from each other are continuously processed. In this case, the database buffer in the memory 300 is overwritten in response to the queries. Every time a query is processed, disk inputting/outputting takes place. In contrast, when queries request a common content, the possibility that data stored in a cache (database buffer) may be usable is high. The result can be obtained with little disk inputting/outputting. Thus, queries have compatibility. The performance of a database system varies depending on the order of posing queries.
Methods for avoiding contention for a database buffer (cache) include a method according to which a plurality of queries that requests the same data shares the same database buffer. For example, assume that two different queries request the same data, and the data is larger than the storage capacity of a database buffer. In this case, when the first half of data has been read in response to the first query, if the second query (query 2) is issued, the first half of data may have already been deleted from the database buffer. In other words, if data requested with a query is large, the first half of data may be overwritten with the second half thereof being read in response to the first query (query 1).
In this case, the first half of data is reread from a disk in response to the query 2, and the database buffer is overwritten with the read data. However, the data overwritten with the read data (second half of data read in response to the query 1) is data also requested with the query 2. Unnecessary disk inputting/outputting takes place. In order to cope with this problem, a method of reading data not from the beginning but from the same point as the point, from which reading is started in response to the query 1, during the second reading, while using the buffer in common between the queries 1 and 2 has been implemented in the SQL server that is a database management product from Microsoft Corp. (refer to, for example, Non-patent Document 1,    http://msdn.microsoft.com/library/default.asp?url=/libra ry/en-us/dnsql2k/html/thestorageengine.asp).
Methods for effectively utilizing other buffer include a method of dividing a cache into fields and associating the fields with the levels of importance of data. An Oracle that is a database product from Oracle Corp. divides a buffer into three fields of keep, default, and recycle fields, and allows a user to designate a size or a management method for each field. For example, data that must reside in a buffer is stored in the keep field, data that may be overwritten is stored in the recycle field, and the other data is stored in the default field. Thus, the buffer is divided into fields. For data allocation, the property of data must be well understood. A setting must be modified every time a data size or a system configuration is changed.
In a decentralized/parallel computer environment, an increase in the cost of operation as well as performance poses a critical problem. If the number of computers to be managed increases, the cost of operation increases.
In business employing an information technology (IT) system, a system failure leads to a great loss. It is therefore essential that the system should be operated on a stable basis. For example, in order to avoid a system failure derived from concentration of accesses on the system, servers are added according to the number of accesses.
In an environment in which a configuration or setting is changed frequently, it is not easy to manage things according to the change. Every time the environment changes, if each computer is manually tuned or an environmental setting is manually optimized, the costs of operation and management increases. Methods of lowering, the costs of operation and management include automatic tuning. The SQL server from Microsoft Corp. and other products permit automatic tuning of parameters representing the settings of each database. However, the automatic tuning does not fully cope with the parallel/decentralized environment.
Other methods of lowering the cost of operation include a method of consigning the whole or part of operation and management to an outside management service provider (MSP) who contracts to manage an IT system. For example, a monitoring service provided by such an MSP includes monitoring of performance information such as a CPU use rate of a server, and sounding of an alarm if a predetermined threshold is exceeded.