1. Technical Field
This invention generally relates to query governors in a computer database system, and more specifically relates to a database query governor for a parallel computer database system.
2. Background Art
Databases are computerized information storage and retrieval systems. A database system is structured to accept commands to store, retrieve and delete data using, for example, high-level query languages such as the Structured Query Language (SQL). The term “query” denominates a set of commands for retrieving data from a stored database. The query language requires the return of a particular data set in response to a particular query.
Optimization and execution of a database query can be a resource-intensive and time-consuming process. Further, the larger the database, the longer the time needed to execute the query. In order to prevent an excessive drain on resources, many databases are configured with query governors. A query governor prevents the execution of large and resource-intensive queries by referencing a defined threshold. If the cost of executing a query exceeds the threshold, the query is not executed. The query governor has a configuration file that determines which databases an instance of the governor monitors and how it manages them.
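The threshold check described above can be illustrated with a minimal sketch. The names GovernorConfig and admit_query, the single numeric cost estimate, and the choice to let queries against unmonitored databases pass are all assumptions made for illustration; they do not describe any particular governor product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GovernorConfig:
    """Hypothetical stand-in for the governor's configuration file."""
    monitored_databases: frozenset  # databases this governor instance watches
    cost_threshold: float           # maximum allowed estimated query cost


def admit_query(config, database, estimated_cost):
    """Return True if the query may execute, False if the governor blocks it."""
    # Assumption: databases outside the configuration are not governed at all.
    if database not in config.monitored_databases:
        return True
    # The query is refused only when its cost exceeds the threshold.
    return estimated_cost <= config.cost_threshold
```

For example, with a threshold of 1000 cost units on a database named "sales", a query estimated at 1500 units would be refused while one estimated at 500 units would run.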
Many large institutional computer users are experiencing tremendous growth of their databases. One of the primary means of dealing with large databases is to distribute the data across multiple partitions in a parallel computer system. The partitions over which the data is distributed can be logical or physical. Prior art query governors have limited features when used in parallel computer systems. In particular, they do not consider the network resources of the multiple networks in a parallel system with a large number of interconnected nodes.
Massively parallel computer systems are one type of parallel computer system that has a large number of interconnected compute nodes. A family of such massively parallel computers is being developed by International Business Machines Corporation (IBM) under the name Blue Gene. The Blue Gene/L system is a scalable system in which the current maximum number of compute nodes is 65,536. The Blue Gene/L node consists of a single ASIC (application specific integrated circuit) with 2 CPUs and memory. The full computer is housed in 64 racks or cabinets with 32 node boards in each rack.
The Blue Gene/L supercomputer communicates over several communication networks. The 65,536 computational nodes are arranged into both a logical tree network and a 3-dimensional torus network. The logical tree network connects the computational nodes in a tree structure so that each node communicates with a parent and one or two children. The torus network logically connects the compute nodes in a three-dimensional lattice-like structure that allows each compute node to communicate with its six nearest neighbors in a section of the computer.
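The six-neighbor connectivity of a three-dimensional torus can be sketched as follows. The function name torus_neighbors, the (x, y, z) coordinate scheme, and the 8 x 8 x 8 dimensions in the usage note are assumptions chosen for illustration and are not taken from the Blue Gene/L design.

```python
def torus_neighbors(node, dims):
    """Return the six nearest neighbors of a node in a 3-D torus.

    node: (x, y, z) coordinates of the compute node (hypothetical scheme).
    dims: (dx, dy, dz) size of the torus section in each dimension.
    """
    x, y, z = node
    dx, dy, dz = dims
    # The modulo wraps coordinates around each dimension, which is what
    # distinguishes a torus from a plain three-dimensional mesh.
    return [
        ((x + 1) % dx, y, z), ((x - 1) % dx, y, z),
        (x, (y + 1) % dy, z), (x, (y - 1) % dy, z),
        (x, y, (z + 1) % dz), (x, y, (z - 1) % dz),
    ]
```

In an assumed 8 x 8 x 8 section, the node at (0, 0, 0) has (1, 0, 0) and (7, 0, 0) as its neighbors along the first dimension, because the lattice wraps around at its edges.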
On parallel computer systems in the prior art, the query governor is not able to effectively control the total use of resources across multiple nodes with one or more networks. Without a way to more effectively govern queries, computer systems administrators will continue to have inadequate control over database queries and their use of system resources.