The present invention is related generally to software and computer programs. More specifically, the present invention is related to software for sizing and specifying database management system hardware.
Relational databases came into common use in computers over twenty years ago. Despite improvements in database software and new methodologies, relational databases remain the mainstay of database management systems. Hardware vendors originally supported proprietary database management systems which ran primarily on machines manufactured by the hardware vendor. Software developers later developed database management systems that were more open and ran on computers made by several vendors. The database management systems were also ported to run under various operating systems. This gave the advantage of spreading the cost of development over more sites and also uncoupled the dependence between hardware vendors and software vendors. Third party support and training became more common.
Database management systems also became separated into client-side software and server-side software. This meant that the server-side software was decoupled from software having to do with the display, use, and formatting of the data received from the database. In particular, server-side software often handled mostly queries of existing data along with updates of existing data and insertion of new data.
Modem electronic commerce, such as commerce over the Internet or business-to-business electronic commerce, has placed increased demands on many servers. This has also made frequent upgrades necessary. Company mergers and acquisitions frequently make it necessary to incorporate large amounts of data from unexpected sources. Customer expectations also make it necessary to upgrade hardware to keep up with the faster response times users expect even though system loads may be increasing as well.
When upgrading or replacing database servers, it is necessary to have a good idea as to the size of the database that will have to be implemented on the new server. The data storage as well as storage of many different indices will all increase the amount of data required. It may be necessary to come up with a good estimate of the required amount of mass storage in a short time period, as during bid evaluations, during sales presentations, or repeatedly during scenario building. The person supplying the input may have only a rough idea as to what the database mass storage requirements are.
In some situations, a required or desired transaction handling capability is based mainly on the capability of a known system. It may be the case that a given brand name DBMS server is believed to satisfy a current or future requirement, and the tpmC (a standardized transactions per second benchmark) capability of that DBMS server is available from the vendor and/or from the tpmC organization database. A better performance value could, in theory, be derived from a series of more specific user supplied information or requirements. It may be the case that the user has a more specific idea about what the requirements are for a system, such as detailed transaction specific information.
If a system were created for determining hardware requirements for a DBMS, it would be desirable to allow for some margin or headroom in capacity. If a system was specified to run too close to capacity, then it might fall behind and deny service when bursts of activity exceeded the average workload. It might be desirable to factor in hardware utilization limits for hardware components. An upper utilization limit could reduce the likelihood of over-capacity and the resulting bottleneck. A lower utilization limit could reduce the likelihood of under-capacity and the resulting excessive spending.
What would be desirable are methods for calculating the data storage requirements for a relational database table not requiring interatively calculating the number of number of index levels, the number of data pages, the number of index pages, and the data storage requirements for relational database systems, capable of using either estimated or detailed requirements as input. What would be useful are methods capable of determining required server size based on either required transaction processing benchmarks on more detailed SQL processing requirements, while factoring in hardware utilization limits. A system having capabilities for sizing, specifying, and comparing DBMS systems all in the same software package would be desired.
The present invention provides a computerized method for use by a human user for sizing, specifying, and comparing hardware for database management systems. The present invention includes sizing, specifying, and comparison functions all in one commonly accessible software package which can share information between the various functionalities. The computerized method preferably includes software executable on a general purpose computer by an end user.
The software can accept a variety of inputs from a user in order to size and specify the hardware required to run the DBMS software. The inputs can include a DBMS software system name selectable from a list of DBMS software systems. Such a list can include, for example, DBMS software systems including SQL Server and Oracle, including various versions or releases of these systems. The inputs can also include required database input parameters and required transaction workloads. In one system, the mass storage is calculated in terms of gigabytes of disk space needed to hold the database tables and indices required to handle the database specified by the user inputs. The present invention can then calculate the required database mass storage size as a function of the database input parameters. The computerized system can also determine the hardware required to handle the transaction processing rate. The required hardware can include the number of processors, the processor speed, and the number and speed of network interface cards.
In one system, the input parameters include estimates such as a number of tables, a total amount of data, an average number of columns per row, an average row size, a percent variable length columns, and an average size of variable length columns per table. In one system, the database input parameters include page size, a fill factor, an average number of non-clustered indexes per table, an average number of fixed length fields per non-clustered index, an average number of clustered indexes per table, and a number of variable length fields per cluster index.
In one aspect of the invention, the transaction workload is specified at least in part by a required transactions per minute value. One software portion can specify the transaction workload at least in part by specifying a composition for a plurality of transactions. In one software portion, a plurality of SQL statements is specified for each of the transactions. In another software portion, each SQL statement includes at least one SQL parameter such as the number of records the SQL statement operates upon.
In systems accepting either detailed transaction compositions or higher level workload compositions, the workload can be determined and output in a standard format such as the tpmC format used by the Transaction Processing Performance Council (TPC) organization. The tpmC values measuring the workload can be used to call up various systems having data in a database such as the TPC-C database. The comparator portion of the system can then be used to compare two systems to each other, comparing the two tpmC values and comparing the two prices and/or price per performance values.