A portion of the disclosure of this patent document contains material which is subject to copyright protection. The copyright owner has no objection to the facsimile reproduction by anyone of the patent document or the patent disclosure as it appears in the Patent and Trademark Office patent file or records, but otherwise reserves all copyright rights whatsoever.
A Computer Program Listing Appendix, containing one (1) total file on compact disc, is included with this application.
The present invention relates generally to database systems and, more particularly, to optimization of page allocation (e.g., of data pages or the like) in such systems.
The present invention relates generally to information processing environments and, more particularly, to optimization of page allocation (e.g., of data pages or the like) in a data processing system, such as a Database Management System (DBMS).
Computers are very powerful tools for storing and providing access to vast amounts of information. Computer databases are a common mechanism for storing information on computer systems while providing easy access to users. A typical database is an organized collection of related information stored as xe2x80x9crecordsxe2x80x9d having xe2x80x9cfieldsxe2x80x9d of information. As an example, a database of employees may have a record for each employee where each record contains fields designating specifics about the employee, such as name, home address, salary, and the like.
Between the actual physical database itself (i.e., the data actually stored on a storage device) and the users of the system, a database management system or DBMS is typically provided as a software cushion or layer. In essence, the DBMS shields the database user from knowing or even caring about underlying hardware-level details. Typically, all requests from users for access to the data are processed by the DBMS. For example, information may be added or removed from data files, information retrieved from or updated in such files, and so forth, all without user knowledge of underlying system implementation. In this manner, the DBMS provides users with a conceptual view of the database that is removed from the hardware level. The general construction and operation of a database management system is known in the art. See e.g., Date, C., An Introduction to Database Systems, Volume I and II, Addison Wesley, 1990; the disclosure of which is hereby incorporated by reference.
DBMS systems have long since moved from a centralized mainframe environment to a de-centralized or distributed environment. One or more PC xe2x80x9cclientxe2x80x9d systems, for instance, may be connected via a network to one or more server-based database systems (SQL database server). Commercial examples of these xe2x80x9cclient/serverxe2x80x9d systems include Powersoft(trademark) clients connected to one or more Sybase Adaptive Server(trademark) database servers (formerly, Sybase SQL Server(trademark)). Both Powersoft(trademark) and Sybase Adaptive Server(trademark) are available from Sybase, Inc. of Emeryville, Calif.
To support multiple clients concurrently accessing data, a database system must employ some sort of concurrency control. Concurrency control is the way that a database synchronizes clients"" or users"" access to data to ensure that they do not destroy each other""s work. Locking is the way that most databases handle concurrency control. While one user has a lock on a subset of the database, that data is xe2x80x9cblockedxe2x80x9d or protected from other users"" attempts to make changes to it. One user""s changes have to be xe2x80x9ccommittedxe2x80x9d before another user can see or further change the same data.
As the migration to client/server continues, each day more and more businesses are run from mission-critical systems which store information on server-based SQL database systems, such as Sybase Adaptive Server(trademark). As a result, increasingly higher demands are being placed on server-based SQL database systems to provide enterprise-wide decision support. Accordingly, there is much interest in improving the performance of such system, particularly in terms of execution speed and reliability.
Increasingly, database servers are deployed on multi-processor computers, such as Hewlett-Packard""s V-class UNIX-based machines (e.g., HP V2500 RISC-based computer) running on a 64-bit operating system (e.g., HP UNIX). In such a configuration, multiple instances of the database server software are run, one instance (e.g., UNIX process) per processor, all operating on a single shared memory (e.g., 32 GB of main memory). Expectedly, as the number of processors is increased, the contention on the single shared memoryxe2x80x94a critical system resourcexe2x80x94also increases. As soon as contention for the shared memory arises, each of the multiple processes will encounter wait states, such as xe2x80x9cspinningxe2x80x9d (e.g., consuming CPU cycles while waiting availability of a spin lock), or will stall in order to wait for its turn for access to the shared memory. As a result, overall system scalability and hence throughput suffer.
Thus today, the contention on shared memory poses a bottleneck to true scalability for database servers running on multi-processor computers. Moreover, the problem is particularly acute in environments employing database servers for online transaction processing (OLTP), such as e-commerce applications, where contention is exacerbated by a high number of concurrent transactions. What is really desired is for the database system to scale in relation (e.g., linearly) to the number of processors added, so that system throughput scales accordingly. The present invention includes optimization techniques that assist in reaching that ideal goal.
In a multiuser database environment with a large number of clients performing a large number of transactions (e.g., inserts, updates, and deletes), such as an OLTP application executing hundreds of transactions per minute for providing e-commerce support, a huge number of database page splits occur. In such a scenario, the system must frequently access the database""s allocation pages in order to update the information pertaining to page-management housekeeping. Each allocation page is protected in this concurrent environment using a latch. Since the conventional approach is for the database system to traverse a chain of allocation pages and grab the first available allocation page, the net result is that the latch protecting the allocation page becomes a single point of contention in a high-volume transaction environment.
The present invention comprises a Client/Server Database System with improved page allocation methodology, particularly applicable in a multiprocessor database configuration. In an exemplary embodiment, the system includes one or more Clients (e.g., Terminals or PCs executing client database access software) connected via a Network to a Server. The Server, operating under a server operating system (e.g., UNIX), includes a Database Server System, such as Sybase Adaptive Server(trademark). In general operation, Clients store data in and retrieve data from one or more database tables resident on the Server by submitting SQL commands, some of which specify xe2x80x9cqueriesxe2x80x9dxe2x80x94criteria for selecting records of a table for a particular operation (e.g., updating). Queries that modify data are typically provided using a data manipulation language (DML), such as using the DML capabilities of the well-known SQL syntax.
The present invention introduces a methodology for optimized page allocation, using the following approach. During page allocation, once an allocation page with free space has been located in the system""s global allocation map or GAM (i.e., using routine page allocation steps), the page identifier for that allocation page is stored in a hint array, as part of that object""s (i.e., table""s) object descriptor or des. For a table undergoing a lot of splits (i.e., insert-intensive object), the system may store an array of allocation page xe2x80x9chintsxe2x80x9d (allocation page identifiers) in the des for that object (e.g., table). The array itself comprises a cache of some number of slots (e.g., eight slots), each of which stores an allocation page identifier (xe2x80x9chintxe2x80x9d) obtained from the GAM (from a GAM traversal occurring during the page allocation process) or is empty (i.e., has not been filled from the GAM and is therefore set to the initial value of null). For example, the first slot may store the page identifier for one allocation page. A second slot may store the page identifier for another, completely different allocation page, and so forth and so on. On subsequent passes through the page allocation process, the system can, rather than going to the GAM, randomly select (e.g., randomly hash on) a particular slot of the cache. The random hash may be done in a conventional manner, such as using the Standard C rand function; see e.g., Microsoft Developer Network Library/Visual Studio 6.0, Part Number X03-55262, available from Microsoft Corp. of Redmond, Wash., the disclosure of which is hereby incorporated by reference. In this manner, the incoming clients will, instead of competing for the same first-available allocation page, randomly select among multiple available allocation pages. Since each allocation page itself is protected by a separate latch, the system is able to decrease contention during the page allocation process by randomly accessing different elements of the xe2x80x9chintxe2x80x9d array. In this manner, the system can avoid the computationally-expensive process of page allocation that is usually required as well as avoid contention for the first-available allocation page.