Two commonly used methods to scale a database are by a shared nothing approach and a shared disk approach.
To scale by a Shared Nothing Approach, the data is partitioned to multiple partitions, each partition contains a portion of the data and each partition is managed by a single dedicated database server. Scaling is achieved as partitioning reduces the data size for each server, and the usage of additional servers for the new partitions.
As within each server, multiple threads may operate concurrently, the server maintains a lock manager that coordinates between the threads. For example, if two threads are to update the same resource (such as a row or a block), the lock manager may grant update permission to the first thread, when the update of the first thread is done, the lock manager may provide the permission to the second thread. Without the coordination of the lock manager, both threads may update the same resource at the same time whereas these simultaneous updates may lead to erroneous result. If two threads are reading the same data, the lock manager may grant both threads read permission while at the same time prevent a third thread from updating the data. When the two threads complete the read process, the lock manager can provide the third thread the permission to update the data.
Within a database, locks can be granted over different database resources. Examples of such resources are database, table, partition, segment, block and row. In addition, locks can be of different types. For example:                Read Lock—to allow reading the content of a particular resource. This lock is sometimes called a Shared Lock meaning that multiple threads (or processes) can read (share) the resource data concurrently. This lock may prevent a different thread (or process) to update the content of the resource. For example, two threads receive a Read Lock on the same Row such that both threads are provided with the permission to examine the content of the Row concurrently. However, the third thread that requests a Write Lock on the Row is denied, or is being placed on wait until the reading threads released their Read Lock.        Write Lock—it may be provided to allow updating a particular resource or add a new resource to the database. A Write Lock may be provided to a particular thread (or process) such that other threads (or processes) are prevented from acquiring conflicting locks over the same resource. Examples of locks that conflict a Write Lock are a Read Lock and a different Write Lock. A Write Lock is also referred to below as a lock for update as it allows changing the content of a particular resource or adding a new resource to the database. A non limiting example for the usage of a Write Lock is the update of the content of a particular row in the database. Another non limiting example is adding a new row to the database, the process may involve a Write Lock over an empty row (first) and updating the content of the empty row with the new data (second).        Exclusive Lock—to allow a single thread (or a single process) the permission to operate (read and or write) on a resource. For example, a grant of an exclusive lock over a table would let only a single thread the permission to operate with the table data. Even if the thread is reading data, no other thread (or process) may read the data as long as the exclusive lock is granted.        
A request for a lock may be issued by a thread or a process that operates over a particular resource and with the purpose of maintaining the integrity of the data and avoiding the corruption of the data.
A thread or a process requesting a lock request can represent a user using the database, a session of the database, a particular Node in the cluster or any other type of process or object that operates over the database data or over a resource of the database.
A request for a lock over a resource by said thread or process can be done when there is a need to operate using the resource data and the type of the lock requested depends on the type of operation. Some non-limiting examples are as follows:                a) When a resource is read, a Read Lock may be used to prevent an update of the resource data while the resource data is being retrieved.        b) When a resource is updated, a Write Lock may be used to prevent other processes from reading or updating the resource data at the same time.        c) When a new resource is added, a Write Lock may be used to prevent other processes from adding or using the same resource at the same time. For example, users are adding data to a database, the database is organized such that the new data is placed in rows, and a Write Lock over an empty row may be used to prevent the usage of the particular empty row for different new data by a different thread or process. In this example, only the process that holds a grant for a Write Lock over an empty row in the database can use the empty row for new data. Once the data is placed in the row, the row is not empty and thus it cannot be considered as an empty row by a different process and even if the Write Lock is being released.        
When the thread or process completes operating the resource data, or when the thread or process determines that there is no need to maintain a lock over a particular resource, they may release the lock to allow other conflicting locks of other threads and processes to be acquired.
An explanation of locking types and compatibility is available from C. J. Date, An Introduction to Database Systems, Volume 1, Fourth Edition, Pages 422-424, section 18.5 Locking.
Within the database server software, the Lock Manager is the process that receives the lock requests (from the different threads or processes), analyzes the requests and provides (or denies) the grants.
In addition, the Lock Manager can place a request which can't be granted on a wait list and provide the grant later in time when the request is not conflicting with granted requests. For example, a thread receives a grant for a Write Lock request and a different thread that requests a Read Lock is placed on a wait list. When the first thread releases the lock, the Lock Manager grants the Read Lock to the second thread.
Within this application, we call the executing database server software with its virtual or physical machine a Database Node (or simply a Node). Note—we hereon use the term Node and “server” interchangeably.
A discussion on a Shared Disk Machines and a Shared Nothing Machines is also available by Hector Garcia-Molina, Jeffrey D. Ullman, Jennifer Widom at Database Systems The Complete Book, Second Edition, pages 986-989.
In a Shared Nothing system, all the threads operate within the same Node, and therefore share the same memory space. The locking requests of each thread may utilize the shared memory in the Node. For example, the request can be placed by the threads on a linked list which is processed by the Lock Manager.
In a Shared Disk Approach, the data is not partitioned. Multiple Nodes are connected to a shared storage and each of the servers can access (read and/or write) the shared data. A shared disk solution is implemented as a cluster of Nodes that share the same data. Scaling is achieved by adding computing power by means of adding Nodes which operate concurrently over the data. If more computing power is needed, additional database servers (Nodes) can be added to the cluster.
Compared to a Shared Nothing Approach, scaling is provided inherently by the database cluster—the database cluster is built such that more computing power can be added (by adding more database servers to the cluster). With a Shared Nothing Approach, scaling is done by the developer, as the data partitioning needs to be tailored by the developer. The partition of the data needs to achieve even distribution of the data among the servers, as well as even distribution of users among the servers. It also needs to consider the composition of the queries. For example, the partitioning should be done such that most queries would be satisfied on a single server. These are difficult tasks, as sometimes there is no good partitioning approach and it involves considerations which are different for each application. In addition, the type of queries may not be known in advance.
An example of the complexity imposed by a Shared Nothing approach is the following: (1) the database contains customers and catalog information. (2) Many of the queries are joining customers and catalog information. (3) To scale, the customer data is partitioned to multiple partitions. However, the developer is now facing the question of where to store the catalog. If the catalog is to be stored on a dedicated server—the queries that join the customers and the catalog info would need to retrieve information from two separate servers—from the server managing the customer information and the server managing the catalog information. If the catalog information is duplicated on each server, every update to the catalog information needs to be executed on each of the servers that maintain the catalog information.
For the developer, the Shared Disk Approach offers a simple automated way to scale. When large data sets are managed by a Shared Disk database, there is no need to partition the data. Multiple servers have read and write access to the data. To scale, more servers are added to the cluster. As each of the Nodes in the cluster may have concurrent processes, each of these servers may have a local lock manager similar to the lock manager described above with respect to the Shared Nothing Approach server. However, a database system that provides a Shared Disk Approach solution needs (in addition to synchronization of multiple threads within each server) to synchronize the lock requests at a cluster level as different nodes may have processes that are processing the same data at the same time. To synchronize the requests of different servers, a Distributed Lock Manager is used. This lock manager synchronizes requests from multiple nodes in a cluster. As the different servers don't have a shared memory space, the lock requests and grants are done by messages that are sent over a network. This process is by far slower than the process of the Local Lock Manager within a server that is based on a shared memory—a process of the lock manager based on a shared memory may run in nanoseconds whereas messages over network may be even measured in milliseconds (depending on the speed of the network).
There is accordingly a need in the art to reduce the time needed to satisfy lock processes that are done to manage a shared disk (and similar) database systems.
Glossary of Terms:
For clarity of explanation, there follows a glossary of additional terms used frequently throughout the description and the appended claims. Some of the terms are conventional and others have been coined:
Database and Database Management System (DBMS) is explained by: Hector Garcia-Molina, Jeffrey D. Ullman, Jennifer Widom at Database Systems The Complete Book, Second Edition. See also C. J. Date, An Introduction to Database Systems, Volume 1, Fourth Edition. See also http://endotwikipediadotorg/wiki/Database and http://endotwikipediadotorg/wiki/Database_management_system (database and Database Management System from Wikipedia).
Node—the Database Management System (DBMS) server software with its virtual or physical machine. We hereon use the term Node and “server” interchangeably. The Node executes the database management software. Two non limiting examples would be a computer server executing the Oracle database software and a computer server executing the IBM DB2 software. A Node provides management of data such that users are provided with the methods and tools (among others) to update and query the data. For example, a Node, executing an Oracle database software such that data manipulation requests (such as Insert, Update, Delete and Select) issued by users trigger processes that update and query data. Cluster (or a Database Cluster)—Multiple Database Nodes that provide management of the data. These Nodes may be connected via a network and may provide management of Shared Data. Shared Data—Data which is being managed by multiple Nodes.
A relational database is a database that conforms to relational model theory. The software used in a relational database is called a relational database management system (RDBMS). In a relational database, data is logically partitioned into tables and is organized as rows within the tables. The physical implementation may be different, for example, rows may be physically stored within blocks within files. Some of the resources of the database may be indexes which are not part of the relational model theory and which may be organized in blocks within files. The examples below may be provided using the logical organization of rows within tables or may be provides using some physical organization such as rows within blocks. However, the processes of the invention are not bound to a particular logical or physical organization.
Lock Manager—a process that receives lock requests from different processes or threads analyzes and manages the requests such that the integrity of data is maintained. In the context of the invention, a Lock Manager may issue particular locks, without explicit lock requests, when it determines that a particular process or a particular Node is interested with the particular lock.
Local Lock Manager (LLM)—a Lock Manager that analyzes and manages the lock requests of different threads (or processes) by utilizing a shared memory space. This Lock Manager and the requesting threads (or processes) exist within the same Node.
A discussion on locking is also available by Hector Garcia-Molina, Jeffrey D. Ullman, Jennifer Widom at Database Systems The Complete Book, Second Edition, Chapter 18, Concurrency Control, section 18.4 Locking Systems with Several Lock Modes pages 905-913.
Distributed Lock Manager (DLM)—a Lock Manager that analyze and manage lock requests of different threads (or processes) of different Nodes. This Lock Manager and the different threads communicate by sending messages over a network. The DLM manages a Cluster of Nodes. The management of a Local Lock manager or a Distributed Lock Manager (together with other processes) maintains the integrity of the data. Some databases maintain compliance to the ACID set of rules. ACID (atomicity, consistency, isolation, durability) is a set of properties that guarantee that database transactions are processed reliably. More details on ACID properties of a database are available at http://endotwikipediadotorg/wiki/ACID (From Wikipedia).
A DLM operates by monitoring the processes in the different Nodes. It analyzes lock requests that are issued from different processes (or Nodes), provides grants to non conflicting requests (for example multiple processes may read the same data at the same time) and sets an order among conflicting requests. For example, a Write Lock is provided to process A, a grant (for read or write request) for process B is delayed until the lock of process A is released.
SQL (from Wikipedia)—sometimes referred to as Structured Query Language is a programming language designed for managing data in relational database management systems (RDBMS). SQL was one of the first commercial languages for Edgar F. Codd's relational model, as described in his influential 1970 paper, “A Relational Model of Data for Large Shared Data Banks”. http://endotwikipediadotorg/wiki/SQL-cite_note-codd-relational-model-4
SQL became the most widely used database language. SQL is used also as a data manipulation language (DML) which is a family of syntax elements similar to a computer programming language used for inserting, deleting and updating data in a database. Storage Engine (from Wikipedia)—A Database Engine (or “storage engine”) is the underlying software component that a database management system (DBMS) uses to create, read, update and delete (CRUD) data from a database. Most database management systems include their own Application Programming Interface (API) that allows the user to interact with their underlying engine without going through the user interface of the DBMS.
Many of the modern DBMS support multiple database engines within the same database. For example, MySQL supports InnoDB as well as MyISAM and other Storage Engines.