Relational databases generally require that every record be uniquely identified by one column or a combination of columns. The column or columns that uniquely identify records are declared to be the primary key (PK) of the table.
A B+tree data structure is often used to manage database records. In an example implementation, nodes that are leaves of the B+tree are data pages of database records, and nodes that are parents of the leaves are index pages. The index pages contain primary key values for referencing records in the data pages. The leaves are sequentially linked to provide sequential access to database records.
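The example arrangement described above can be illustrated with a minimal Python sketch. It assumes a two-level tree with a single index page; the names `DataPage`, `IndexPage`, `find_page`, and `scan` are hypothetical and chosen only for illustration, and page splitting, balancing, and on-disk layout are omitted.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from typing import Iterator, Optional


@dataclass
class DataPage:
    """Leaf node (hypothetical): records keyed by primary key, plus a link to the next leaf."""
    records: dict[int, str] = field(default_factory=dict)
    next_page: Optional["DataPage"] = None


@dataclass
class IndexPage:
    """Parent node (hypothetical): primary key values that route a lookup to a data page."""
    separators: list[int]      # separators[i] is the smallest key stored on children[i + 1]
    children: list[DataPage]

    def find_page(self, key: int) -> DataPage:
        # Count the separators <= key to pick the child whose range covers the key.
        return self.children[bisect_right(self.separators, key)]


def scan(first: DataPage) -> Iterator[tuple[int, str]]:
    """Sequential access: follow the leaf links from left to right."""
    page: Optional[DataPage] = first
    while page is not None:
        for key in sorted(page.records):
            yield key, page.records[key]
        page = page.next_page


# Three linked data pages under one index page (illustrative values only).
p1 = DataPage({1: "a", 2: "b"})
p2 = DataPage({5: "c", 7: "d"})
p3 = DataPage({9: "e"})
p1.next_page = p2
p2.next_page = p3
root = IndexPage(separators=[5, 9], children=[p1, p2, p3])

print(root.find_page(7) is p2)          # True
print([key for key, _ in scan(p1)])     # [1, 2, 5, 7, 9]
```

In this sketch the index page holds only key values and child references, while the records themselves reside on the linked data pages, mirroring the index page and data page roles described above.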
In some applications, a database management system (DBMS) can generally insert records into the database tables with no concurrency problems. For example, for a pre-assigned key such as a social security number (SSN), the order in which records keyed on the SSNs are presented for insertion into the database may be random. Thus, the random order in which records are inserted into the B+tree spreads the inserts across different pages, which minimizes concurrency issues.
In other applications, the primary key may be generated as a monotonically increasing value (e.g., 1, 2, 3) and data records inserted sequentially, which may restrict concurrency in some DBMSs. Some DBMSs cannot handle concurrent inserts of sequential records because the records are logically inserted on the right-most page of the B+tree, and the selected database recovery approach uses page-level recovery rather than record-level recovery. Thus, no more than one transaction at a time can insert a value into a given page.
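The contrast between randomly ordered and monotonically increasing keys can be illustrated with a short Python sketch. It assumes an existing tree whose data pages partition the key space at fixed separator values, and it models only which page each insert targets; the names `target_page` and `pages_touched` and all key values are hypothetical.

```python
from bisect import bisect_right
from collections import Counter


def target_page(separators: list[int], key: int) -> int:
    """Return the index of the data page whose key range covers `key`."""
    return bisect_right(separators, key)


def pages_touched(separators: list[int], keys) -> Counter:
    """Count how many inserts in a batch land on each data page."""
    return Counter(target_page(separators, k) for k in keys)


# Assumed layout: five data pages (0..4); page 4 is the right-most page.
separators = [1000, 2000, 3000, 4000]

# Monotonically increasing keys, all larger than every existing key.
sequential_keys = range(4501, 4509)

# Pre-assigned keys presented in no particular order (illustrative values).
preassigned_keys = [4321, 117, 2950, 3606, 1544, 802, 2210, 4877]

seq = pages_touched(separators, sequential_keys)
rnd = pages_touched(separators, preassigned_keys)

print(dict(seq))            # {4: 8}
print(max(rnd.values()))    # 2
```

Under the page-level restriction described above, the eight sequential inserts in this sketch all contend for page 4 and would be serialized, whereas at most two of the pre-assigned keys target the same page.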
Where restrictions on concurrency may pose a problem, a user may insert dummy records and then delete the records to create the index and data pages for later use in sequentially inserting legitimate records. The insertion and deletion of dummy records permits subsequent concurrent inserts because each subsequent insert will be directed to be stored on a different empty data page. However, inserting and deleting the required dummy records may be time consuming and error prone, and may cause other performance and scaling problems.
A method and system that address these and other related issues are therefore desirable.