Current server software architectures fall into one of two categories relative to their efforts to improve on performance and scalability. In particular, existing database applications in the market, such as Oracle, SQL Server, DB2, PostgreSQL (object RDBMS), MySQL (RDBMS) and Firebird (database server), employ one of the following two architectures: (a) an architecture using a process or thread per user session (hereinafter “process/thread per session architecture”); and (b) an architecture using a process pool or thread pool for handling all of the user sessions (hereinafter “process/thread pool architecture”).
The process/thread per session architecture, wherein a process or thread is utilized per user session, is more commonly used in database systems. A schematic block diagram of a model 100 of this conventional architecture is shown in FIG. 1. According to the process/thread per session architecture, a process or thread is spawned each time any user logs in and lasts until the user has logged out. All requests from the user are undertaken in the spawned process or thread. The process/thread per session architecture has various limitations, drawbacks, and/or disadvantages.
For example, in the process/thread per session architecture, process creation is costly and is not resource-efficient, especially in online transaction processing (OLTP) systems. When a number of users are logged in, the system would be running a corresponding number of processes and hence a large amount process switching overhead is required by system resources compared to that actually employed to perform useful work during times of normal or average usage.
To ameliorate these limitations, a connection pool is typically provided at the application server level, i.e., a cache of database connections is maintained by the database so that the connections can be reused when the system needs to attend to future requests for data. With this technique, a user's OLTP need not be executed by the same process or thread every time but the result is that data cannot be cached in memory at the database level between two requests in a particular session. The data can still be cached in a memory area, accessible to all the sessions, but then it introduces synchronization overheads to access/modify the data. As an example, if the user keeps sending the same query repeatedly, it might be reasonable for the system to store the query plan and any relevant data in that user's own session memory, which can be accessed without any synchronization. However, such memory area (unique to a session) is not available at the database server level, since each request might go to a different process or thread.
A server architecture might be designed such that data is stored at the application server level memory, but this could result in too much caching at the application server level and thus in a considerable garbage collection overhead. One conventional method for avoiding heap management or garbage collection overhead is by pre-allocating pools of memory and using a custom, lightweight scheme for allocation/de-allocation. However efficient the garbage collection algorithm is, it is designed only for short-lived small objects and not for persistent information or long-lived caches. Because application server garbage collection is designed with short lived objects in mind, if such a heap is being used as a cache for database connections, then the overhead will be considerable. Moreover, some of the data-like query plans are known to be very complex to be cached at the application server level. These query plans are also known as “query execution plans,” i.e., a set of steps used to access or modify information in an SQL RDBMS. The plan basically indicates whether to do an index scan or a full table scan to get the query results.
Another disadvantage of the connection pool at the application server end is that the full utility of temporary tables cannot be exploited. Any insert/update in the database will result in recovery log overhead. This overhead is incurred to safeguard data integrity so that when the database crashes, the data can be recovered. However, for temporary cached data as discussed above, logging for crash recovery is not required and it is not necessary to provide for the typical robust log file structure to redundantly record the database update operation. In fact, ISO standards approve a less robust temporary table structure which need not be recovered after a database crash. Nevertheless, the temporary table will become tied to the database server process in this case and the connection pool does not guarantee allocation of the requests from the same user to the same database server process or thread. Hence, any temporary table activity is restricted to use within a request and cannot be spawned across requests.
Additionally, it is difficult to correlate actual system resources to the process/thread per session architecture. As every user login creates a process or thread, the number of execution units is not under the control of the database architecture and hence might result in inefficient utilization of the system resources in delivering the optimal output. One might try to estimate the number of concurrent users in order to produce a well-balanced system, but accurate estimation of concurrency depends heavily on heuristics and user behaviors which often vary from time to time.
The process/thread pool architecture, wherein a process pool or thread pool handles all user sessions, will now be discussed. A schematic block diagram of a model 200 of such an architecture is shown in FIG. 2. The process/thread pool architecture is based on the Half-Sync/Half-Async design pattern which is commonly implemented in application servers and supported in some database servers. According to the process/thread pool architecture, a pool of processes or threads takes requests from a queue one by one and executes each of them to completion. The process/thread pool architecture might also need a pool of processes or threads to communicate with the clients. The process/thread pool architecture has various limitations, drawbacks, and/or disadvantages.
For example, if there are a few long-running queries, the queries will occupy the processes or threads for a long time making them unavailable for smaller queries. Moreover, if the process pool or thread pool was created with a predetermined maximum number of processing units in mind, then the processing units may be under-utilized during the I/O service time of the requests. This disadvantage is especially true for a database server architecture where a lot of I/O is involved.
It is also difficult to build in an option to increase the hit ratio on the data cache and instruction cache whenever such opportunity presents itself in the process/thread pool architecture. Furthermore, the concurrency requirement in the process/thread pool architecture is usually handled in the same way as in the previously discussed process/thread per session architecture. Thus, there is a chance that all the shared servers might be waiting to acquire a lock or enter a monitor such that the overall system resources are under-utilized.
Another disadvantage of both the process/thread per session architecture and the process/thread pool architecture, which handle concurrent requests from different user sessions, is that such requests are executed separately.
In the context of a database server 300, as shown in FIG. 3, concurrent requests 302 from different user sessions are processed. For example, the concurrent requests 302 are n requests including a first request 304, a second request 306, a third request 308, a fourth request 310, and so on up to the nth request 312. Each of the n requests 302 involve inserting a record into the database. As noted above, in conventional architectures, the requests 302 are individually executed by the database server 300.
Thus, for each of the requests 302, the database server 300 must (i) find free storage (e.g., disk) space for the record to be inserted and (ii) write the record into the storage space. Steps (i) and (ii) are well known in the art. The database server 300 might also need to (iii) obtain a sequence value to be used to identify the inserted record within an index and (iv) enter the sequence value in the index. In step (iii), the sequence value is typically obtained from a counter that generates values based on an initial value and an offset value. For example, a counter with an initial value of 100 and an offset value of 2 will return the sequence values 100, 102, 104, 106, etc. As requests for sequence values may be processed concurrently by the counter, step (iii) typically involves a compare-and-swap (CAS) or similar instruction to insure that the sequence values are calculated based on up-to-date information.
B+ Tree indexes on such obtained sequence values are common as these values usually constitute the primary keys of a database table. For example, in an order processing system an “OrderId” field of a table stores sequence values representing unique order numbers. An index (e.g., in the form of a B+ Tree) is created (and updated via step (iv)) on the order numbers to facilitate searching the table. Suppose n orders originate at the same time for the order processing system, then there will be n index entries for the orders, each index entry being an order number and pointing to a corresponding record in the table. FIG. 4 is a diagram illustrating the structure of a B+ Tree index 400 according to a conventional server software architecture.
The database server 300 would need to separately perform the 4 steps or operations (i.e., steps (i), (ii), (iii), and (iv)) for each of the n requests 302. So, in the case of 10 concurrent requests, 40 separate operations would be required.
In general, as shown in FIG. 3, for the n requests 302, 4 times n separate operations are required (or 2 times n if the sequence values are not obtained and maintained in an index). For the first request 304, an operation 314 to find free space for the record to be inserted is performed, an operation 316 to write the record into the free space is performed, an operation 318 to obtain a sequence value is performed, and an operation 320 to update an index (e.g., the index 400) with the sequence value is performed. For the second request 306, an operation 322 to find free space for the record to be inserted is performed, an operation 324 to write the record into the free space is performed, an operation 326 to obtain a sequence value is performed, and an operation 328 to update the index with the sequence value is performed. For the third request 308, an operation 330 to find free space for the record to be inserted is performed, an operation 332 to write the record into the free space is performed, an operation 334 to obtain a sequence value is performed, and an operation 336 to update the index with the sequence value is performed. For the fourth request 310, an operation 338 to find free space for the record to be inserted is performed, an operation 340 to write the record into the free space is performed, an operation 342 to obtain a sequence value is performed, and an operation 344 to update the index with the sequence value is performed. Likewise, for the nth request 312, an operation 346 to find free space for the record to be inserted is performed, an operation 348 to write the record into the free space is performed, an operation 350 to obtain a sequence value is performed, and an operation 352 to update the index with the sequence value is performed.
There is often considerable similarity between requests that arrive concurrently (e.g., the requests 302). However, neither the process/thread per session architecture nor the process/thread pool architecture efficiently exploits this similarity.