With the advancement of technology, a huge amount of digital data is generated every day from a variety of sources. These sources include companies, firms, corporations, government bodies, banks and retail chains involved in online and offline business that use technology as a part of their operations. These sources want to analyze the data on a regular basis in order to ensure continuous and smooth running of their systems as well as to gain in-depth insights. This kind of data is known as big data, which has become a recent trend across many areas of business and technology.
In general, the data is processed by traditional processing systems for performing one or more tasks. The existing database systems keep track of the data, store the data, continuously update the data at regular intervals of time and so on. These database management systems handle millions of transactions and millions of requests on a day-to-day basis. These database management systems employ complex algorithms to look for repeatable patterns while handling the data along with an extensive amount of metadata. Furthermore, these database management systems are employed in various sectors including the banking sector, the e-commerce sector, the industrial sector and the like, which require continuous processing of the data in order to ensure smooth running of business.
Moreover, these database management systems manage the data efficiently and allow users to perform tasks with ease. In addition, the database management systems increase the efficiency of business operations and reduce overall costs. Further, the database management systems are located in a non-volatile storage of various systems. Examples of the database management systems include Oracle, Microsoft SQL Server, Sybase, Ingres, Informix and the like. The database management system stores its database on a disk storage system such as a hard disk drive or a solid state drive.
In general, the database is an organized collection of schemas, tables, queries, reports, views and other objects. The database can be stored on a stable storage system such as a hard disk and placed at a remote or local location that is reachable over a network. The database can be accessed, edited and updated through a server that transfers the requests to the processor associated with the database. Moreover, the database management system handles requests from various client machines. In addition, the database management system makes one or more changes to the data in response to the requests.
Going further, the database management systems store the data in memory for continuous use in the future. Moreover, as technology and computing evolve, more refined data, memory and process handling algorithms and techniques are required at the developer end to bridge the gap caused by inefficient, delayed or failed transfer and commit of records in the database. The huge amount of data needs to be stored in the database and, at the same time, remain available for future use in order to increase the performance of applications. However, the amount of data that can be stored is limited by the memory space in the database, due to which some amount of data is continually flushed from the disk to make room for new data. This problem is addressed by using a mechanism for caching the data into a volatile memory associated with the database.
The data is continually cached into a random access memory for temporary storage. Moreover, the data is cached so that requests from the client machines can be answered swiftly by reading the data pre-stored in the random access memory. This data corresponds to the data recently accessed by the users. In addition, the random access memory includes one or more buffer pools for reading and storing the data. As known in the art, a buffer pool is a place in system memory or on a disk that is used for caching table and index data pages as they are modified or read from the disk. Further, the buffer pool caches disk blocks to optimize block input/output (I/O). Furthermore, the primary purpose of the buffer pool is to reduce database file I/O and improve the response time for data retrieval. The database writes the data in the form of pages into the buffer pool.
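The caching behaviour described above can be illustrated with a minimal sketch. It assumes a Python dictionary standing in for the database file on disk; the class name SimpleBufferPool, the method get_page and the counter disk_reads are hypothetical names chosen for illustration and are not taken from any particular database management system.

```python
class SimpleBufferPool:
    """Toy read-through page cache; `disk` is a dict standing in for the database file."""

    def __init__(self, disk):
        self.disk = disk        # page_id -> page contents (simulated disk)
        self.pages = {}         # pages currently cached in memory
        self.disk_reads = 0     # number of simulated disk I/O operations

    def get_page(self, page_id):
        if page_id not in self.pages:
            # Cache miss: fetch the page from disk and keep a copy in memory.
            self.pages[page_id] = self.disk[page_id]
            self.disk_reads += 1
        # Cache hit (or freshly loaded page): serve from memory.
        return self.pages[page_id]


disk = {1: "page-1 data", 2: "page-2 data"}
pool = SimpleBufferPool(disk)
pool.get_page(1)   # miss: read from disk
pool.get_page(1)   # hit: served from memory
pool.get_page(2)   # miss: read from disk
print(pool.disk_reads)  # 2
```

Three page requests cause only two simulated disk reads, which is the reduction in file I/O that the buffer pool is intended to provide.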
Typically, only clean pages are written into the buffer pool to minimize the risk of data loss. In addition, the buffer pool may be associated with a single database and may be used by more than one table space. Moreover, the buffer pool space is allocated based on the requirement of the user. Further, an adequate buffer pool size is essential for good database performance as it reduces disk I/O, which is the most time-consuming operation. Large buffer pools also affect query optimization, as more of the work can be done in memory.
Going further, the buffer pool decides to flush pages from the memory to the database when the total size of the data pages stored in the memory exceeds the size of the buffer pool. Moreover, the buffer pool is configured to keep only a relevant portion of the data in the buffer pool. In addition, it is important that access to the buffer pool be easy. In simpler terms, the buffer pool should be highly concurrent so that multiple users can access the pages simultaneously. However, if the access to the buffer pool is not smooth, efficient or fast, then the time taken to access the pages is considerably increased even though the pages are already present in the buffer pool due to pre-fetching techniques.
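The flushing decision described above can be sketched as follows. The sketch assumes a least recently used replacement policy and a capacity counted in pages rather than bytes; both are simplifying assumptions, and the names LruBufferPool, read and write are hypothetical.

```python
from collections import OrderedDict


class LruBufferPool:
    """Toy pool that evicts the least recently used page once it is full."""

    def __init__(self, disk, capacity):
        self.disk = disk              # page_id -> data (simulated database file)
        self.capacity = capacity      # maximum number of cached pages (assumed unit)
        self.pages = OrderedDict()    # page_id -> [data, dirty]; oldest entry first

    def _load(self, page_id):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)   # mark as most recently used
        else:
            if len(self.pages) >= self.capacity:
                self._evict()
            self.pages[page_id] = [self.disk[page_id], False]
        return self.pages[page_id]

    def _evict(self):
        # Remove the least recently used page; write it back only if modified.
        victim_id, (data, dirty) = self.pages.popitem(last=False)
        if dirty:
            self.disk[victim_id] = data

    def read(self, page_id):
        return self._load(page_id)[0]

    def write(self, page_id, data):
        entry = self._load(page_id)
        entry[0] = data
        entry[1] = True               # page now differs from the copy on disk


disk = {1: "a", 2: "b", 3: "c"}
pool = LruBufferPool(disk, capacity=2)
pool.write(1, "a2")   # page 1 is cached and dirty
pool.read(2)          # page 2 is cached
pool.read(3)          # pool is full: page 1 is flushed to disk and evicted
```

After the last call, the modified contents of page 1 are on the simulated disk and only pages 2 and 3 remain in memory.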
Furthermore, it is highly essential that the buffer pool allows multiple users to access different data and serves that data to the users in a parallel fashion. Moreover, the data structure of the buffer pool should be such that the overhead of managing the overall buffer pool is minimal. In addition, the buffer pool maintains a table in the memory whose size is computed when the database is started based on certain parameters. The buffer pool also contains multiple lists, namely a least recently used list, a dirty page list and a free list. The multiple lists handle the access to the pages stored in the buffer pool, writing on the pages and flushing the pages back to the database. Further, the buffer pool manages the pages by using some metadata and keeping the metadata in a header. The headers are linked in the table, the least recently used list, the free list or the dirty page list. Moreover, the metadata contains location information of the page, an offset in the file system for the data page or index page, a block number and the like.
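The relationship between the table, the three lists and the page headers can be sketched as below. This is only an illustrative layout under stated assumptions: the page size of 4096 bytes, the field names of PageHeader and the class name PoolLists are hypothetical, and eviction when no free frame remains is deliberately left out.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page size in bytes


@dataclass
class PageHeader:
    block_number: int   # block number of the page in the database file
    file_offset: int    # byte offset of the page in the file
    frame: int          # index of the memory frame that holds the page


class PoolLists:
    """Bookkeeping only: a lookup table plus free, LRU and dirty page lists."""

    def __init__(self, num_frames):
        self.table = {}                            # block_number -> PageHeader
        self.free_list = list(range(num_frames))   # frames that hold no page
        self.lru_list = []                         # headers, least recently used first
        self.dirty_list = []                       # headers of modified pages

    def load(self, block_number):
        header = self.table.get(block_number)
        if header is None:
            frame = self.free_list.pop()           # eviction when empty is omitted
            header = PageHeader(block_number, block_number * PAGE_SIZE, frame)
            self.table[block_number] = header
        else:
            self.lru_list.remove(header)
        self.lru_list.append(header)               # now the most recently used
        return header

    def mark_dirty(self, block_number):
        header = self.table[block_number]
        if header not in self.dirty_list:
            self.dirty_list.append(header)

    def flush(self):
        # Return the block numbers that would be written back, then clear the list.
        flushed = [h.block_number for h in self.dirty_list]
        self.dirty_list.clear()
        return flushed


lists = PoolLists(num_frames=3)
lists.load(7)
lists.load(9)
lists.load(7)          # block 7 becomes the most recently used
lists.mark_dirty(9)
```

Every page is reached through its header, and the same header object is referenced from the table and from whichever lists currently apply to it.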
Further, the memory of the buffer pool is divided into various slots, sized according to the page size and arranged sequentially, for managing the pages. The metadata header is linked to the slots in the buffer pool. The pages can be accessed only through the table created by the buffer pool: the table leads to the header, and the header leads to the page. Moreover, the table created by the buffer pool gets locked when a particular user accesses any page, which prevents other users from reading any different page. In addition, a particular slot in the table corresponding to the page gets locked as well. Further, the slot contains multiple other headers which are also locked when the page is accessed from a particular header. For example, if a user A is accessing a page 1 and the header for the page 1 is linked with a slot 30, the slot 30 is locked and the user A gets access to the page 1. If a user B then tries to access a page 10 whose header is also contained in the slot 30, the user B is not allowed to access the page 10 because the slot 30 is presently locked, and the user B has to wait until the slot 30 is unlocked.
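The example of the user A and the user B can be reproduced with a small sketch. It models only the slot-level lock; the table-level lock described above, which would block access to every other page as well, is left out for brevity. The mapping SLOT_OF_PAGE and the function names are hypothetical, and a non-blocking acquire is used so that a refused access is reported instead of making the caller wait.

```python
import threading

# Hypothetical header-to-slot assignment mirroring the example:
# the headers of page 1 and page 10 both live in slot 30.
SLOT_OF_PAGE = {1: 30, 10: 30, 2: 31}
slot_locks = {slot: threading.Lock() for slot in set(SLOT_OF_PAGE.values())}


def try_access(page_id):
    """Try to lock the slot holding the page's header; return True on success."""
    return slot_locks[SLOT_OF_PAGE[page_id]].acquire(blocking=False)


def release(page_id):
    """Unlock the slot holding the page's header."""
    slot_locks[SLOT_OF_PAGE[page_id]].release()


a_got_page_1 = try_access(1)     # user A locks slot 30
b_got_page_10 = try_access(10)   # user B is refused: slot 30 is already locked
c_got_page_2 = try_access(2)     # a page whose header is in slot 31 is unaffected
release(1)                       # user A finishes and unlocks slot 30
b_retry = try_access(10)         # user B can now proceed
print(a_got_page_1, b_got_page_10, c_got_page_2, b_retry)  # True False True True
```

The user B is refused even though page 10 is a different page from page 1, because the lock protects the whole slot rather than the individual header.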
As a result, the efficiency of the buffer pool is decreased to a great extent. Moreover, the concurrency of the buffer pool is low, which reduces parallel processing and leads to degradation in the performance of the buffer pool. Further, the locking of the table and the slots does not allow multiple users to access different pages simultaneously. Furthermore, the present systems and methods do not differentiate between when the user wants to write on the page and when the user wants to read the page, which decreases the performance of the buffer pool. In addition, the present systems and methods do not provide a lockless table for allowing multiple users to access the pages simultaneously.
In light of the above discussion, there is a need for a method and system that overcome the above stated disadvantages and provide a more concurrent buffer pool.