1. Field of the Invention
The present invention relates generally to data processing and, more particularly, to temporary data management in shared disk cluster configurations.
2. Description of the Background Art
Computers are very powerful tools for storing and providing access to vast amounts of information. Computer databases are a common mechanism for storing information on computer systems while providing easy access to users. A typical database is an organized collection of related information stored as “records” having “fields” of information. As an example, a database of employees may have a record for each employee where each record contains fields designating specifics about the employee, such as name, home address, salary, and the like.
Between the actual physical database itself (i.e., the data actually stored on a storage device) and the users of the system, a database management system or DBMS is typically provided as a software cushion or layer. In essence, the DBMS shields the database user from knowing or even caring about the underlying hardware-level details. Typically, all requests from users for access to the data are processed by the DBMS. For example, information may be added to or removed from data files, retrieved from or updated in such files, and so forth, all without user knowledge of the underlying system implementation. In this manner, the DBMS provides users with a conceptual view of the database that is removed from the hardware level. The general construction and operation of database management systems is well known in the art. See, e.g., Date, C., “An Introduction to Database Systems, Seventh Edition”, Part I (especially Chapters 1-4), Addison Wesley, 2000.
Presently, there are three dominant architectures for building multiprocessor, high-performance transactional database systems:
Shared Everything (SE): In this architecture, multiple processors of a single computer system share a common central memory and share the same set of disks.
Shared Nothing Cluster (SNC): In this architecture, neither memory nor any peripheral storage is shared among multiple computer systems.
Shared Disk Cluster (SDC): In this architecture, multiple computer systems, each with a private memory, share a common collection of disks. Each computer system in an SDC is also referred to as a Node.
Sybase, for example, offers a database system known as Sybase® Adaptive Server® Enterprise (ASE), which is based on the SE architecture and is also referred to as an SMP system. An SMP (symmetric multiprocessing) system is a computer architecture that provides fast performance by making multiple CPUs available to complete individual processes simultaneously (multiprocessing).
Of particular interest herein are distributed SDC environments. In recent years, users have demanded that database systems be continuously available, with no downtime, as they are frequently running applications that are critical to business operations. Shared Disk Cluster systems are distributed database systems introduced to provide the increased reliability and scalability sought by customers. A Shared Disk Cluster database system is a system that has a cluster of two or more database servers having shared access to a database on disk storage. The term “cluster” refers to the fact that these systems involve a plurality of networked server nodes that are clustered together to function as a single system. Each node in the cluster usually contains its own CPU and memory, and all nodes in the cluster communicate with each other, typically through private interconnects. “Shared disk” refers to the fact that two or more database servers share access to the same disk image of the database. Shared Disk Cluster database systems provide for transparent, continuous availability of the applications running on the cluster with instantaneous failover amongst servers in the cluster. When one server is down (e.g., for upgrading the CPU), the applications are able to continue to operate against the shared data using the remaining machines in the cluster, so that a continuously available solution is provided. Shared Disk Cluster systems also enable users to address scalability problems by simply adding machines to the cluster, without the major data restructuring and associated system downtime that are common in prior SMP (symmetric multiprocessor) environments.
Most database servers implement some scheme for managing temporary data, which is often required for storing intermediate results during query processing (e.g., sorting of data) or procedural/application processing. In SMP, temporary data management is provided in the form of temporary tables and temporary databases. Also in SMP, a temporary database, unlike other databases, is not transactionally recovered in the event of a system failure. At every restart of the SMP system, a temporary database is recreated afresh and all its contents from the previous life cycle of the SMP system are lost. This scheme is not only well suited for managing temporary data but also allows the SMP system to implement various performance optimization techniques. The most notable of these optimizations is dispensing with (i.e., not following) the “write ahead logging” protocol for temporary databases. The “write ahead logging” protocol forms the basis of transactional recovery by mandating that all the changes made during a transaction be written to stable storage before the transaction is declared complete. By choosing different crash recovery semantics for temporary databases, and thus eliminating the disk I/Os required by the write ahead logging protocol, a major performance differentiation is achieved for temporary databases.
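The effect of dispensing with write ahead logging can be illustrated with a minimal sketch. The following Python model is purely illustrative and is not ASE code: the class ToyDatabase and its fields are hypothetical names, and a list stands in for the log on stable storage. It shows only that a commit against a regular database forces log records out before completing, while a commit against a temporary database does not.

```python
from dataclasses import dataclass, field


@dataclass
class ToyDatabase:
    """Hypothetical toy model; counts simulated forced log writes."""
    name: str
    is_temporary: bool = False
    log_buffer: list = field(default_factory=list)  # log records held in memory
    stable_log: list = field(default_factory=list)  # stand-in for the log on disk
    forced_writes: int = 0                          # simulated disk I/Os at commit

    def update(self, txn_id, change):
        # Every change first produces an in-memory log record.
        self.log_buffer.append((txn_id, change))

    def commit(self, txn_id):
        if self.is_temporary:
            # No crash recovery is promised for a temporary database,
            # so the log is never forced to stable storage.
            self.log_buffer.clear()
            return
        # Write ahead logging: buffered log records must reach stable
        # storage before the transaction is declared complete.
        self.stable_log.extend(self.log_buffer)
        self.log_buffer.clear()
        self.forced_writes += 1
```

In this sketch, committing the same update against a regular database and a temporary database yields one forced write for the former and none for the latter, which is the source of the performance difference described above.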
In ASE, there are three kinds of temporary tables that are stored in a temporary database:
Work tables: These are created internally to store intermediate results generated during execution of a SQL statement. They are automatically destroyed upon completion of the statement.
Temporary tables (also known as #tables): This type of temporary table is created for a session's own use. It cannot be shared across sessions, and is automatically destroyed when a session ends.
Regular tables: This type of temporary table is typically used to share information between co-operating sessions or applications.
In SMP, each session is assigned a temporary database. Work tables and temporary tables (i.e., #tables) are always created in the assigned temporary database. However, regular tables can be created in any temporary database.
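The differing lifetimes of the three kinds of tables can be summarized in a short sketch. The Python below is a hypothetical model for illustration only; the names Kind, TempDb, end_statement, and end_session are assumptions and do not correspond to any ASE interface.

```python
from enum import Enum


class Kind(Enum):
    WORK = "work table"        # lives for the duration of one SQL statement
    HASH = "#table"            # lives for the duration of one session
    REGULAR = "regular table"  # shared across sessions


class TempDb:
    """Hypothetical model of a temporary database and its tables."""

    def __init__(self, name):
        self.name = name
        self.tables = {}  # table name -> (Kind, creating session id)

    def create(self, table, kind, session_id):
        self.tables[table] = (kind, session_id)

    def drop_where(self, session_id, kinds):
        # Keep every table except those of the given kinds
        # that were created by the given session.
        self.tables = {
            t: (k, s)
            for t, (k, s) in self.tables.items()
            if not (s == session_id and k in kinds)
        }


def end_statement(db, session_id):
    # Work tables are destroyed when the statement completes.
    db.drop_where(session_id, {Kind.WORK})


def end_session(db, session_id):
    # #tables (and any leftover work tables) are destroyed with the session.
    db.drop_where(session_id, {Kind.WORK, Kind.HASH})
```

Under this model, a regular table created by a session remains in the temporary database after that session ends, whereas the session's work tables and #tables do not.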
There is much interest in database systems based on SDC architecture. In SDC, for example, a collection of Sybase ASE servers jointly manages all the data on the shared disks, i.e., a single database is used and managed by all the participating ASE servers (instances). When one of the participating instances fails, the database is transactionally recovered and continues to be available to the surviving instances. However, maintaining data coherency and concurrency in SDC involves significant cost overheads. These include the costs of managing distributed locking schemes, page transfers, inter-instance message exchanges, and the like. These costs are justified for a database that needs to be recovered upon an instance failure. However, they are not justified for the management of temporary data in SDC.
The existing semantics for a temporary database in SMP do not directly fit into SDC systems. Instead, management of tables in a temporary database in SDC should address the following issues:
Upon failure of an instance, work tables and temporary tables that are created by sessions on the failed instance must be destroyed. Work tables and temporary tables that are created by sessions on other instances must continue to be available.
Upon failure of an instance, regular tables that are created by sessions on the failed instance (as well as on other instances) must be transactionally recovered and be available to surviving instances.
The combination of part-recover/part-discard characteristics is not a usual recovery requirement in SDC. Also, this needs to be addressed without incurring the high costs that are usually involved in management of non-temporary data.
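The part-recover/part-discard requirement stated above can be expressed as a simple classification rule. The following Python sketch is illustrative only: TempTable and after_instance_failure are hypothetical names, and the sketch decides which tables fall into which category without performing any actual recovery.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TempTable:
    name: str
    kind: str      # "work", "#table", or "regular"
    instance: str  # instance whose session created the table


def after_instance_failure(tables, failed_instance):
    """Return (available, to_recover) after one instance fails.

    available:  tables that surviving instances must still be able to use
    to_recover: tables that must be transactionally recovered
    """
    available, to_recover = [], []
    for t in tables:
        if t.kind == "regular":
            # Regular tables are recovered and kept, whichever
            # instance created them.
            to_recover.append(t)
            available.append(t)
        elif t.instance != failed_instance:
            # Work tables and #tables of surviving instances stay as they are.
            available.append(t)
        # Otherwise: a work table or #table created on the failed
        # instance, which is discarded.
    return available, to_recover
```

For example, if instance A fails, a #table created on A is dropped, a #table created on B remains available without recovery, and regular tables created on either instance are both recovered and kept.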
Furthermore, in a typical deployment of an SMP system, temporary databases are created on RAM-Disk to achieve even higher performance. A RAM-Disk is a virtual solid state disk that uses a segment of active computer memory, RAM, as secondary storage, a role typically fulfilled by hard drives. Access times are greatly improved, because RAM is approximately a hundred times faster than hard drives. However, the volatility of RAM means that data will be lost if power is lost, e.g., when the computer is turned off. By definition, all the disks in SDC are shared disks, i.e., they can be accessed by all the processors that form the cluster. Because a RAM-Disk is not shared storage, there is currently no way to use one in SDC. Customers nevertheless want the ability to use RAM-Disk in SDC in order to obtain the same performance benefits as in an SMP system.
Because of the advantages offered by SDC systems, there is much interest in improving performance by solving these problems, including providing a way for a RAM-Disk to be used in SDC.