1. Field of the Invention
This invention relates to enabling a client system that is networked into a sysplex environment via a network such as TCP/IP to locate a specific server within the sysplex environment, and more specifically, to enabling a client system to complete a two-phase commit process with the same database management system (DBMS) after that DBMS has moved to a different network address before a client transaction was completed.
2. Description of the Related Art
The term "sysplex" is used generally herein to describe a group of computer systems that has parallel processing capability. More specifically, the term "sysplex" is used herein to describe a group of computer systems that make up a parallel database management system (DBMS). Most DBMSs on the market today use some form of parallelism to address high-volume transaction workloads.
FIG. 1 illustrates a sysplex environment 100 of three computers 101, 102, 103 sharing disk space, such as a pool 110 of disk drives 111-114, where the database resides. FIG. 1 is illustrative of systems having a "shared-disk" architecture, i.e., where multiple computer systems in the sysplex share a common pool of disk devices. Other systems have a "share-nothing" architecture, where each of the computers in the sysplex owns a subset of the data managed by the parallel DBMS sysplex. In either architecture, each system 101, 102, 103 has its own physical copy of a database management system product 121, 122, 123. Also, in both architectures, each system 101, 102, 103 has a separate log dataset 151, 152, 153, respectively, for managing the commit or rollback of a unit of work. Each log dataset can be accessed only by the DBMS that owns it. All of the DBMSs 121-123 know how to communicate with each other, and they know how to manage the pool of data 110 that is common to them. An example of a sysplex environment is an IBM parallel scalable sysplex, such as the sysplex-capable CMOS 390 systems, which have a sysplex timer, a coupling facility, and fiber optic communication links.
A client 131 is connected via a network 135 to the sysplex 100. The client could be another parallel sysplex or a workstation (such as one running an OS/2 or UNIX operating system) or other personal computer. The client 131 views the sysplex 100 as one image.
The client 131 communicates with one member of the sysplex, i.e., a DBMS server, to do work. The client has a log dataset 132, but may not have a database. During a two-phase commit process, as the client does work, the client records information in its log dataset. The DBMS server 121 with which the client is communicating also has a log dataset 151 in which to record the DBMS server's information. The DBMS writes log records to this log dataset describing changes to the status of the client's unit of work. Such information may include the statements that were performed in the unit of work, undo and redo records for the rows that were changed, the outcome of the work, i.e., committed or rolled back, etc. Only one member of the DBMS sysplex has read/write access to the log dataset containing the records for the client's unit of work.
Problems arise when client systems establish a connection to a server sysplex, such as a DBMS server sysplex, using TCP/IP, especially when a two-phase commit procedure is required. For a network 135 such as TCP/IP, the network routing is accomplished with two values, the IP address and the TCP/IP port number, which together form the socket address. The IP address identifies the hardware network adapter that is used to connect the DBMS server to the network. This may be a channel address or a 3172 control unit that a token ring is plugged into. When a DBMS product moves from one system to another, or from one control unit to another within the same system, its IP address changes. This invalidates the network routing information that the client had previously used.
The port number identifies a server product, such as a DBMS. TCP/IP routes messages to each DBMS server using the TCP/IP port number of the DBMS server. Generally, TCP/IP servers are configured so that all instances of a given server have the same TCP/IP port number. This port number is usually called a "well-known" port. For example, "446" is a well-known port. All RDBMSs that adhere to the Distributed Relational Database Architecture (DRDA) will try to use this port. It is a predefined port for SQL databases. (File transfer programs and other standard TCP/IP applications have their own predefined ports.) If multiple members of the DBMS sysplex are restarted on a single computer system, only one member can own the well-known port at any point in time. Clients are then unable to connect to the other DBMS sysplex members on that computer system using the well-known port.
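The port ownership constraint described above can be observed with ordinary TCP sockets. The following is a minimal sketch, not part of the described system: it assumes a host with a loopback interface, uses an operating-system-assigned port rather than port 446 (which typically requires elevated privileges), and the function name `second_bind_fails` is hypothetical.

```python
import socket


def second_bind_fails(host="127.0.0.1"):
    """Return True if a second listener cannot claim a port already owned
    by a first listener on the same host and address."""
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        # Port 0 asks the operating system for any free port; this stands in
        # for the well-known port owned by the first DBMS member.
        first.bind((host, 0))
        first.listen(1)
        port = first.getsockname()[1]
        try:
            # A second member on the same machine tries to own the same port.
            second.bind((host, port))
            second.listen(1)
            return False
        except OSError:
            # The address is already in use, so the second member is refused.
            return True
    finally:
        second.close()
        first.close()
```

Calling `second_bind_fails()` is expected to return `True` on common platforms, since no address-reuse socket options are set.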
In order for a parallel sysplex to operate seamlessly as a single system image to the clients, every DBMS server must have the same port number. This assumes that all of the DBMSs that answer to the same port number are equivalent in terms of function. A problem arises because the DBMS servers are not equivalent, and are not interchangeable with each other, when communicating with a client during a two-phase commit procedure (unless the systems have peer recovery capability, which is discussed below). If contact is lost during a communication session, the client must talk to the same DBMS server with which it lost contact, because that DBMS server owns the log dataset that records the status of the in-progress unit of work.
When a communication failure occurs during the two-phase commit process, the client must "resynchronize" with the member of the sysplex that owns the log records associated with the client's unit of work. The resynchronization process allows the client to determine the outcome (success or failure) of the unit of work at the DBMS server. In order to perform resynchronization, the client must re-establish communications with the member of the DBMS sysplex that performed the original unit of work. It may be difficult for the client to connect to the correct member of the sysplex for several reasons. First, the required member of the DBMS sysplex may not be active when the client attempts resynchronization. Second, the required member of the DBMS sysplex might have moved from one computer system to another. This is often done to help balance computer resources, or it can occur when the sysplex recovers from a failure of one of the computers in the sysplex.
Previously, servers (such as a DBMS) could not move to another system. If the server went down, the client just waited for the server to come back up. Now, servers are able to move to another system. This movement is necessary if the machine that a server is running on crashes and another machine is capable of handling the workload of the machine that crashed. Allowing a DBMS to move to another machine enhances workload balancing and data availability. However, when a DBMS member moves to restart on another machine, the IP address of the DBMS member changes. Also, a given machine may have a number of control units connected to it to provide network access to the machine, and each control unit has a different IP address. If a control unit crashes, the DBMS server may still be addressable through another control unit on the same machine, which has a different IP address. Therefore, if a different control unit within the same machine is used, the IP address of the DBMS member also changes. In either case, the client has no knowledge of the new IP address, and therefore cannot continue to communicate with the DBMS that has moved. The movement of a member of the DBMS sysplex to a different computer, or to a different control unit, and the resulting change of the DBMS member's TCP/IP network address, prevent clients from performing resynchronization, since the clients would ordinarily use the member's TCP/IP address to establish network connectivity.
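The failure mode described above can be illustrated with a small in-memory model. This is a sketch under stated assumptions only: the classes `Member` and `Network`, the functions `resynchronize` and `demo`, the member name, and the IP addresses are all hypothetical, and no real network traffic or DBMS log format is involved.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    """A DBMS member that alone owns the log for its units of work."""
    name: str
    log: dict = field(default_factory=dict)  # unit-of-work id -> outcome


class Network:
    """Toy network mapping (ip, port) socket addresses to listening members."""

    def __init__(self):
        self.listeners = {}

    def start(self, member, ip, port=446):
        self.listeners[(ip, port)] = member

    def stop(self, ip, port=446):
        self.listeners.pop((ip, port), None)

    def connect(self, ip, port=446):
        return self.listeners.get((ip, port))


def resynchronize(net, addr, uow_id):
    """Ask whichever member answers at addr for the outcome of uow_id."""
    member = net.connect(*addr)
    if member is None:
        return "unreachable"
    return member.log.get(uow_id, "unknown")


def demo():
    net = Network()
    dbms1 = Member("DBMS1")
    net.start(dbms1, "10.0.0.1")

    # The client caches the socket address it used for the unit of work.
    cached = ("10.0.0.1", 446)
    dbms1.log["uow-1"] = "committed"
    before_move = resynchronize(net, cached, "uow-1")

    # The machine fails and the member restarts behind a different address.
    net.stop("10.0.0.1")
    net.start(dbms1, "10.0.0.2")
    with_stale_addr = resynchronize(net, cached, "uow-1")
    with_new_addr = resynchronize(net, ("10.0.0.2", 446), "uow-1")
    return before_move, with_stale_addr, with_new_addr
```

In this model the log record survives the move, but the client can reach it only if it somehow learns the new address, which is the gap this background section identifies.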
One alternative approach is to support peer recovery for DBMSs in the sysplex environment. A DBMS would route the resynchronization request to the DBMS member which performs the peer recovery for the failed member. However, peer recovery is difficult to implement. There are timing problems that can occur when multiple DBMSs try to access the failed DBMS's log data. A substantial amount of program code is needed to serialize access to that log data. The serialization could become a performance bottleneck.
The above described problem is unique to networks such as TCP/IP, NETBIOS and other networks (herein defined as non-solution networks) that do not provide their own network solution.
A network such as SNA provides its own network solution to the above-stated problem. For example, VTAM LU 6.2 is communication software that allows systems, such as those in a sysplex environment, to communicate with each other. The network management product VTAM LU 6.2 runs in a layer above the DBMS product. With the SNA network protocol managed by VTAM, each DBMS member is uniquely identified via an LU name. The same LU name is used even if a DBMS fails and restarts on a different computer system. When the DBMS moves, the network name moves with it. As such, the network address of the DBMS does not change. Because the LU name is associated with the DBMS and moves with the DBMS from one system to another, it is possible for the client system to use the DBMS LU name for network routing, regardless of which system houses the DBMS.
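The effect of name-based routing can be sketched as a directory that maps a stable name to the member's current location. This is only an illustration of the indirection described above and does not represent the VTAM interface; the class `NameDirectory`, the LU name "LUDB1", and the system names are hypothetical.

```python
class NameDirectory:
    """Toy directory: a stable LU name resolves to the current host system."""

    def __init__(self):
        self._location = {}

    def register(self, lu_name, system):
        # Registering again under the same name replaces the old location,
        # so the name follows the DBMS when it restarts elsewhere.
        self._location[lu_name] = system

    def locate(self, lu_name):
        return self._location[lu_name]


def route_after_move():
    directory = NameDirectory()
    directory.register("LUDB1", "SYS1")  # DBMS starts on the first system
    directory.register("LUDB1", "SYS2")  # DBMS fails and restarts elsewhere
    # The client keeps using the same name and is routed to the new system.
    return directory.locate("LUDB1")
```

Because the client holds only the name, nothing it has cached becomes invalid when the member moves.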
However, not all networks are SNA networks. Therefore, an approach is needed for those networks, such as TCP/IP and NETBIOS, that do not provide a network solution for the above problem. Any such approach should be less difficult to implement than peer recovery, and should not suffer from performance bottleneck problems. Also, it is desirable that such an approach preserve the ability of a client to access the sysplex seamlessly while still being able to resolve the indoubt unit of work with the same DBMS, which may have moved to another IP address.