1. Field of the Invention
This invention relates to new and useful improvements in methods of operating general purpose digital computing systems. More specifically, it relates to a method for utilizing data storage and communication resources in a multiprocessing, distributed data base system by dynamically replicating data under distributed system control.
2. Description of the Prior Art
A multiprocessing general purpose computing system typically includes a plurality of nodes interconnected by a communication network. Each node, in such a system, may include a data processor, a data storage device, and communication ports. A data processor may be executing, in a multiprogramming mode, under control of a plurality of operating system components, in which event the data processor may be considered a plurality of nodes. The data storage device stores a data file, the operating system and its information management components, and user application programs.
Data is information abstracted from some aspect of business important to an enterprise. The challenge is to utilize data storage and communication resources of the system so as to give end users access to the data with an availability, performance, and cost commensurate with their business needs. Access to the data must be controlled to ensure the consistency and integrity of the data. Among additional characteristics of data accesses in a distributed data processing environment are geographic and temporal affinity. The basis for distributed data structures is geographic affinity: accesses to a given data item tend to cluster geographically. A basis for the method for dynamic replication of data is temporal affinity: data items which have been accessed recently may be more likely to be accessed in the near future than data items not recently accessed. The node at which accesses for a given data item tend to cluster is called the affinity node; the affinity node for a given data item may not be known ahead of time, and it may vary with time.
Distributed data technology may be categorized according to four attributes: data location, degree of data sharing, degree to which data base management control is provided network-wide, and type of data access. Data location may be centralized, partitioned, or replicated. Degree of data sharing may be centralized, decentralized, or distributed. Data base management control may be user provided (distributed data) or system provided (distributed data base). Data access may be by transaction shipping, function shipping, or data shipping.
Historically, a centralized approach has been used for managing data base storage and accesses. In this approach, both data management and application processing are centralized. A single data base manager is used, and the teleprocessing network is used to connect users to the central facility. In a variation on the centralized approach, some of the processing is distributed among nodes in a network, but the data is kept centralized.
The advantages of a centralized data base approach are that (1) data base integrity can be ensured by the single data base manager; (2) all application programs can be written to a single application programming interface: application programs need not be aware of data location since all data is stored in one location; (3) many tools are available to solve the problems of administering data in the centralized environment; and (4) a single system is easier to operate, maintain and control.
Some disadvantages to the centralized approach are: (1) communication costs are high for some enterprises: application performance may be degraded due to communication delays; (2) data availability may be poor due to instability in the teleprocessing network or the central system, which may have to be mitigated by backup systems and redundant communication; and (3) the processing capabilities of a single system have already been reached by some enterprises.
Two approaches for distributing data to the nodes of a distributed data system are (1) partitioning and (2) static replication. In the partitioned data approach there is no primary copy of the data base, whereas there may be in static replication.
A partitioned data base approach divides the data base into distinct partitions, and the partitions are spread among the nodes. A given data item resides at only one node location. Each location has a data base manager which manages the data at its location. A data distribution manager takes a data request from an application program and maps it to a local request, if the data is held locally, or to a remote request if the data is held at another location.
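The mapping performed by the data distribution manager can be illustrated with a minimal sketch. The names Node, DataDistributionManager, and partition_of are hypothetical and are not taken from any described system; the sketch assumes only that a partitioning function assigns each data item to exactly one node.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Node:
    """One location; `store` stands in for the data managed by its data base manager."""
    name: str
    store: Dict[str, int] = field(default_factory=dict)


class DataDistributionManager:
    """Maps an application request to a local or a remote request (illustrative only)."""

    def __init__(self, local: Node, nodes: List[Node],
                 partition_of: Callable[[str], str]) -> None:
        self.local = local
        self.nodes = {n.name: n for n in nodes}
        # Assumed partitioning algorithm: returns the name of the single owning node.
        self.partition_of = partition_of

    def read(self, key: str) -> Tuple[str, int]:
        owner = self.partition_of(key)
        kind = "local" if owner == self.local.name else "remote"
        # A real system would ship a remote request over the network;
        # here the owning node's store is simply read directly.
        return kind, self.nodes[owner].store[key]
```

In this sketch the application calls read without naming a node, but the result depends entirely on partition_of matching where the data actually resides, which is the dependency the following paragraphs discuss.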
Good data availability and access performance result in a partitioned distributed data base if the data required is held locally. Furthermore, data base integrity is facilitated since each data item is managed by a single data base manager. These results may be achieved if a good partitioning algorithm exists, is known ahead of time, and is stable.
In a partitioned data base, the system must provide a network-wide scope of recovery for programs which change data at more than one location.
Among the disadvantages of a partitioned data base system are (1) reduced availability and performance result if the partitioning algorithm does not match the data access patterns; (2) the application program may have to be aware of the data location, or at least the data partitioning algorithm, and access the data base differently, depending upon data location; (3) changing the data base partitioning algorithm is very difficult because data location is reflected in the application programs, exits, or declarations at each node; (4) existing data relocation and algorithm changes at each node must be synchronized network-wide, and therefore the partitioning algorithm may not be adjusted as needed to maintain optimum performance and availability; and (5) programs which access data items uniformly across a partitioned data base, or which must access the entire data base, will suffer poor performance and availability.
Static replication techniques for distributing data include those with and without a central node. In the former, the central location stores a primary copy of the data base, and each location has a data base manager and a copy of the data base. In typical uses of static replication, the primary data base is copied and sent to each replica location, or node, where the data then becomes available for local processing. Data modifications made at each replica location are collected for later processing against the primary data base. Between periods of application processing, local modifications are sent to the central location and applied against the primary data base. Because this technique for managing replicated data bases does nothing to prevent multiple updates, the occurrence of such must be detected during primary data base update and resolved manually; otherwise, the application must be restricted so that, somehow, multiple updates do not occur. After the primary data base has been made to conform to replica changes, new copies are sent to the replica locations, and the whole process starts over again.
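The conformation step of this cycle, including detection of multiple updates, can be sketched as follows. The function conform and the shape of replica_logs are assumptions made for illustration; the sketch treats any data item modified at more than one replica location as a multiple update that is left unapplied for manual resolution.

```python
from typing import Dict, List, Tuple


def conform(primary: Dict[str, int],
            replica_logs: Dict[str, Dict[str, int]]
            ) -> Tuple[Dict[str, int], Dict[str, List[str]]]:
    """Apply collected replica modifications to the primary data base.

    replica_logs maps a replica location to the {item: new value} changes
    collected there. Returns the conformed primary copy and the items that
    were updated at more than one location.
    """
    # Record which locations modified each item.
    writers: Dict[str, List[str]] = {}
    for node, log in replica_logs.items():
        for key in log:
            writers.setdefault(key, []).append(node)

    # Multiple updates: the same item changed at two or more locations.
    conflicts = {k: sorted(ns) for k, ns in writers.items() if len(ns) > 1}

    # Apply only the uncontested modifications; conflicting items keep
    # their old primary value until resolved manually.
    new_primary = dict(primary)
    for log in replica_logs.values():
        for key, value in log.items():
            if key not in conflicts:
                new_primary[key] = value
    return new_primary, conflicts
```

After conformation, the returned primary copy would be redistributed to every replica location and the cycle would begin again.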
The main advantage of static replication with a primary copy at a central location is high availability and good response time since all data is locally accessible. However, significant disadvantages exist, among them: (1) because the system does not prevent multiple updates, data base integrity is difficult to ensure, severely restricting the data base processing which is feasible for static replicas; (2) the system does not ensure current data for application accesses requiring such; (3) special operational procedures are required for collecting and applying replica modifications to the primary data base, which can be costly and prone to error: typically, primary data base conformation occurs in the middle of the night, and since this is when problems are most likely to be encountered, key personnel must be available; and (4) the data base may not be available during the conformation procedure: providing a large enough window for conformation is not feasible in many applications, the data transmission bandwidth may be unnecessarily large because updates and replicas are transmitted only during the narrow window between periods of operation, and if one or more of the nodes is incapacitated, then conformation may not be possible in the scheduled window.
Many variations have been described in the literature with respect to the basic techniques of static replication described above. The application can be designed so that multiple updates do not occur, or the replicas can be limited to read accesses only. The application program can collect updates itself for later transmission to the primary location, or this information can be gleaned from data base manager logs. Full replicas or only partial replicas can be formed at the replica locations. The entire replica data base or only changes to data held can be transmitted. Replicas can be continually synchronized by sending modifications made by a transaction to the various nodes and receiving acknowledgments as part of transaction termination processing. Such techniques of synchronization may solve the integrity problems of static replication, but lose much of the performance and availability benefits.
U.S. Pat. No. 4,007,450 by Haibt describes a distributed data control system where each node shares certain of its data sets in common with other nodes, there being no primary copy at a central location, but replicas are continually synchronized. Each node is operative to update any shared data set unless one of the other nodes is also seeking to update, in which event the node with the higher priority prevails. Each node stores in its memory the node location of each shared data set and the updating priority each node has with respect to each respective set of shared data. When a data set is updated at a node, all nodes having a replica are sent the update. As above, such a technique solves the integrity problems of static replication, but loses much of the performance and availability benefits.
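The priority rule described above can be illustrated with a short sketch. It is not the implementation of the cited patent: the function resolve_update, the table layouts, and the convention that a larger number means a higher priority are all assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple


def resolve_update(data_set: str,
                   contenders: List[str],
                   priority: Dict[str, Dict[str, int]],
                   holders: Dict[str, Set[str]]) -> Tuple[str, List[str]]:
    """Pick the node allowed to update a shared data set and list who must be sent the update.

    priority[data_set][node] is the assumed updating priority of a node for
    that data set (larger value prevails); holders[data_set] is the set of
    nodes storing a replica of it.
    """
    # Among the nodes seeking to update, the one with the higher priority prevails.
    winner = max(contenders, key=lambda node: priority[data_set][node])
    # Every other node holding a replica is sent the update.
    recipients = sorted(holders[data_set] - {winner})
    return winner, recipients
```

Because every update is propagated to all replica holders, the cost of each modification grows with the number of replicas, which is consistent with the loss of performance and availability noted above.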
A method for utilizing data storage and communication resources in a distributed data base system is needed which avoids the disadvantages while maintaining the advantages of the central, partitioned, and static replication techniques: (1) high availability and performance for data which is held at the node where it is accessed; (2) high availability and performance for application programs that can accept data which is possibly not the most current; (3) data location transparency, such that application programmers and end users need not be aware of data location or even of a data partitioning algorithm and the data base appears the same as in the single system, or central, approach; (4) automatic migration of data to the locations where it is accessed, without the need for a partitioning algorithm; (5) good performance for programs which access the data base uniformly: a given node can be made to have a copy of every data item in the data base, but it need not be the most current; (6) no special update procedures or windows, with updates instead managed network-wide by the system; (7) data base integrity, with multiple updates prevented and application programs receiving data as current as they require; and (8) tolerance of an unstable teleprocessing network, with the control not requiring that communication with all or any other node be available to access data held locally.