A data model provides the general structure of a database. A data model can be viewed as a collection of conceptual tools for describing data, data relationships, data semantics, and consistency constraints. It is often useful to model data in a hierarchical structure. In a hierarchical model, data and relationships among data are represented by records and links, respectively. Hierarchically modeled data is logically structured as a “tree”, which includes a conglomeration of parent-child relationships between data within the hierarchical structure. LDAP (Lightweight Directory Access Protocol), Microsoft® Windows® registry, and Oracle® Cluster Registry are examples of uses of hierarchically modeled or structured information.
FIG. 1 is a diagram graphically illustrating hierarchically structured, or related, data. Structuring data hierarchically provides some benefits over other data structures. It is easier to explore hierarchically structured data than other data structures, due to the inherent semantic benefits of the hierarchical structure. For example, one can intuitively traverse a hierarchy to locate a specific data item of interest.
Hierarchically organized data is often represented by a key name-value (or name-value) pair. More specifically, each item of information is identified within the hierarchy by a key name consisting of key components. The term “key name” is generally used herein to refer to a specific value within the key domain associated with the hierarchy. For example, a key domain may be network IP addresses and an associated key name may be 255.255.000.000. For another example, a domain may be the collection of URLs associated with the public Internet, and an associated key of the domain may be a specific URL associated with a specific web page.
For example, a file's location in a hierarchical directory may be identified as: C:\My Documents\example.doc, wherein each backslash separates levels of the associated hierarchy. More generally, information may be identified by a key name represented as a character string, such as a.b.d.e, where key component e is a child (i.e., a descendant) of key component d, key component d is a child of key component b, and so on, to the hierarchy root. In some contexts, hierarchically organized information contains name-value pairs. A value is information associated with a name. For example, in the foregoing hierarchical directory path, “My Documents\example.doc” is a name of a name-value pair. The content of this file is the value of the name-value pair.
FIG. 1 illustrates that a hierarchical structure has levels associated with the structure, and thus, the key name. That is, each key name has as many levels as the number of key components in the key name. For example, items x and a are one level from the root, so they are considered at Level 1; items z, c, and d are three levels from the root and are considered at Level 3.
A distributed system may include multiple logical computational units. Each logical computational unit may be referred to as a “member” of the distributed system. Thus, a member may be a network-connected personal computer, workstation, central processing unit (CPU), a type of computing node, or other logical computational unit such as an instance of a program. Members of a distributed system can communicate with each other.
A “group” includes a subset of members within a distributed system. A group can have a minimum of one member and a maximum of all members in a distributed system. In an embodiment, a group may be created or modified to have zero present members.
Multiple members of a distributed system may share a common entity. One such shared entity is a shared storage device or file system. All of the members in the distributed system may directly access the shared storage device or file system (e.g., through a network). All of the members in the distributed system can see a given file within the shared storage device or file system. All of the members within the distributed system may access the contents of the given file. In other words, the data that are contained in the file would be “public” or “global” to all members and not “private” to any particular member.
It is common for one member of a distributed system to be one instance of a particular computer program, and for another member of the distributed system to be another instance of the same computer program. Each member of a distributed system may be a different instance of the computer program. Each instance may be able to communicate with each other instance. The particular computer program may be, for example, a database server. The particular computer program may execute, for example, in the form of multiple concurrent processes.
All instances of the particular computer program may attempt to read or write to the same particular location and/or file (such as a configuration file) in the shared file system. The particular computer program might not have been designed to operate on a shared file system. Therefore, by overwriting a value in the particular file, one instance of the particular program may cause the other instance of the computer program to subsequently read an incorrect value. Such data conflicts constitute a problem.
This problem particularly impacts registries, e.g., shared registries. A registry is a data repository that stores configuration information. A single registry may store configuration information that are stored on a single shared storage subsystem. A registry may organize data according to the hierarchical data model described above. Configuration information for multiple databases may be hierarchically organized beneath a single root.
One example of a multi-instance database server is RAC, which can have multiple instances of the same program running as multiple members of a distributed system. To communicate with each other, each of these instances listen to specific ports. For example, a key like “RAC.instancename” could be created in a shared (between all the instances of RAC) configuration repository such as a registry. Each instance of RAC stores its name as a value of the key “RAC.instancename”. Consequently, every time, after the first time, that an instance of RAC stores its name as a value of this key, the previous value of this key will be overwritten. This results in a data conflict.
Many applications require that the multiple members of a distributed system (such as instances of the same computer program) refer to a piece of data in the global repository by the same name but read or write a value that is specific for each instance.
Computer programs, such as database servers, may be modified to access separate instances of files (or registry entries) in a shared file system (or registry). However, such modification is impossible when source code is unavailable. Even when source code is available, source code modification is time-consuming and expensive. Often times it is not desirable.
These problems may be further exasperated based upon the type/category of data that is being managed and/or accessed. Some example categories of data include private data and shared data. Private data includes data that are only associated or accessible by a designated group of one or more members. Shared data may be accessed by multiple different members or groups of members. Subcategories of shared data include static shared data and dynamic shared data. Static shared data corresponds to shared data that are accessed by multiple different members or groups of members using the same key, but in which it is contemplated that the data will either never change or will change only under sparse circumstances. Therefore, it is likely that once this type of data is set to a particular value, it will not normally change to another value. An example of this type of data is application or system metadata, e.g., node identification numbers. Dynamic shared data corresponds a type of shared data in which it is contemplated that the data may possibly change on a more frequent or regular basis. An example of this type of data is program state data for a running application that is constantly being updated and persisted to disk. It can be a challenging task to implement a mechanism to efficiently manage shared data, particularly when the shared data includes dynamic shared data that may change over time, rather than only static data that may never or only infrequently change in value.
Embodiments of the present invention provide a method, mechanism, and computer usable medium for implementing group dependent keys (GDKs) in a computing system. According to one embodiment, a GDK is visible to all members of a distributed system, but its value(s) and subtree(s) could be different for different groups. Members of each group see the same view of the value and subtree of a GDK. Additional embodiments of the present invention provide a method, mechanism, and computer usable medium for implementing group dependent links (GDLs) in a computing system. According to one embodiment, a data transformation function (DTF) is used to coordinate changes to different versions of dynamic shared data.
Further details of aspects, objects, and advantages of the invention are described below in the detailed description, drawings, and claims.