There is an increasing appreciation for the need for ease of use in managing large networks and databases. This is necessary to permit maximal exploitation of the potential of networks by large and small enterprises in sharing resources, tracking inventory, accounting, and carrying out transactions among the myriad of tasks that are required of a well managed organization. The cost of upkeep and administrative support for electronic databases is a significant factor in the cost-benefit analysis undertaken by potential electronic database users. Furthermore, the possibility of expanding the reach and nature of objects managed by a single enterprise requires efficient management of a very large number of objects in a single database while requiring as little supervision and upkeep as possible.
Use of distributed computer networks to implement databases results in more responsive databases, including local updating and management in multiple master systems. Redundancy built into a distributed implementation results in a more resilient and reliable database.
A database may be thought of as having two essential components, viz., a collection of data objects and a set of procedures/tools/methods for locating a particular one of the data objects. The procedures/tools/methods for locating a particular object are included in a directory service. The directory is a namespace that aids in resolution of an identifier to a corresponding object. Commercial databases typically include storage for the objects and implementations of directory services for navigating the directories.
Application programs, including database management programs, are typically written for execution on a particular platform, which may be a virtual machine. Modern platforms include a plurality of services and appropriate management strategies to allow different applications access to system resources. These services provide many of the functions that applications are expected to use and thus free the application writer from worrying about the more mundane implementation details. Not surprisingly, this is an effective strategy, since having a single coherent implementation reduces complexity and aids the design of stable computing systems capable of executing different applications. An important advance in designing stable computing systems has been the development of platforms using multithreaded systems, which are described further below.
Traditional operating systems (OSs) for personal computers used a single-threaded architecture in which programming code was executed in a serial fashion. A thread is a path of execution within a process that is recognized and provided time on the processor by the OS. Each application usually has at least one thread and, thus, is assigned time in accordance with its relative priority. It should be understood that the term thread refers to code that is provided execution time slices by the OS. This does not prevent the developer of an application from defining a path of execution within the application such that the application itself directly controls the time allocated to a particular path while the OS may be unaware of its existence. For clarity, such developer-defined execution paths are referred to as “fibers” as opposed to threads.
Not surprisingly, any misstep could result in a fatal bottleneck in a single-threaded system. In contrast, in a multithreaded architecture the OS exercises greater control over the execution of different tasks. The OS schedules slices of time on the processor for identified units termed threads.
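The distinction between threads and fibers can be illustrated with a minimal sketch. It is an analogy only and does not represent any particular platform. Python generators stand in for fibers, since the application's own loop decides when each one runs, while `threading.Thread` objects stand in for threads scheduled outside the application's control. The names `fiber`, `run_fibers`, and `thread_task` are hypothetical and were chosen for this illustration.

```python
import threading


def fiber(label, steps, log):
    """A developer-defined execution path: it yields control voluntarily."""
    for i in range(steps):
        log.append(f"{label}:{i}")
        yield  # hand control back to the application's own scheduler


def run_fibers(fibers):
    """Round-robin scheduler written by the application (an assumption of
    this sketch); the OS is unaware of the individual fibers."""
    queue = list(fibers)
    while queue:
        current = queue.pop(0)
        try:
            next(current)          # run the fiber until its next yield
            queue.append(current)  # not finished, so reschedule it
        except StopIteration:
            pass                   # the fiber has run to completion


def thread_task(label, steps, log, lock):
    """A path of execution whose time slices are granted by the scheduler
    outside the application, so the interleaving is not under its control."""
    for i in range(steps):
        with lock:
            log.append(f"{label}:{i}")


# Fibers: the interleaving is fully determined by run_fibers.
fiber_log = []
run_fibers([fiber("a", 2, fiber_log), fiber("b", 2, fiber_log)])

# Threads: the order of entries may differ from run to run.
thread_log = []
lock = threading.Lock()
threads = [
    threading.Thread(target=thread_task, args=(name, 2, thread_log, lock))
    for name in ("a", "b")
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

In this sketch the fiber log is always produced in the same round-robin order, whereas the thread log contains the same entries in an order the application does not choose.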
In addition to using threads, an effective strategy in the management of network resources, which are effectively a database, has been to automate many tasks. Many of these administrative tasks are provided by the operating system if many applications are likely to benefit from a common implementation. However, many administrative tasks address machine failures, user errors or bugs in the software. Not being routine, such tasks are difficult to automate and often require manual intervention. There is a need for stable management systems that perform well over ever longer periods of time. A database management system may be roughly understood to be a combination of database applications and the relevant operating system services.
Operation of computer software over a long period of time often results in perceptible performance degradation as a result of an accumulation of smaller defects or “bugs.” Each of the small defects, in isolation, does not cause a noticeable reduction in system performance. Examples of such errors include resource leakage due to failure to recover all of the resources allocated in the course of carrying out the various tasks. In many embodiments it may even be undesirable to track each small error, and instead it may be preferable to correct the error when it is necessary to correct a degradation in system performance or when it is relatively less expensive to make the correction.
The development of databases for managing very large collections of information presents novel problems. An important development in database design has been the use of object-oriented technology, resulting in the representation of a database as a collection of objects related by inheritance. For convenience, an object may be considered to be a collection of attributes and methods, which are also collectively termed properties of an object. An object may contain other objects and may be related to other objects by inheritance. One of the attributes that all objects in a database may be expected to have is a name.
An advantageous naming strategy treats the database as a namespace organized as a tree. This naming strategy assigns a distinguished name (“DN”) to each object in a database. The distinguished name is a complete description of the position of the object in the database. In addition, a relative distinguished name (“RDN”) may be defined that provides a path to access an object in a database from a particular node in the tree structure. Thus, the DN is a name relative to the root node in a tree-like database. Furthermore, in object-oriented databases the DN of an object may be found by searching on its attributes, e.g., by conducting a tree traversal.
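The naming strategy above can be sketched in a few lines, under stated assumptions. Names are represented here as root-first lists of path components, which is a convenience of this sketch and not the syntax of any particular directory product. The class `Node` and the functions `resolve` and `find_dns` are hypothetical names introduced for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """One object in the namespace tree, named relative to its parent."""
    name: str
    attributes: dict[str, str] = field(default_factory=dict)
    children: dict[str, Node] = field(default_factory=dict)

    def add(self, child: Node) -> Node:
        """Attach a child object and return it for convenient chaining."""
        self.children[child.name] = child
        return child


def resolve(start: Node, path: list[str]) -> Node | None:
    """Follow a path downward from `start`. Starting at the root gives
    DN-style resolution; starting elsewhere gives a relative lookup."""
    node = start
    for name in path:
        node = node.children.get(name)
        if node is None:
            return None
    return node


def find_dns(root: Node, key: str, value: str) -> list[list[str]]:
    """Tree traversal returning the root-relative path of every object
    whose attribute `key` equals `value`."""
    matches = []
    stack = [(root, [])]
    while stack:
        node, path = stack.pop()
        if node.attributes.get(key) == value:
            matches.append(path)
        for child in node.children.values():
            stack.append((child, path + [child.name]))
    return matches


# Build a tiny namespace: root -> sales -> printer1
root = Node("root")
sales = root.add(Node("sales"))
printer = sales.add(Node("printer1", {"type": "printer"}))

by_dn = resolve(root, ["sales", "printer1"])   # full path from the root
by_rdn = resolve(sales, ["printer1"])          # path relative to "sales"
printer_dns = find_dns(root, "type", "printer")
```

Both lookups reach the same object. The attribute search has to visit the whole tree, which is the cost that a locally searchable directory of published resources is meant to avoid.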
“WINDOWS®” brand operating systems manufactured by the “MICROSOFT®” corporation of Redmond, Wash., use such a namespace organized into a tree structure. Advantageously, a security boundary enclosing a part or the whole of the tree defines a domain such that users in a domain need only log in at one node in order to have access to the entire domain, which may include many different networked computers. Several such domains may be related by trust relationships to form a tree of domains. Several non-contiguous trees constitute a forest, and it is even possible to imagine a collection of forests.
Finding resources, i.e., objects, easily is an important consideration in the design and management of networks. These resources may be printers, scanners, keyboards, workstations, data, applications or even other users. However, it is unlikely that every resource will be sought to the same extent. Some resources may be unavailable due to security or other concerns, and thus locating them successfully may be of limited value in any event. “Publishing” resources intended to be discovered by users results in increased efficiency, since a published resource is available at suitable domain controllers as part of a directory and is cross-referenced against its DN. This allows a localized search at the domain controller to discover the DN of a published resource, as opposed to a tree traversal spread out over the network. In a domain there may be several domain controllers that replicate changes in the directory to make them available at each controller. In a preferred embodiment, the domain controllers are related by peer relationships, and introducing a change at any one domain controller results in the propagation of the change to every other domain controller in the domain. Searches for the DN of an object of interest are thus possible at an easily accessible domain controller and are therefore faster.
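The publishing and replication behavior described above can be sketched as follows. This is a deliberately simplified illustration: propagation is synchronous, conflicting updates and network failures are ignored, and DNs are plain strings. The class `DomainController` and the helper `make_domain` are hypothetical names and do not correspond to any real API.

```python
class DomainController:
    """Hypothetical peer controller holding a copy of the directory of
    published resources."""

    def __init__(self, name):
        self.name = name
        self.directory = {}  # published resource name -> DN
        self.peers = []      # every other controller in the same domain

    def publish(self, resource, dn):
        """Record a change locally, then propagate it to every peer."""
        self.directory[resource] = dn
        for peer in self.peers:
            peer.directory[resource] = dn

    def lookup(self, resource):
        """Localized search: consult only this controller's directory."""
        return self.directory.get(resource)


def make_domain(names):
    """Create controllers related to one another as peers."""
    controllers = [DomainController(n) for n in names]
    for controller in controllers:
        controller.peers = [p for p in controllers if p is not controller]
    return controllers


dc1, dc2, dc3 = make_domain(["dc1", "dc2", "dc3"])

# A change introduced at any one controller becomes visible at all of them.
dc2.publish("printer1", "root/sales/printer1")
found_at_dc1 = dc1.lookup("printer1")
found_at_dc3 = dc3.lookup("printer1")
```

Because every controller holds the same directory, a user can query whichever controller is most accessible and obtain the DN without a traversal of the network.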
In a large network the number of published resources can grow to include millions of objects and adversely affect the management of the namespace. Thus, it is of interest to remove unusable objects, i.e., orphaned objects, from the directory to ensure that the directory can remain stable over as long a period of time as possible without incurring a large overhead, resource leaks or compromised access to resources.
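A minimal sketch of such a cleanup is given below. The text does not specify how an orphaned object is detected, so this sketch assumes that an entry is orphaned when the object its DN refers to can no longer be found. That test is supplied by the caller as the hypothetical predicate `still_exists`, and `prune_orphans` is likewise a name chosen only for illustration.

```python
def prune_orphans(directory, still_exists):
    """Remove directory entries whose DN no longer refers to a usable object.

    `directory` maps a published resource name to its DN.
    `still_exists` is a caller-supplied predicate (an assumption of this
    sketch) that reports whether the object behind a DN is still present.
    Returns the names that were removed.
    """
    # Collect the orphans first so the dictionary is not modified
    # while it is being iterated.
    orphaned = [name for name, dn in directory.items() if not still_exists(dn)]
    for name in orphaned:
        del directory[name]
    return orphaned


directory = {
    "printer1": "root/sales/printer1",
    "scanner2": "root/sales/scanner2",
    "plotter3": "root/lab/plotter3",
}
# Objects that are actually still present in the namespace.
live_dns = {"root/sales/printer1", "root/lab/plotter3"}

removed = prune_orphans(directory, lambda dn: dn in live_dns)
```

After the call, the directory holds only entries that still resolve, so its size tracks the number of usable published resources and not the number ever published.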