This invention relates to distributed computing systems and more particularly to a system and method for application execution of updates at a distributed database.
Distributed data processing networks having thousands of nodes, or endpoints, are known in the prior art. The nodes can be geographically dispersed and the computing environment managed in a distributed manner, with a plurality of computing locations running distributed kernel services (DKS). The managed environment can be logically separated into a series of loosely connected managed regions in which each region has its own management server for managing local resources. Under such an arrangement, the management servers coordinate activities across the network and permit remote site management and operation. Further, local resources within one region can be exported for the use of other regions in a variety of manners.
FIG. 1 provides a schematic illustration of a network for implementing the present invention. A network has many endpoints, with an endpoint being defined, for example, as one Network Interface Card (NIC) with one MAC address and one IP address. Among the plurality of servers, 101a-101n as illustrated, at least one of the servers, 101a in FIG. 1, which has distributed kernel services (DKS), may be designated as a control server. Each server in the network is a multi-threaded runtime process that optimally includes an object request broker (ORB) which runs continuously, separate from the operating system, and which communicates with both server and client processes via an interprocess communication facility. The system management framework, or distributed kernel services (DKS), includes a client component supported on each of the endpoint machines. The client component is a low cost, low maintenance application suite that is preferably "dataless" in the sense that system management data is not cached or stored there in a persistent manner. It should be noted, however, that an endpoint may also have an ORB for remote object-oriented operations within the distributed environment.
Realistically, distributed networks can comprise millions of machines (each of which may have a plurality of endpoints) which are managed by thousands of control machines. The so-called control machines run Internet Protocol (IP) Driver Discovery/Monitor Scanners which poll the endpoints and gather and store status data, which is then made available to other machines and applications. A detailed discussion of distributed network services can be found in co-pending patent application, Ser. No. 09/738,307, filed Dec. 15, 2000, entitled "METHOD AND SYSTEM FOR MANAGEMENT OF RESOURCE LEASES IN AN APPLICATION FRAMEWORK SYSTEM", the teachings of which are herein incorporated by reference. Data storage and data sharing in a large-scale distributed network require that multiple applications have access to data which may be stored remotely and which may be replicated at multiple locations throughout the network. Furthermore, it is typical that more than one application will have write access to that data. Synchronizing the data to assure that replicated data is consistent is particularly challenging in the highly distributed environment.
Under the prior art, when writing to two databases, DB1 and DB2, in an effort to synchronize the data (i.e., performing a distributed transaction), the process would be as follows: the writer of the data would first format a prepared statement using the DB1 schema, populate the statement with data using the DB1 schema, and then write the data to DB1; next, a prepared statement would be formatted using the DB2 schema, the statement would be populated with DB2 data, and the data would then be written to DB2; finally, the system would wait for DB1 and DB2 to return "OK" messages, followed by committing DB1 and then committing DB2. Assuming that drivers at each machine had been modified to implement the foregoing, and that a third "party" was available to watch and review the transaction from start to finish, the process flow could be executed; however, when a large number of transactions are queued per API, the possibility of failure will greatly slow the network.
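By way of illustration only, the prior art sequence described above may be sketched as follows. The sketch assumes two in-memory SQLite databases standing in for DB1 and DB2; the table names, column names, and the functions make_databases and write_both are hypothetical and are not taken from any particular system. A failure of either write forces both pending writes to be rolled back, which is the cost that grows with the number of queued transactions.

```python
import sqlite3


def make_databases():
    """Create two in-memory databases with different (hypothetical) schemas."""
    db1 = sqlite3.connect(":memory:")
    db2 = sqlite3.connect(":memory:")
    db1.execute("CREATE TABLE endpoint_status (endpoint_id TEXT, status TEXT)")
    db2.execute("CREATE TABLE ep (id TEXT PRIMARY KEY, state TEXT)")
    return db1, db2


def write_both(db1, db2, endpoint_id, status):
    """Write the same data to DB1 and DB2, committing only if both writes succeed."""
    try:
        # Format and populate a statement using the DB1 schema, then write to DB1.
        db1.execute(
            "INSERT INTO endpoint_status (endpoint_id, status) VALUES (?, ?)",
            (endpoint_id, status),
        )
        # Format and populate a statement using the DB2 schema, then write to DB2.
        db2.execute(
            "INSERT INTO ep (id, state) VALUES (?, ?)",
            (endpoint_id, status),
        )
    except sqlite3.Error:
        # One database did not return "OK": undo the pending writes on both.
        db1.rollback()
        db2.rollback()
        return False
    # Both databases returned "OK": commit DB1, then commit DB2.
    db1.commit()
    db2.commit()
    return True


db1, db2 = make_databases()
write_both(db1, db2, "ep-1", "up")
```

Even in this simplified form there remains a window between the two commits in which DB1 is committed and DB2 is not, which is why the description above presumes a third party watching the transaction from start to finish.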
Alternative solutions include having a master copy of the data, to which a local copy is synchronized. However, the foregoing requires more storage, and necessitates that a master database control the information, which is antithetical to the peer relationship needed for a fast, efficient highly distributed environment.
In a hierarchically-arranged network, transactions may be communicated "down the line" whereby an update can be implemented as a configuration change since the object request broker (ORB) for each machine will inherit the update from the ORB above in the hierarchy. However, such a method cannot be scaled to a loosely-coupled distributed system having thousands of machines wherein an application on any one of the machines has write access to data at another node. In addition, typical prior art update approaches utilize locking to block all other read and write access to data while an update transaction is executing. The prospect of locking out thousands of machines is not realistic in the distributed environment.
It is desirable, therefore, and an object of the present invention, to provide a system and method whereby an application can execute a database update without blocking access to the database.
It is another object of the present invention to provide a system and method whereby an application defines the granularity of access to a node and/or data stored at that node.
It is a further object of the present invention to provide a system and method whereby an application can implement the access and update status at the configuration system layer.
Still another object of the present invention is to provide a system and method whereby database update status can be communicated to another application seeking access to the database.
Yet another object of the present invention is to provide a system and method whereby application updates to data can be communicated on a per node or per data item basis.
It is also an object of the present invention to provide a system and method whereby a user can view a display of the update status of data in a distributed network.
The foregoing and other objectives are realized by the present invention which provides a system and method for implementing distributed transactions using configuration data that is available to all applications which may wish to access the data. Added to the configuration data is at least one status indicator to allow applications to ascertain the status of updates without performing a database-specific distributed transaction. An application which is preparing to write/update stored information must first change the at least one status indicator associated with the underlying storage. Thereafter, any other application which has a need to read or write the stored information will readily ascertain the status of the stored information from the configuration data. The other application which has need of the data may choose to read the old and/or partially updated data or may wait until the update has been completed and the at least one status indicator has been changed to indicate completion of the update. Status indicators may be associated with entire nodes at which data is stored or with specific pieces (e.g., keys) of the stored data. Furthermore, the status indicator of the configuration data can be displayed to a user.
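By way of illustration only, the status indicator mechanism summarized above may be sketched as follows. The sketch assumes a single in-process object standing in for the shared configuration data; the class ConfigData, the function updating, and the indicator values UPDATING and COMPLETE are hypothetical names chosen for the example and do not describe an actual DKS interface. A node-level indicator is modeled by a key of None.

```python
import threading
from contextlib import contextmanager

UPDATING = "updating"   # hypothetical indicator values
COMPLETE = "complete"


class ConfigData:
    """Stand-in for shared configuration data holding update status indicators."""

    def __init__(self):
        self._status = {}                     # (node, key) -> indicator
        self._cond = threading.Condition()    # uses a re-entrant lock by default

    def set_status(self, node, key, value):
        with self._cond:
            self._status[(node, key)] = value
            self._cond.notify_all()

    def get_status(self, node, key=None):
        with self._cond:
            # A node-level indicator covers every key stored at that node.
            if self._status.get((node, None)) == UPDATING:
                return UPDATING
            return self._status.get((node, key), COMPLETE)

    def wait_until_complete(self, node, key=None, timeout=None):
        """Block until no update is in progress; returns False on timeout."""
        with self._cond:
            return self._cond.wait_for(
                lambda: self.get_status(node, key) == COMPLETE, timeout
            )


@contextmanager
def updating(config, node, key=None):
    """Change the indicator before writing and restore it when the write ends."""
    config.set_status(node, key, UPDATING)
    try:
        yield
    finally:
        # Simplification: a fuller design might record a failed update separately.
        config.set_status(node, key, COMPLETE)


config = ConfigData()
store = {"node-a": {"k1": "old"}}
with updating(config, "node-a", "k1"):
    store["node-a"]["k1"] = "new"   # the write itself takes no database lock
```

A reading application would call get_status and then either read the possibly stale value at once or call wait_until_complete, so the choice of whether to wait rests with the reader rather than with a lock held by the writer.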