In distributed computing systems, multiple processes on different sites (e.g., PDA, personal computer, and mainframe), which may be geographically dispersed around the globe, often access various resources (e.g., memory and network) and cooperate to achieve a specific goal. One example is data replication in a large distributed database management system. Replication is a process of sharing database objects and/or data between multiple databases in distributed computing systems. To maintain replicated database objects at multiple databases, a change to a database object and/or data at one database is shared with the other databases. In this way, database objects are kept synchronized in all databases involved in the replication.
In a prior art distributed database management system, a database where a change originates is called a source database 120, which is maintained by a source computer 100, and a database where a change is replicated is called a destination database 122, which is maintained by a destination computer 102. Multiple processes in the distributed database management system cooperate with one another in a pipeline fashion to ensure that data of an enterprise or corporation is properly replicated in real time from one site to another site or from one computer to many different computers. At a source database 120, a capture process in source computer 100 captures corporate data of interest and manipulates the captured data in memory 110 before sending it (e.g., as a logical change record, or LCR) into a network. For more information on LCRs, please see Oracle® Streams Replication Administrator's Guide, 10g Release 1 (10.1), Part Number B10728-01 by Oracle® Corporation of Redwood Shores, Calif., and this document is hereby incorporated by reference herein in its entirety.
At a destination database 122, processes in a destination computer 102 receive the corporate data (e.g., LCRs) from the network, transform the data in memory 112 into a user-requested form, and then save the data to disk. In the configuration illustrated in FIG. 1A, an intermediate computer 104 is located in the network, between source computer 100 and destination computer 102. In this configuration, the changes passing through intermediate computer 104 are not persisted to database 124 therein because database 124 does not replicate database 120. However, intermediate computer 104 has one or more processes, such as PR2 and PS2 shown in FIG. 1B, that cooperate with processes of the source and destination databases, and with various resources such as memory and network, to form a distributed computing system. Also, as shown by Database2 and Database3 in FIG. 1B, there may be multiple intermediate databases, and as shown by process App3, an intermediate database Database3 may replicate the source database Database1.
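The pipeline arrangement described above may be illustrated by the following minimal sketch, which is offered only as an aid to understanding and does not represent the implementation of any prior art system. All names in it (ChangeRecord, Node, capture, and the replicates flag) are hypothetical and do not appear in the cited documents. The sketch assumes that each computer either applies a received change to its local database, forwards it downstream, or both, so that an intermediate computer which does not replicate the source forwards changes without persisting them.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class ChangeRecord:
    """Hypothetical stand-in for a logical change record (LCR)."""
    table: str
    key: int
    value: str


@dataclass
class Node:
    """One computer in the pipeline (hypothetical model).

    `replicates` is an assumed flag: True if this node's database
    replicates the source, False if the node only relays changes.
    """
    name: str
    replicates: bool
    downstream: Optional["Node"] = None
    database: Dict[Tuple[str, int], str] = field(default_factory=dict)
    forwarded: int = 0

    def receive(self, record: ChangeRecord) -> None:
        if self.replicates:
            # Apply the change to the local database.
            self.database[(record.table, record.key)] = record.value
        if self.downstream is not None:
            # Relay the change to the next computer in the pipeline.
            self.forwarded += 1
            self.downstream.receive(record)


def capture(changes: Dict[Tuple[str, int], str], first_hop: Node) -> int:
    """Turn source changes into records and send them to the first hop.

    Returns the number of records sent.
    """
    sent = 0
    for (table, key), value in changes.items():
        first_hop.receive(ChangeRecord(table, key, value))
        sent += 1
    return sent
```

In this sketch, a node constructed with replicates set to False and a downstream node corresponds to intermediate computer 104 of FIG. 1A, while a node with replicates set to True and no downstream node corresponds to destination computer 102. A node with replicates set to True and a downstream node would correspond to an intermediate database that also replicates the source, such as Database3 of FIG. 1B.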
Various examples of distributed database management systems are described in the following patents each of which is incorporated by reference herein in its entirety as background: U.S. Pat. No. 7,031,974 by Mahesh Subramaniam entitled “Replicating DDL Changes Using Streams” and U.S. Pat. No. 6,889,231 by Benny Souder et al. entitled “Asynchronous Information Sharing System.” See also the following article which is incorporated by reference herein in its entirety, entitled “Oracle® Streams for Near Real Time Asynchronous Replication” by Nimar S. Arora, Proc. VLDB Ws. Design, Implementation, and Deployment of Database Replication, 2005.
It is challenging and time-consuming to manually collect and analyze performance data from multiple processes in a distributed database management system. For example, to diagnose a performance problem in systems of the type shown in FIGS. 1A and 1B, a user may operate prior art tools provided by the database management system to query and review statistics that are available for each database individually. For examples of the types of tools available for a single database, see Oracle® Database Performance Tuning Guide, 10g Release 2 (10gR2), Part Number B14211-03, published March 2008 by Oracle® Corporation of Redwood Shores, Calif., and this document is hereby incorporated by reference herein in its entirety. Specifically, in operating such a tool, a user typically runs a SQL script in an Oracle® database to generate a report that displays statistical information related to the active session history (ASH) of selected sessions during a specified duration in that database. Alternatively, the user may view the same information, for a given database, through a graphical user interface provided by the Oracle Enterprise Manager.
Prior art tools for collecting statistics in distributed database management systems that are known to the current inventors are designed for a single database. A user typically operates the tool in each individual database, to query statistics about that particular database. The user may manually analyze such single-database statistics to identify a performance problem. However, the inventors of the current patent application note that manual analysis is labor intensive, error prone, and does not accurately account for system-level issues that arise from interactions between multiple databases. The current inventors further note that manual analysis is especially time consuming in real life systems involving tens of source and destination computers interconnected by a communications network, which can be a wide area network (WAN) that connects computers in different cities. Accordingly, the current inventors believe that an automated tool is needed to perform statistics collection and analysis, for use in identifying performance problems in a distributed database management system.