1. Field of the Invention
The present invention relates generally to data processing environments and, more particularly, to a system providing methodology for data replication tracing.
2. Background Art
Computers are very powerful tools for storing and providing access to vast amounts of information. Computer databases are a common mechanism for storing information on computer systems while providing easy access to users. A typical database is an organized collection of related information stored as “records” having “fields” of information. As an example, a database of employees may have a record for each employee where each record contains fields designating specifics about the employee, such as name, home address, salary, and the like.
Between the actual physical database itself (i.e., the data actually stored on a storage device) and the users of the system, a database management system or DBMS is typically provided as a software cushion or layer. In essence, the DBMS shields the database user from knowing or even caring about the underlying hardware-level details. Typically, all requests from users for access to the data are processed by the DBMS. For example, information may be added or removed from data files, information retrieved from or updated in such files, and so forth, all without user knowledge of the underlying system implementation. In this manner, the DBMS provides users with a conceptual view of the database that is removed from the hardware level. The general construction and operation of database management systems is well known in the art. See e.g., Date, C., “An Introduction to Database Systems, Seventh Edition”, Addison Wesley, 2000.
Increasingly, businesses run mission-critical systems which store information on database management systems. Each day more and more users base their business operations on mission-critical systems which store information on server-based database systems, such as Sybase® Adaptive Server® Enterprise (ASE) (available from Sybase, Inc. of Dublin, Calif.). As a result, the operations of the business are dependent upon the availability of data stored in their databases. Because of the mission-critical nature of these systems, users of these systems need to protect themselves against loss of the data due to software or hardware problems, disasters such as floods, earthquakes, or electrical power loss, or temporary unavailability of systems resulting from the need to perform system maintenance.
One well-known approach that is used to guard against loss of critical business data maintained in a given database (the “primary database”) is to maintain one or more standby or replicate databases. A replicate database is a duplicate or mirror copy of the primary database (or a subset of the primary database) that is maintained either locally at the same site as the primary database, or remotely at a different location than the primary database. The availability of a replicate copy of the primary database enables a user (e.g., a corporation or other business) to work with a copy of the database in the event of the loss, destruction, or unavailability of the primary database.
Replicate database(s) are also used to facilitate access and use of data maintained in the primary database (e.g., for decision support and other such purposes). For instance, a primary database may support a sales application and contain information regarding a company's sales transactions with its customers. The company may replicate data from the primary database to one or more replicate databases to enable users to analyze and use this data for other purposes (e.g., decision support purposes) without interfering with or increasing the workload on the primary database. The data that is replicated (or copied) to a replicate database may include all of the data of the primary database such that the replicate database is a minor image of the primary database. Alternatively, only a subset of the data may be replicated to a given replicate database (e.g., because only a subset of the data is of interest in a particular application).
In recent years, the use of replication technologies has been increasing as users have discovered new ways of using copies of all sorts of data. Various different types of systems, ranging from electronic mail systems and document management systems to data warehouse and decision support systems, rely on replication technologies for providing broader access to data. Over the years, database replication technologies have also become available in vendor products ranging from simple desktop replication (e.g., between two personal computers) to high-capacity, multi-site backup systems.
Knowledge of how data flows through an enterprise is very important, yet increasingly difficult in today's complex environments, particularly as these modern enterprises often contain data being replicated to and from hundreds of servers and databases with thousands of tables and stored procedures. The size and complexity of these environments, plus an ever-changing workforce, can result in situations where the details of data flow through the environment are not immediately known. These details can be difficult and time consuming to track down by current methods of examining the configuration of the replication environment, thus making the maintenance of a comprehensive knowledge of the enterprise's data flow next to impossible.
Accordingly, a need exists for a manner of tracing replication to provide a clear and accurate portrait of how data is flowing through an enterprise's systems. The present invention addresses such a need.