1. Field
Embodiments of the invention relate to correlating queries issued by applications with their source lines and analyzing applications for problem determination and where used analysis.
2. Description of the Related Art
Relational DataBase Management System (RDBMS) software may use a Structured Query Language (SQL) interface. The SQL interface has evolved into a standard language for RDBMS software and has been adopted as such by both the American National Standards Institute (ANSI) and the International Standards Organization (ISO).
An RDBMS uses relational techniques for storing and retrieving data in a relational database. Relational databases are computerized information storage and retrieval systems. Relational databases are organized into tables that consist of rows and columns of data. The rows may be called tuples or records or rows. A database typically has many tables, and each table typically has multiple rows and multiple columns.
Database applications (also referred to herein as “client applications”) in an enterprise implement business logic and interact with data stored in databases. Up to now, how the database applications interact with the databases remains in the hands of a database application developer responsible for coding the database applications (i.e., to develop database source code). In a more rigorous environment, there are models that describe both the data and the database applications. A logical model may be produced that describes the data as the business sees the data, while a physical model may be produced that describes the data as stored. Further, an application model may be produced that documents the interaction between the database application and either the logical model or the physical model. These models serve to describe how the database application interacts with the data used by the business. Developers tasked with maintaining the database applications rely on these models to understand how the database applications are impacted as the database changes. Database administrators (DBAs) also rely on these models to optimize the database based on how the data is being used. More often than not, such application models are either out of date or incomplete. This makes the task of maintaining database applications difficult.
If a change to a database is required for some database applications, such as altering some of the database objects (e.g., tables and columns) or database schemas or adding new database objects or database schemas within the database, it becomes difficult to determine which database applications are affected and to determine the cost of modifying the database applications to use the changed or new database objects or database schemas. Such roadblocks often lead to database tables that reflect the need to minimize the impact to existing database applications rather than to reflect the needs of the business. Such databases become difficult to maintain and understand as the business needs evolve.
Thus, there is a need for a better way of gathering information about running database applications to make it easier for developers to understand how the database applications make use of the database and the extent of changes to be made to the database applications to use a different database schema.
Today, database application developers and DBAs face numerous pain points when trying to isolate poorly performing queries (e.g., SQL statements) or trying to determine the queries being issued against the database for audit purposes. Finding and making the correlation between the queries and the related JAVA® source code (of a JAVA® application, which may also be referred to as a JAVA® database application) is tedious and time-consuming (JAVA is a trademark of Sun Microsystems in the United States, other countries, or both). Often the way to understand how the database application accesses the database is to gather all the queries issued by the database application. It is especially burdensome when DBAs see problematic queries issued against the database and have to get help to find the database application that issued the problematic queries.
Correlating queries executed on the database to the actual lines of code triggering the queries includes gathering and wading through stack traces from database drivers and different data access components accessed by the database application. The process is repeated every time any problem occurs in the database application. The ability to correlate depends on the underlying components to provide appropriate stack traces and is a continuous burden on developers to add stack traces and keep them correct.
The DBA is also limited in identifying what JAVA® classes were issuing the queries due to the limited information found in the stack traces. Because the developers choose JAVA Database Connectivity (JDBC®) or a JDBC®-based framework, the DBA has limited tools to help the developer know what database applications the queries are coming from (JDBC is a trademark of Sun Microsystems in the United States, other countries, or both).
The correlation gets more complex with three-tier architectures and when frameworks are used. A three tier architecture may be described as further refining a client-server architecture into three separate layers: presentation, business logic, and data storage. The three tier architecture is different from a traditional two tier model in which the business logic and presentation layers are combined into a client layer. Applications using frameworks, such as a HIBERNATE® framework (which is an Object Relational Mapping (ORM) framework for JAVA® applications) or a JAVA® Persistence API (JPA) framework, generate queries on the fly, and it is difficult for the developer to trace back a particular query (or set of queries) to the query language of the framework that generated the query, even when the JAVA® source code is available (HIBERNATE is a trademark of Red Hat, Inc. Software Foundation in the United States, other countries, or both). When the JAVA® source code is not available, it is even more difficult. Therefore, if an end user, developer, or DBA complains about a poorly performing query, it may be a large effort to try and locate that query in the originating JAVA® source code.
In addition, there is no easy way to gain insight into which database objects were referenced by which parts of a JAVA® application. Team members working on different parts of the database application do not have a way to gain insight into what queries the other parts of the database application would be issuing to the database. Developers do not have information about all the queries issued by a certain JAVA® class. In addition, on the database side, schemas are continuously changing as part of the database application development process. The inability to gain insight into how much the change would impact the database application makes such changes risky. Developers and DBAs cannot easily work together to understand the impact of such changes. Because of this, the complicated process of determining the impact of a change slows down development, resulting in delays for delivering a final product, or perhaps even resulting in the decision not to make changes because of such delays.
Thus, there is need for understanding the relationship between queries and their source code for both DBAs and developers alike.