The present invention relates to problem discovery and, more particularly, to problem discovery and capacity planning of database applications.
As computer systems become more complex, they become more difficult to manage efficiently, and the problems that occur within them become more difficult to isolate. One form of complex computer system on which society increasingly relies is the database system. Conventional database systems consist of one or more clients (“database applications”) and a server (a “database server”). When a client requires data, the client submits a query to the server, where the query includes criteria for selecting the data. The server retrieves the data that satisfies the specified criteria from a database and returns copies of the selected data to the client that submitted the query. More complex database systems may include numerous servers that share access to one or more databases, where each such server may be serving thousands of clients.
Another type of complex computer system is known as an application server system. An application server system typically consists of one or more clients (“browsers”), a server (“application server”), and applications (“cartridges”). Users of the browsers execute the cartridges by causing the browsers to send and receive messages through the application server to the cartridges. FIG. 1 is a block diagram of an exemplary application server system 100. The system 100 includes a plurality of browsers 102, 104 and 106 that communicate with a plurality of listeners 110, 116 and 122 over the Internet 108 according to the HTTP protocol. In response to requests from the browsers, the listeners cause a web application server 180 to invoke software modules, referred to herein as cartridges. In the illustrated embodiment, web application server 180 has initiated the execution of three cartridges 130, 134 and 138.
The web application server 180 is composed of numerous components, including transport adapters 112, 118 and 124, dispatchers 114, 120 and 126, an authentication server 152, a virtual path manager 150, a resource manager 154, a configuration provider 156 and a plurality of cartridge execution engines 128, 132 and 136. A typical operation within system 100 generally includes the following stages:
A browser transmits a request over the Internet 108.
A listener receives the request and passes it through a transport adapter to a dispatcher.
The dispatcher communicates with the virtual path manager 150 to determine the appropriate cartridge to handle the request.
At this point the dispatcher does one of two things. If the dispatcher knows about an unused instance for that cartridge, the dispatcher sends the request to that instance. If there are no unused cartridge instances for that cartridge, the dispatcher asks the resource manager 154 to create a new cartridge instance. After the instance starts up successfully, the cartridge notifies the resource manager of its existence. The resource manager 154 then notifies the dispatcher of the new instance. The dispatcher creates a revised request based on the browser request and sends the revised request to the new instance.
The cartridge instance handles the revised request and sends a response to the dispatcher.
The dispatcher passes the response back through the listener to the client.
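The stages above can be illustrated with a minimal sketch. The class names, method names and route table below are hypothetical and are not taken from the described system; the listener and transport adapter are omitted, and instance start-up notification is collapsed into a single call to the resource manager.

```python
from dataclasses import dataclass, field


@dataclass
class CartridgeInstance:
    cartridge: str
    busy: bool = False

    def handle(self, revised_request: dict) -> str:
        # Stand-in for real cartridge logic.
        return f"{self.cartridge} handled {revised_request['path']}"


class VirtualPathManager:
    """Maps URL path prefixes to cartridge names (the mapping is illustrative)."""

    def __init__(self, routes: dict):
        self.routes = routes

    def cartridge_for(self, path: str) -> str:
        for prefix, cartridge in self.routes.items():
            if path.startswith(prefix):
                return cartridge
        raise LookupError(f"no cartridge registered for {path}")


class ResourceManager:
    """Creates cartridge instances on request and counts how many it created."""

    def __init__(self):
        self.created = 0

    def create_instance(self, cartridge: str) -> CartridgeInstance:
        self.created += 1
        return CartridgeInstance(cartridge)


@dataclass
class Dispatcher:
    vpm: VirtualPathManager
    rm: ResourceManager
    instances: list = field(default_factory=list)

    def dispatch(self, path: str) -> str:
        cartridge = self.vpm.cartridge_for(path)
        # Prefer an unused instance that the dispatcher already knows about.
        instance = next(
            (i for i in self.instances
             if i.cartridge == cartridge and not i.busy),
            None,
        )
        if instance is None:
            # No unused instance: ask the resource manager for a new one.
            instance = self.rm.create_instance(cartridge)
            self.instances.append(instance)
        instance.busy = True
        try:
            # The revised request is built from the original browser request.
            return instance.handle({"path": path, "cartridge": cartridge})
        finally:
            instance.busy = False
```

In this sketch, two sequential requests for the same cartridge cause only one instance to be created, because the first instance is unused again by the time the second request arrives.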
Application server systems and database systems can be combined. For example, some or all of the cartridges that are operated through the browsers in an application server system may in fact be database applications that, in turn, issue queries to one or more database servers in response to messages from the browsers. Due to the complexity of such combined systems, it is exceedingly difficult to identify the cause of performance problems. For example, assume that a browser receives an extremely slow response to a message that is sent to an application server and dispatched to a cartridge, where the message causes the cartridge to issue a query to a database server, where the database server executes the query to retrieve data for the response. Under these conditions, the slow response time may be due to problems with any of the entities involved, or with communication problems between the entities.
The identification of unacceptably slow response times may be of interest to users as well as to the administrators responsible for managing the computer system. For example, the subscription agreement of a user may guarantee a particular level of performance (e.g. that 98% of all orders be processed in less than one minute). Users with such subscriptions would typically be interested to know when, and how often, the system is not meeting the specified level of performance.
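As an illustration, a check of the example service level (98% of all orders processed in less than one minute) might be sketched as follows. The function name, its parameters and the use of seconds as the unit are assumptions made for this example.

```python
def meets_service_level(durations_s, limit_s=60.0, required_fraction=0.98):
    """Return (ok, fraction) for order processing times given in seconds."""
    if not durations_s:
        raise ValueError("no measurements to evaluate")
    # Count orders that finished strictly under the limit.
    within = sum(1 for d in durations_s if d < limit_s)
    fraction = within / len(durations_s)
    return fraction >= required_fraction, fraction
```

Running such a check over successive reporting periods would show a user when, and how often, the specified level of performance was not met.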
The process of ensuring that the system is able to meet the performance requirements of users is generally referred to as capacity planning. Typically, there are six general phases in the capacity planning process:
(1) setting up the service level objectives;
(2) estimating the demand for the resources of the system;
(3) identifying resources that satisfy the estimated demand;
(4) implementing the system with the identified resources;
(5) determining whether the system actually satisfies the demand; and
(6) repeating steps (2) to (5) when the system fails to satisfy the demand.
The step of determining whether the system satisfies the demand may be accomplished, for example, by periodically analyzing statistical information relating to the system. However, the amount of processing that such analysis may require can be so enormous that, if performed at a reasonable frequency, the analysis overhead itself may result in a violation of the service level commitments made to users.
The better the tools that are made available to the capacity planner, the higher the likelihood that the implemented system will satisfy the anticipated demands. Further, when the implemented system is not satisfying the anticipated demands, better tools make it easier to determine and fix the problems that are preventing the achievement of the desired performance levels.
A number of systems have been developed for problem identification and planning. For example, systems for problem determination in performance management are described in B. Arinze, M. Igbaria, and L. F. Young: “A Knowledge Based Decision Support System for Computer Performance Management,” Decision Support Systems 8, 501-515, 1992 and Bernard Domanski: “A PROLOG-based Expert System for Tuning MVS/XA,” Proceedings of the Computer Measurement Group, 160-166, 1987. A system for process control is described in D. R. Irwin: “Monitoring the Performance of Commercial T1-rate Transmission Service,” IBM Journal of Research and Development, 805-814, 1991. A system for planning cooking recipes is described in Janet Kolodner: “Case-Based Reasoning,” Morgan Kaufmann Publishers, Inc., 1993. Systems for problem identification of electrical circuits and analysis of financial statements are described in Robert Milne: “Using AI in the Testing of Printed Circuit Boards,” National Aerospace and Electronics Conference, Dayton Ohio, May 1980, and Donald W. Kosy and Ben P. Wise: “Self-Explanatory Financial Planning Models,” Proceedings of the National Conference on Artificial Intelligence, 176-181, 1984. However, none of these systems address the domain of systems management, nor do they consider problem discovery and capacity planning.
An attempt to apply multidimensional database technology to systems management, which focuses on performance management for data from a single source, is described in Robert F. Berry and Joseph L. Hellerstein: “A Flexible and Scalable Approach to Navigating Measurement Data in Performance Management Applications,” Proceedings of the Second IEEE International Conference on Systems Management, June, 1996. Another attempt to use multidimensional navigation for sales/subscription handling is described in Business Objects: A. M. Burgeat and F. Prabel, “Data Warehousing: Delivering Decision Support to the Many,” Business Objects Corporation, 1996.
Based on the foregoing, it is clearly desirable to provide techniques that allow problems within complex computer systems to be isolated and that assist in planning such systems to comply with user requirements, maximize system capacity, and avoid bottlenecks.
Techniques are provided for facilitating database application management and, more specifically, for facilitating problem discovery and planning where the data being collected is from a plurality of data sources with a wide variety in type of instrumentation and a variety of data access components. In certain embodiments, the data access components include ROLAP/MOLAP clients. Use of the techniques described herein may reduce maintenance costs for database application systems, either in a conventional deployment scenario, or in a data center hosting scenario.
According to one aspect of the invention, techniques are provided for monitoring performance of an application server system, where the techniques involve collecting performance data from components of the application server system and storing the performance data as multidimensional data organized according to a plurality of dimensions within one or more database systems. One dimension of the plurality of dimensions is a hierarchical time dimension. Another dimension of the plurality of dimensions is a component dimension. An interface is presented for accessing and navigating through the multidimensional data within at least one of the one or more database systems. Drill down operations may be performed into one or more hierarchical dimensions of the plurality of dimensions in response to user input received through the interface.
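The organization of performance data along a hierarchical time dimension and a component dimension, together with a drill down operation, can be sketched in a few lines. The sketch below is illustrative only: it keeps samples in memory rather than in a database system, and the year/month/day/hour hierarchy, the function names and the sample layout (timestamp, component, response time in milliseconds) are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime

# Levels of the hierarchical time dimension, coarsest first (assumed hierarchy).
TIME_LEVELS = ("year", "month", "day", "hour")


def time_key(ts: datetime, level: str) -> tuple:
    """Truncate a timestamp to the given level of the time hierarchy."""
    parts = (ts.year, ts.month, ts.day, ts.hour)
    return parts[: TIME_LEVELS.index(level) + 1]


def roll_up(samples, level):
    """Average response time per (time bucket, component) at one time level."""
    totals = defaultdict(lambda: [0.0, 0])
    for ts, component, response_ms in samples:
        cell = totals[(time_key(ts, level), component)]
        cell[0] += response_ms
        cell[1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


def drill_down(samples, level, bucket):
    """Re-aggregate only the samples inside `bucket`, one time level finer."""
    idx = TIME_LEVELS.index(level)
    if idx + 1 >= len(TIME_LEVELS):
        raise ValueError("already at the finest time level")
    inside = [s for s in samples if time_key(s[0], level) == bucket]
    return roll_up(inside, TIME_LEVELS[idx + 1])
```

For example, a user who sees a high monthly average for one component could call `drill_down(samples, "month", (2024, 1))` to view the same data per day, and repeat the operation at the day level to reach individual hours.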