In recent years, an increasing number of e-commerce providers and business enterprises have come to rely on middleware and application server technology as the lifeblood of their business. For example, application servers form a proven foundation for supporting e-commerce applications, providing the presentation, business and information-access logic, security and management services, and underlying infrastructure needed for highly scalable and mission-critical software applications. These servers manage all of the underlying complexities of a company's applications, allowing the organization to focus instead on delivering new and innovative products and services.
With the rising use and pervasiveness of such middleware systems, it has become important for business enterprises to diagnose and resolve the errors, misbehaviors and other problems that may occur in them. For example, a middleware system, such as an application server, typically uses multiple components and resources working together to service an incoming request. While serving a request, these systems may encounter performance problems in one or more components or services. For example, a request can be serviced by Servlets, Enterprise Java Beans (EJBs) and data sources working together. A performance problem with such a request can be due to the non-availability of an EJB instance, the non-availability of a JDBC connection, and the like.
The performance of such middleware systems can be assessed by evaluating performance metrics or indicators, which are usually defined in terms of response times, throughput, or load on hardware such as the central processing unit (CPU), memory, disk I/O, etc. These metrics not only indicate the current state of the performance of the middleware system, but also depend on the number of users, the size of the requests and the amount of data processed, and are limited by hardware such as CPU type, disk size, disk speed, and memory. Similarly, the containers within an application server expose certain performance metrics out of the box, which indicate the current state of the underlying system. Such runtime data from the containers may include metrics such as response time, total load passing through each component, errors, etc.
An application that diagnoses performance problems in these middleware runtime environments analyzes various datasets that are exposed by these components and services. To find the component or service that is contributing to a performance problem such as slow response, such an application often needs to collect data from dozens of sources, such as MBeans, server logs, the diagnostics framework provided by the middleware system, and the like. For example, to diagnose a slow response of a request that involves accessing an EJB, the diagnostics application may need data about that particular EJB, including response time metrics, the EJB pool size from MBeans, and details about exceptions from server logs.
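As an illustration of reading runtime data from MBeans, the following minimal sketch queries the JVM's platform MBean server through the standard JMX API. It is an assumption-laden stand-in: the MBeans that expose EJB pool size or response times are specific to each application server, so the sketch reads the standard `java.lang:type=Threading` and `java.lang:type=Memory` MBeans instead. The class name `MBeanSnapshot` and the keys of the returned map are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;
import javax.management.openmbean.CompositeData;

// Hypothetical helper: takes one sample of a few attributes from standard JVM MBeans.
// A real diagnostics application would query server-specific MBeans (for example,
// EJB pool or data source MBeans), whose names depend on the middleware product.
public class MBeanSnapshot {

    public static Map<String, Long> collect() {
        MBeanServer server = ManagementFactory.getPlatformMBeanServer();
        Map<String, Long> sample = new LinkedHashMap<>();
        try {
            // Simple numeric attribute: number of live threads in this JVM.
            ObjectName threading = new ObjectName("java.lang:type=Threading");
            Number threadCount = (Number) server.getAttribute(threading, "ThreadCount");
            sample.put("threadCount", threadCount.longValue());

            // Composite attribute: heap usage is returned as CompositeData.
            ObjectName memory = new ObjectName("java.lang:type=Memory");
            CompositeData heap = (CompositeData) server.getAttribute(memory, "HeapMemoryUsage");
            sample.put("heapUsedBytes", (Long) heap.get("used"));
        } catch (JMException e) {
            // Wrap checked JMX exceptions so callers get a single failure type.
            throw new IllegalStateException("Could not read MBean attributes", e);
        }
        return sample;
    }

    public static void main(String[] args) {
        collect().forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Each additional data source (server logs, a diagnostics framework) would need its own collector of this kind, which is how the number of sources to poll grows quickly.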
Thus, a typical diagnostics application in the middleware management space polls different data sources at a preconfigured frequency to obtain the data. Polling large numbers of data sources at high frequencies can produce a large amount of data and can strain the network through the volume of data being transmitted. On the other hand, polling at lower frequencies may reduce the amount of data, but the resulting data may not be sufficient to properly identify the problem. As such, it is desirable to reduce the amount of data being collected by diagnosis tools while still maintaining enough accuracy to diagnose performance problems.
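The trade-off between polling frequency and data volume can be made concrete with a rough sketch. The class `PollingVolume`, its method names, and the figures used (50 sources, 200 bytes per sample) are hypothetical and chosen only for illustration; the fixed-rate poller uses the standard `ScheduledExecutorService` and is one possible way to poll at a preconfigured interval, not a description of any particular diagnostics product.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of fixed-frequency polling and the data volume it implies.
public class PollingVolume {

    // Estimated bytes collected per hour when every source is polled once per interval.
    public static long bytesPerHour(int sources, long intervalMillis, int bytesPerSample) {
        long samplesPerSourcePerHour = 3_600_000L / intervalMillis;
        return sources * samplesPerSourcePerHour * bytesPerSample;
    }

    // Polls one source at a fixed interval for a given duration and returns
    // how many samples were taken. The source is modeled as a Runnable.
    public static int pollFor(Runnable readSource, long intervalMillis, long durationMillis)
            throws InterruptedException {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger samples = new AtomicInteger();
        scheduler.scheduleAtFixedRate(() -> {
            readSource.run();
            samples.incrementAndGet();
        }, 0, intervalMillis, TimeUnit.MILLISECONDS);
        Thread.sleep(durationMillis);
        scheduler.shutdownNow();
        return samples.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // Assumed figures: 50 data sources, 200 bytes per sample.
        System.out.println("1 s interval:  " + bytesPerHour(50, 1_000, 200) + " bytes/hour");
        System.out.println("60 s interval: " + bytesPerHour(50, 60_000, 200) + " bytes/hour");

        // Short demonstration run of the poller with a no-op data source.
        int taken = pollFor(() -> { }, 100, 550);
        System.out.println("samples taken in demo run: " + taken);
    }
}
```

Under these assumed figures, polling every second yields 36,000,000 bytes per hour, while polling once a minute yields 600,000 bytes per hour. The sixty-fold reduction comes at the cost of only one observation per minute per source, which may miss short-lived problems.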